Why would anyone want opacity?



Alan Bawden asked some good questions.

> Exactly what is it that you are trying to defend against?  What is
> programmer A trying to hide from programmer B?  Is this intended to be a
> method that can be used to protect trade secrets embodied in Scheme
> programs that you ship to customers?

This is one possible application.  Consider a commercial program
that, in an attempt to discourage software piracy, displays the
name of its licensed user upon startup.  This name is probably
encrypted somewhere within the application.  Abstraction barriers
make it harder to find the encrypted name and to change it.  They
probably don't make it impossible, because most of the widely used
operating systems allow the bits of an application to be browsed.

> ...Or to protect programmers working on
> the same project from each other somehow?

I think this is the most important thing.  Imagine a two-person
software project in which one person is not terribly experienced.
For concreteness, imagine a term project in a compiler construction
course in which part of the compiler's code is supplied by the
instructor (me).  The data types that I supply (such as abstract
syntax trees) are likely to change during the term as I extend the
source language and add new attributes for type checking, optimization,
and code generation.

I can and do tell students to respect the abstractness of an abstract
data type.  Sometimes a student will scoff at this, thinking that
abstraction barriers are for wimps, but most students actually do
try to write representation-independent code.  Bad habits are hard
to break, however, so several students are likely to find that their
compilers break when I change the representations.  It's an educational
experience for them.

It also provides an opportunity for them to see what is wrong with
their implementation language (usually C, C++, or Scheme).  These
languages just don't provide enough of an abstraction barrier to
prevent representation-dependent code from being written by accident.
That's the fault of the language, not the fault of the programmer.
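To make the point concrete, here is a minimal sketch in Scheme.  The
procedure names and the list representation are invented for
illustration; they are not the actual course code.

```scheme
;; Hypothetical instructor-supplied interface for one kind of AST node.
;; Assumed representation: a call node is the list (call operator operand ...).
(define (make-call operator operands)
  (cons 'call (cons operator operands)))
(define (call-operator node) (cadr node))
(define (call-operands node) (cddr node))

;; Representation-independent client code: it uses only the accessors.
(define (count-operands node)
  (length (call-operands node)))

;; Representation-dependent client code: it reaches into the list directly.
;; Nothing in the language rejects this or even warns about it.
(define (count-operands/leaky node)
  (length (cddr node)))
```

Both procedures give the same answer today.  If MAKE-CALL were later
changed to build a vector or a record, COUNT-OPERANDS would keep working
once the accessors were updated, while COUNT-OPERANDS/LEAKY would fail,
because CDDR would no longer apply to a node.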

This is not just an academic concern.  In the real world, most
programmers are less than outstanding, and many are a great deal
less than outstanding.  In the real world, these imperfect
programmers work on much larger systems that evolve over much
longer periods of time than the systems we construct in academia.

We can rationalize Scheme's shortcomings by telling ourselves that
Scheme is a language for winners, not losers, and take pride in the
fact that so few programmers use Scheme, but I think that's a cop-out.

> ...(In that case, does the security
> of the rest of the programming environment they share matter?  Does it
> matter if they have access to each other's source code?)

Yes.  I have found that my students tend to do better if I give
them compiled code without the source.  This also helps to identify
the occasional student who complains that he can't be expected to
use my data types without knowing how they are represented, so that
I can offer remedial tutoring.

I also find that some of my students who use Scheme will use the
read/eval/print loop or an inspector to learn the representations
even if I provide only compiled code.  Sometimes I have to laugh
at the effort some students will expend toward writing poor code.

I'm sure this rarely happens at MIT, but not everyone gets their
degree from MIT.

> ...Or is this just
> to protect a single programmer from himself somehow?

This is less important, but also welcome, particularly when I change
a large Scheme program that I've been working on for years.

One motivation that Alan didn't mention at all is the desire to
generate more efficient code.  If there is any possibility that
a value will escape from some compilation unit and be examined
by some abstraction-breaking facility, then the compiler must
represent the value in some standard format that is understood
by the abstraction-breaking facility.  These standard formats may
be less efficient than the formats that the compiler could use for
values that will never be exposed to the abstraction-breaking
facilities, because they must encode the information that is made
available by those facilities.

For example, Gerry Sussman wrote:

>   In my favorite programming language/environment I also want a means
>   of editing and replacing the executable part of a procedure on the
>   fly, and a means of editing the environment structure it depends
>   upon.  Note that this does not imply that the actual executable
>   representation of the information is constrained by the
>   representation that these editor mechanisms act on -- just that the
>   effect of performing the edits would change the behavior as
>   expected....

Note that this does indeed constrain the executable representation!
If Gerry is to be able to edit the procedure C defined by

    (define c
      (let ((n (read)))
        (lambda () 0)))

so that its code becomes (LAMBDA () (SET! N (+ N 1)) N), then the
compiler won't be able to transform the above definition into the
otherwise equivalent and probably more efficient definition

    (define c
      (begin (read)
             (lambda () 0)))

If this is a local definition, then (under certain circumstances)
the compiler ought to be able to transform a call to C into the
constant 0, which may enable some further worthwhile optimizations.
A compiler that performs these optimizations won't satisfy Gerry.
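As a rough sketch of the local case, consider the following.  The
names F and F-OPTIMIZED and the PORT parameter are assumptions added
here so that the example can be run without interactive input; the
second definition shows one thing a compiler might plausibly produce,
not the output of any particular compiler.

```scheme
;; Hypothetical procedure containing a local version of C.
(define (f port)
  (define c
    (let ((n (read port)))      ; N is bound but never referenced
      (lambda () 0)))
  (+ (c) 1))

;; A possible result of optimization: the READ is kept for its side
;; effect, N and C are gone, and (+ (c) 1) has been folded to 1.
(define (f-optimized port)
  (read port)
  1)
```

Given the same input, for example (open-input-string "42"), both
procedures consume one datum and return 1.  After the transformation
there is no N left to increment and no closure C left to edit, so the
edit described above has nothing to act on.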

It appears to me that, in order to satisfy Gerry, a compiler must
treat all variables as if they were (pardon the expression) VOLATILE.
This is a considerable price to pay for an abstraction-breaking
facility that not even Gerry is likely to use very often.

Will