
Re: Why would anyone want opacity?



Let me see if I can explain what Will is worried about. I suspect that many
readers of this list will find all this old hat, but I fear that some may not.
Or it may be that I have not understood the subtleties of the discussion.  But
here goes:

Let's imagine that I, as an instructor, have built a "2-element table" "data
type" by writing

(define (make-table x y)
  (list x y))

(define (get-x-value tab)
  (car tab))

(define (get-y-value tab)
  (cadr tab))

(define (set-x-value tab val)
  (set-car! tab val))

(define (set-y-value tab val)
  (set-car! (cdr tab) val))

My students have been given an assignment to do some computational task using
2-element tables.  (For realism, you may substitute "symbol table" for
"2-element table", "write a compiler" for "do some computational task", and
substitute a suitably larger piece of code in the space above.)

My students try to debug their code.  They observe that the tables are
implemented as 2-element lists.  Pretty soon they stop writing "get-x-value",
and start writing "car" instead.  Their code starts to work.  They are happy.
They write much, much code.

Then I decide to change the implementation of 2-element tables to the
following:

(define (make-table x y)
  (cons y x))

(define (get-x-value tab)
  (cdr tab))

(define (get-y-value tab)
  (car tab))

(define (set-x-value tab val)
  (set-cdr! tab val))

(define (set-y-value tab val)
  (set-car! tab val))

Their code stops working.  Not only does their code stop working, it stops
working in strange and unpredictable ways.  They are unhappy.
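
To make the failure concrete, here is a minimal sketch of the kind of client
code involved.  The names x-of-bad, x-minus-y-bad and x-minus-y-good are
hypothetical, and the sketch assumes the tables hold numbers.  It is written
against the second (pair-based) representation:

```scheme
;; Second representation from above: a single pair (y . x).
(define (make-table x y) (cons y x))
(define (get-x-value tab) (cdr tab))
(define (get-y-value tab) (car tab))

;; Hypothetical client code written while tables were 2-element lists.
(define (x-of-bad tab)
  (car tab))                        ; was x, is now silently y

(define (x-minus-y-bad tab)
  (- (car tab) (cadr tab)))         ; cadr now takes the car of a number

;; Hypothetical client code that stays behind the accessors.
(define (x-minus-y-good tab)
  (- (get-x-value tab) (get-y-value tab)))

(define tab (make-table 10 3))
(x-minus-y-good tab)   ; 7 under either representation
(x-of-bad tab)         ; 3, not 10: a wrong answer with no error
;; (x-minus-y-bad tab) ; signals an error: (cadr tab) is (car 10)
```

The silent wrong answer is the worse of the two failures, since nothing
points back at the line that broke the abstraction.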

This is, as everyone knows, not an unrealistic example.  Large software
systems in the 60's and 70's were essentially unmodifiable because much of the
code depended on details of the implementations.  Even now, we are struggling
with the Millennium problem:  in a random Cobol program, how do you find all
the places that depend on dates being exactly 2 digits, numerically ordered?

The solution that was arrived at in the 70's was the concept of information
hiding and abstract data types.  The point of this was that it is insufficient
to merely decree, as Jeff suggests, that

> students and other inexperienced programmers can stay out of trouble
> just by not using record-{length,ref,set!}.

Among other things, the student programmers (the clients of the ADT) probably
need these operations for their own uses, so they can't simply be made
illegal; in our case, we probably don't want to make CAR illegal in student
code. 

What we would like is for the language itself to give some support for
creating and respecting abstraction boundaries:

1.  It would be desirable for the language to detect violations of an
abstraction boundary.  This may happen prior to execution, as in CLU, the ML
module system, etc., or at run-time in a dynamically typed language.

Note that I am _not_ presupposing what should happen when such a violation is
detected.  The system might refuse to continue, issue a warning, raise an
exception, check some kind of privileges or passkeys before proceeding, etc.
But a manager (instructor, ADT implementor, etc.) should be given some tools to
protect his or her abstractions.

In current Scheme, about the only way to create an even somewhat secure
abstraction boundary is to hide the details in a closure.  This solution may
be too expensive or undesirable for other reasons.
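
A minimal sketch of the closure approach, for illustration only.  The message
symbols (get-x, set-x!, and so on) are hypothetical, and the sketch assumes an
"error" procedure is available, as it is in most implementations:

```scheme
;; The two slots live in variables captured by the closure,
;; so clients never get hold of a pair.
(define (make-table x y)
  (lambda (msg . args)
    (cond ((eq? msg 'get-x) x)
          ((eq? msg 'get-y) y)
          ((eq? msg 'set-x!) (set! x (car args)))
          ((eq? msg 'set-y!) (set! y (car args)))
          (else (error "unknown table operation" msg)))))

;; The public interface keeps the names used above.
(define (get-x-value tab) (tab 'get-x))
(define (get-y-value tab) (tab 'get-y))
(define (set-x-value tab val) (tab 'set-x! val))
(define (set-y-value tab val) (tab 'set-y! val))

(define tab (make-table 1 2))
(set-x-value tab 10)
(get-x-value tab)   ; 10
(pair? tab)         ; #f, so car on a table is an error, not a wrong answer
```

Note the limits of this sketch: every access costs a procedure call and a
dispatch, the table now answers #t to procedure?, and a client who learns the
message symbols can still bypass the accessors.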

2.  It would be desirable if the language did _not_ sucker the unsuspecting
client into trying to violate the abstraction boundary.  In our example,
Scheme seduced my students into writing "car" because their table objects
printed out like lists, answered #t to the pair? predicate, etc.  
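
As an illustration of what language support for this second point can look
like, here is a sketch that assumes a Scheme providing define-record-type
(SRFI 9, also part of R7RS).  That form is an assumption of the sketch, not
something proposed in this thread; depending on the implementation you may
first need to import (scheme base) or load SRFI 9.

```scheme
;; Each define-record-type creates a new type, distinct from pairs
;; and from every other built-in type.
(define-record-type table
  (make-table x y)
  table?
  (x get-x-value set-x-value)
  (y get-y-value set-y-value))

(define tab (make-table 1 2))
(pair? tab)          ; #f: nothing invites the client to try car
(table? tab)         ; #t
(set-y-value tab 5)
(get-y-value tab)    ; 5
(get-x-value tab)    ; 1
```

Because a table is no longer a pair, the temptation described above goes
away, and a stray car on a table is an error instead of a quiet dependency on
the representation.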

The reason why people want opacity is to give the language the ability to
do these things.  Just how that might be accomplished (in a "schemely" way) is
of course still up for discussion.

I have been bitten by this myself.  I try to write nice modular code, but I
often find that my "client" code is representation-dependent in quite subtle
ways. 

The arguments in favor of strong abstraction resemble in many ways the
arguments in favor of strong static typing, and the objections to strong
abstraction resemble those against static typing.  This is not a coincidence:
static typing is one way of building certain kinds of abstraction boundary.

For a good short discussion of information hiding, see 

Steve McConnell, "Missing in Action:  Information Hiding", IEEE Software,
March 1996, p 128.

--Mitch