[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Full Specification



Date: Tue 24 May 88 13:13:21-EDT
From: Morris Katz <MKATZ at A.ISI.EDU>

    Overspecification should be avoided as it artificially
    prohibits reasonable implementations.

Date: Thu May 26 07:49:42 1988
From: Jonathan A Rees <JAR@mc.lcs.mit.edu>

    Underspecification effectively amounts to giving an axiomatic semantics
    (proof system) for the language, as opposed to a denotational one (a
    model).  The problem with underspecification is that one can have a
    program that works fine in one implementation and fails in another, so
    the only way to produce portable code is by reasoning about it, and
    testing doesn't help you a bit.  While I agree that some
    underspecifications are important, I believe you can go too far in
    underspecifying things that really don't make much difference, requiring
    reasoning in situations where testing ought to be sufficient.

    Underspecifications makes the language harder to learn and use, since an
    implementation cannot teach them very well.  One must be aware of all
    underspecifications all the time.  I think they should be minimized, not
    maximized.


This debate should be carried further.  Currently I don't see a coherent 
policy on full specification (one might say it is underspecified).

Three sources of underspecification:

   1. Failure to achieve consensus on what a full specification should be.
   2. A desire to not unduly impede implementations.
   3. Attempting to simplify the specification by ignoring accidental or
      unessential details.

The first is unfortunate, and we should do better.

The second is reasonable but not compelling.  Language designers
should don their user hats.  From the standpoint of the user implementation
considerations are significant only if they greatly affect speed, memory
usage or reliability.  Thus implementation considerations should have veto
power and perhaps be allowed to vote in case of a tie, but otherwise
stay in the background (like a good implementation).

The third is the touchy one.  It is a counter argument to Rees' claim
that underspecification makes a language easier to learn.  Leaving out
incidental details may actually make it easier to understand and use a feature.
For example, assignments are used mainly for effect---the value returned
is usually not important.  However, once specified, it will be used.
Thus users will in the end be forced to learn and remember all specified 
details (if you don't use them someone whose code you must read will).

The other side---reasons for full specification:

   1. Consistency across implementations, allowing one to test and port
      programs with reasonable assurance.
   2. Full compliance with a semantic model.
   2. Use of the specified details.

Number one is a large win.  We are still at the point (and undoubtably always
will be, at least for the life of Scheme as a living programming language)
where large programs can only [reasonably] be shown to be correct by testing.
The argument that correct programs do not rely on unspecified features is
not reasonable for large programs.  Programmers make mistakes, and will
write programs that happen to rely on unspecified features of a given
implementation.  Some of these mistakes are so easy to make one could call
them traps.  For instance, one really expects the definitions in a body to be
accomplished in order.

Number two: the history of Scheme has been intimately tied with denotational
semantics---the value of a full formal semantics for Scheme should not be
underestimated.

Number three is the least compelling.  Typically one has no use for the
unspecified aspects and can provide for them when necessary, arguably in
a clearer fashion.
                                 
Conclusion: full specification is best.  In most cases there are reasonable,
consistent specifications that are as easy to remember (and specify) as
``not specified.''  If it is an error to rely on something that will 
in fact end up being specified in a given implementation we should make it a 
signaled error.

Following are some specification proposals:

Procedures with side effects return the value of the ``source'' expression.
For instance, all of the following would return the value of e:
   (set! x e)
   (set-car! x e)
   (write e p)
There are of course other possibilities; the above has the advantage
of naturalness and symmetry across a broad class of expressions.

The expressions in a procedure call are evaluated from left to right.
This is perhaps the trickiest underspecification in Scheme.  It can be
extremely difficult to verify that order of evaluation has no effect in a large
program.  A reasonable order is left to right, since that's the way bodies
are evaluated.

Definitions are also evaluated left to right.
    (let () (define x e) (define y f))
means
   (let ((x <undefined>) (y <undefined>)) (set! x e) (set! y f))
where referencing <undefined> signals an error.

And of course there are more, but that is enough to test the water.