
Comments on the draft standard



    I recently grabbed a copy of the standard dated April 15 from
    zurich.ai.mit.edu.  Since I am new to the scheme community, it would
    be presumptuous to assume that I have understood the subtleties of all
    of the standard, but there are some areas that were unclear to me or
    seemed to permit contradictory interpretations, and I thought I would
    send out comments on these areas.

Which standard?  Do you mean the ieee standard or the informal r4rs
report?  The scheme-standard mailing list deals with the ieee
standard, while rrrs-authors deals with the informal report.  Many
issues are important to both mailing lists, so sending mail to both is
probably the right thing to do since there are some people on one but
not the other.  I think that your comments apply to r4rs, not the ieee
standard, so the message should have been sent to rrrs-authors as well.

    I commend to your consideration adding an implementation note pointing
    out the advantage of introducing a special value, #undefined, which
    is "returned" when an evaluation results in an unspecified value.

    Rationale: The presence of such a value substantially simplifies
    debugging by permitting the environment to analyze precisely the
    nature of an error (i.e. the implementation doesn't have to worry
    about debugging random pointers).

    In light of lack of experience and agreement about this construct, it
    would be inappropriate to standardize it at this time, but it would be
    appropriate to mention it in an implementation notes aside.

I mostly agree with your recommendation, but your rationale does not
quite hold.  The "problem" with unspecified values is that they can
propagate for an unbounded amount of time before anyone actually tries
to do something with them that will fail.  For example, they can be
stored in a data structure which will not be examined until much
later.  At the point at which the unspecified object causes an error,
there is often no context from the process that created the
unspecified object, so debugging is not really enhanced.
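
A minimal sketch of this propagation, using only report procedures.
The names PENDING and TOTAL are made up for illustration, and
(if #f #f) merely stands in for any expression whose value is
unspecified:

(define pending (make-vector 3 0))

; The unspecified value is stored here without complaint.
(vector-set! pending 1 (if #f #f))

; Much later, far from the code above.  In most implementations the
; stored object is not a number, so this is where the error appears.
(define (total v)
  (+ (vector-ref v 0) (vector-ref v 1) (vector-ref v 2)))

By the time (total pending) fails, nothing about the VECTOR-SET! that
stored the object remains available to inspect.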

The return values of certain expressions are left unspecified for a
variety of reasons:

- No agreement within the community on what the return value should
be.  See below the case of SET!

- Agreement that no value is especially useful and/or meaningful.  In
this case the implementation does not have to track down some other
value, and can return something more convenient.

Another possibility would be a class (type) of unspecified objects
which would capture some "essential" aspect of the situation in which
they were created/returned.  This could be made much more useful for
debugging, but would probably incur some performance problems.  I
won't suggest requiring/recommending this.

----------------------------------------------------------------------

    I think there is a conflict implied between this paragraph and
    paragraph 3 of this section.  The conflict is exemplified by:

	    (define mystring "abc")
	    (string-set! mystring 0 #\f)

    The third paragraph as written implies that "abc" can be allocated in
    read-only memory, and since string-set! does not "create" a new
    object, paragraph one implies that read-only memory can be
    side-effected.  See my proposed rewording to paragraph 3.

There is another interpretation, which is that the STRING-SET! is in
error for attempting to modify an immutable object.  The last sentence
of the third paragraph of this section in r3.95rs says precisely this.

A correct analogous program would be

(define mystring (string-copy "abc"))
(string-set! mystring 0 #\f)

An altogether different problem is that of determining whether an
object is immutable.  Maybe an IMMUTABLE? predicate should be added to
the language, so that user written procedures can detect this case and
give a meaningful error message rather than causing an error in the
middle of system code.
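
A sketch of how such a predicate might be used.  IMMUTABLE? is
hypothetical (no report defines it), so a stand-in that answers #f for
everything is supplied just to let the code load.  ERROR is also an
assumption: many implementations provide it, but it is not in the
report.  SET-FIRST-CHAR! is a made-up name.

; Stand-in only.  A real IMMUTABLE? would have to come from the
; implementation.
(define (immutable? obj) #f)

(define (set-first-char! s ch)
  (if (immutable? s)
      (error "set-first-char!: string is immutable:" s)
      (begin (string-set! s 0 ch)
             s)))

(set-first-char! (string-copy "abc") #\f)   ; returns "fbc"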

----------------------------------------------------------------------

    This paragraph could be misread to imply that objects must be
    implemented with "in-use" bits.  While this is a common implementation
    method, it should be thought of as a conceptual approach, not an
    implementation constraint.  The wording also implies that the in-use
    bits must always be kept up-to-date, which clearly is undesirable.

Would you be happy if the word "conceptually" were inserted before
"marked" at the beginning of the paragraph?

    Other:

    It might be helpful to note here somehow that there is a useful
    constraint that is not as strong as "immutable".  There is a class of
    objects named by top-level symbols that may be mutated but not resized
    or rendered out-of-use, and it might be worth adding a paragraph to
    point out that these can always be considered to be allocated.  [Point
    being to partition them from the space considered for garbage
    reclamation].

I don't think that is reasonable for the standard.  Implementations
may support objects in static areas, but I don't think that the
language itself should concern itself with such issues.

----------------------------------------------------------------------

    The result of SET! being unspecified is sufficiently unusual that I
    would be curious as to the rationale.  To my knowledge, no semantic
    difficulties are introduced by having SET! return the value, and there
    is enough code out there that makes this assumption that in the
    absence of a compelling reason to remove it SET! should return the
    value.

There is disagreement in the community about what the return value
should be.  MIT Scheme follows the convention that assignment (i.e. SET!)
and slot assignment procedures (i.e. SET-CAR!, STRING-SET!, etc.)
return the OLD contents, not the new value.  Other implementations
follow the more usual convention of returning the new value.
Code making the assumption that any particular convention holds is
not portable.
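
Code that needs the old contents can obtain them portably by reading
before writing, and ignoring whatever the mutator returns.  A sketch;
EXCHANGE-CAR! is a made-up name:

(define (exchange-car! pair new)
  (let ((old (car pair)))
    (set-car! pair new)   ; return value deliberately ignored
    old))

(define p (list 1 2))
(exchange-car! p 10)      ; returns 1, and p is now (10 2)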

----------------------------------------------------------------------

    The standard does not specify the meaning of

	    (define <variable>)

    which is in sufficiently common usage that it should be included.
    <variable> should be bound to an unspecified value.

I agree, but I think there was some discussion about this, and it was
dropped, but I don't remember the reason.

    Also, the semantics of DEFINE is historically problematic.  The
    change in R3RS that eliminated the implicit LETREC in defines forces
    me to generate code that checks to see if the procedure's variable has
    been side effected by someone I call before I do the tail-recursive
    call.

That's not quite true.  This behavior can be achieved by the following
implementations:

- Fetch the operator of the call by indirecting through a cell.
Assignments modify this cell.
- Assignment becomes "smart" about modifying the operators of certain
calls, and "does the right thing".

Furthermore, a programmer who wants to preclude the possibility of
assignment to the variable can do so by using

(define foo
  (letrec ((foo (lambda ...)))
    foo))

or

(define (foo .args.)
  (letrec ((foo (lambda ...)))
    (foo .args.)))
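
The difference between the two styles can be seen in a small sketch
under the r3rs semantics.  All names here are made up for
illustration:

; Recursive call goes through the top-level variable COUNT.
(define (count n)
  (if (= n 0) 'done (count (- n 1))))

; Recursive call goes through a local LETREC binding.
(define (safe-count n)
  (letrec ((loop (lambda (n)
                   (if (= n 0) 'done (loop (- n 1))))))
    (loop n)))

(define old-count count)
(define old-safe-count safe-count)
(set! count (lambda (n) 'replaced))
(set! safe-count (lambda (n) 'replaced))

(old-count 3)        ; returns replaced: the inner call sees the SET!
(old-safe-count 3)   ; returns done: LOOP cannot be assigned from outside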

    If this change is made (I am sending this proposal to the scheme group
    too), then the common case rewrite rule is:

	    (define name          =>  (begin (define name)
	      (lambda ...))                  (set! name
	                                       (letrec ((name (lambda ...)))
	                                         name)))

Note that here you are suggesting something even more drastic than
what r2rs had.  In r2rs

(define foo (lambda (x) ...))

and 

(define (foo x) ....)

were NOT equivalent, since the second form had an implicit LETREC in it.

I'm also not sure that I agree with you that the expansion with LETREC
is desirable as a default.  The differences between the two cases
above in r2rs and this question about desirability are the reasons
that the semantics was changed for r3rs.  Although your proposal would
make the meaning of both forms be the same, I suspect you will find a
fair amount of opposition (including myself) to having an implicit
LETREC in

(define foo (lambda (x) ....))

----------------------------------------------------------------------

    It should be specified whether the standard-procedures are considered
    immutable or not.  I could make a good compiler argument for them
    being immutable and a good user argument for them being mutable (can
    add polymorphism through appropriate SET! and LET hacks).  It is my
    opinion that this decision should not be left to the implementation.

I don't know what you mean about procedures being immutable.  Do you
mean mutating the procedure object or changing the value of the
variable which initially holds a standard procedure?  The language as
it currently appears provides no means for destructuring and mutating
procedure objects.  As far as assigning those variables, I think it is
allowed by the language.  My understanding of the paragraph on
immutable data is that only literal constants can be made immutable,
but there are no literal bindings.
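
The kind of SET! and LET hack mentioned in the comment might look like
the sketch below.  It assumes the implementation permits assignment to
the variable + and honors it at call sites (a compiler that integrates
standard procedures may not).  NUMBER+ is a made-up name.

(set! +
  (let ((number+ +))          ; capture the original procedure
    (lambda args
      (if (and (pair? args) (string? (car args)))
          (apply string-append args)
          (apply number+ args)))))

(+ 1 2)          ; 3, as before
(+ "ab" "cd")    ; "abcd", if the assignment is honored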

----------------------------------------------------------------------

    Write should return #f on failure and #t on success, or it should be
    an error to write a value to an external port that lacks an external
    representation.

There are two different issues here:

- One is whether WRITE should do something useful with all objects.

- The other is WRITE/READ invertibility, that is, whether the output of
WRITE should always be valid input to READ and should cause READ to
return/create an object similar to (the same as?) that given to WRITE
originally.

Although the report does not explicitly state it (maybe it should),
I think that the authors agree that WRITE should accept all objects as
arguments and "print" something descriptive.

I don't think there is agreement that WRITE should be able to "print"
a "complete" representation of a procedure or a continuation, or what
this would mean.  At the core of the matter is the fact that although
the notion of procedural equivalence is mathematically well defined,
it is not decidable, so there would be no way to effectively test a
WRITE/READ pair and ultimately to "know" what to print.  Any other
definition of similarity between procedures would be artificial and
likely to cause confusion and problems.

An altogether different issue is whether we should adopt the C
convention of returning an arbitrary value when an unhandled situation
occurs or whether we should signal an error.  In general I don't
believe in returning arbitrary values as a substitute for errors,
although I think there are some cases (like STRING->NUMBER) where we
can make an exception because of common usage.  I don't think that
WRITE falls in this category.
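
The STRING->NUMBER exception can be sketched as follows, assuming it
returns #f for a string that does not denote a number, as r4rs
describes.  PARSE-COUNT is a made-up name:

(define (parse-count text default)
  (let ((n (string->number text)))
    (if n n default)))     ; #f here means "not a number", not an error

(parse-count "42" 0)       ; returns 42
(parse-count "abc" 0)      ; returns 0

The caller expects malformed input and tests for it directly, which is
the common usage that justifies the exception.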

    The standard shouldn't include DUMP-HEAP, but it shouldn't constrain
    WRITE in a way that prohibits it either.

I think this is already true, although I doubt that DUMP-HEAP would
be written in terms of WRITE.

----------------------------------------------------------------------

    1) Implementations should be encouraged somewhere to implement (WRITE
    <continuation>).  A representation should be written that when read
    with READ results in a thunk that is an equivalent function to the
    written continuation.  The representation of continuations should be
    implementation-defined.

I don't know what that means.  What do you mean by an equivalent
function?  Should all environment information be "re-interned" on the
way in so that side effects by the original continuation are visible
by the "new" continuation and vice versa?

    Rationale:  We all acknowledge that this is pragmatically something
    that we want, and there should be a standard interface.  Having write
    return a value lets me assume the capability without causing problems
    for implementations that don't support it.

I'm not sure I want something that I don't understand.  It would have
to be very well and clearly defined for me to accept it.

----------------------------------------------------------------------

    2) I believe that the user-interface section should specify a
    procedure (DUMP-WORLD "filename" <continuation>) that writes a
    representation that can be executed in an implementation-dependent way
    to arrive at the specified continuation.  The <continuation> argument
    should be optional, and if left unspecified should default to the
    continuation that DUMP-WORLD will return to.  DUMP-WORLD should return
    #t if it succeeds or #f if it fails or the implementation doesn't
    supply the functionality.

As I said above, I don't believe in procedures returning arbitrary
values rather than signalling errors.

    Rationale:  The whole community acknowledges that we want this, but
    that it is implementation-dependent.  This proposal permits us to
    specify the interface to it and construct our programs in a way that
    deals gracefully with failure and doesn't impose it on
    implementations that don't or can't do it.

    Why DUMP-WORLD and not DUMP-HEAP?  Unlike DUMP-HEAP, which is covered
    by WRITE, DUMP-WORLD does something fundamentally
    implementation-specific.

In general, we decided to drop most user interface and development
environment procedures because we could easily envision environments
where they did not make sense.  The current consensus is that most of
those procedures should not be in the language.  Programs that depend
on this kind of functionality can be structured so that the
dependencies are restricted to a few modules which can be easily
rewritten for similar environments.

I think that LOAD and the transcript procedures have been dropped from
the draft ieee document for this reason.