[willc: preliminary report of workshop]
- To: willc%indiana.csnet @ CSNET-RELAY
- Subject: [willc: preliminary report of workshop]
- From: Kent M Pitman <KMP @ MIT-MC>
- Date: 6 December 1984 22:02-EST
- cc: SCHEME @ MIT-MC
- In-Reply-To: <8411120256.AA05811@iuvax.UUCP>
Ok, Will, I'm assuming things decided at the meeting are not cast in
concrete. I certainly take issue with a number of the decisions.
In a few places, I also pick on your notation a bit when I think
it's unclear. Thanks for taking the time to write up everything.
Here come the comments...
* I dislike the description of "optional features". In particular,
you say:
    Optional features may not be supported by every implementation,
    but those that do support a feature will use the same syntax and
    semantics for the feature. Hence code that makes use of optional
    features will run on any implementation of Scheme that supplies
    the optional features.
    ....
    An implementation may extend the language in any way whatsoever,
    but code that makes use of extended features is not portable.
These two paragraphs are in conflict and allow for lots of
inconsistency which I will identify at appropriate points later.
The principal troubles, though, are:
* Since a language can be extended in "any way whatsoever",
can such extensions be syntactically in conflict with optional
language features? eg, if #\ is optional and my dialect doesn't
use it, can I then define my own #\ as an extended part of the
required subset?
* In some cases, "required" language features can be redefined
incompatibly by "optional" features. eg,
(COND (FOO => X))
has a well-defined semantics regardless of whether the optional
"=>" feature of COND is present. However, that semantics is not
the same in the two cases.
The point I'm trying to make is that saying something is "optional" says
little if anything unless you at least define that if anyone uses the
syntax corresponding to the optionality in a dialect which doesn't support
the feature, that he is in error. Put another way, extensions may not
invalidate optional features. On the other hand, if this were stipulated
the list of "optional" features to which I objected would be quite lengthy.
* I disagree with calling the string quote character "double quote".
I prefer "doublequote". Since there is a character named "quote", the
phrase "double quote" might designate '' instead of ".
* In discussing what terminates tokens, you should say which of these
characters (presumably all) are also single-character tokens. In
particular, that "((A B)C)" is tokenized
{"(" "(A" " " "B" ")C" ")"}
or {"(" "(" "A" " " "B" ")" "C" ")"}
hangs in the balance. I'm sure no one disagrees, but if you're not going
to be complete about these things, you are sort of wasting your time
just trying to look formal about things.
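To make the two candidate tokenizations concrete, here is a minimal Python sketch. It is not taken from the report or from any Scheme reader; the name `tokenize` and the delimiter set are assumptions made for illustration. It treats parentheses as characters that both terminate a token and stand as single-character tokens themselves, and it emits the space as a token only to match the notation used above.

```python
# Hypothetical tokenizer (illustration only, not any real Scheme reader).
# Assumption: "(" and ")" terminate the token in progress AND are
# single-character tokens; a space terminates a token and is emitted too.
DELIMITERS = {"(", ")", " "}

def tokenize(text):
    tokens = []
    current = ""
    for ch in text:
        if ch in DELIMITERS:
            if current:            # close the token being accumulated
                tokens.append(current)
                current = ""
            tokens.append(ch)      # the delimiter is a token in its own right
        else:
            current += ch
    if current:
        tokens.append(current)
    return tokens

print(tokenize("((A B)C)"))
# ['(', '(', 'A', ' ', 'B', ')', 'C', ')']
```

Under these assumptions the result is the second tokenization. The first one arises if the delimiters only terminate tokens without being tokens themselves, which is why the specification needs to say which is meant.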
* I agree with reserving {, [, ], and }, but I would specify that they
may be used as alphabetic according to syntactic escape conventions.
    Optionally, the following characters may be delimiters that
    terminate symbols:
* What does it mean to say single quote, backquote and sharpsign
may "optionally" terminate tokens? It means expressions
like (JOHN'S COAT) read differently in the different dialects. How
is this distinct from saying "Optionally, the following characters
may be delimiters that do not terminate symbols:"? It only makes
sense to say something is optional if it's going to mean that when
it's present. I bet some dialects treat (JOHN'S COAT) as a two-list
and others treat it as a three-list. Hence, any description of this
relation between dialects can at best say: "The specification takes
no stand on the issue of whether the following delimiters terminate
symbols. Any use of expressions like (JOHN'S COAT) is to be considered
non-portable."
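The two-list versus three-list difference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `read_flat_list` and its `quote_terminates` flag are hypothetical names, the reader handles only a flat list of symbols, and a real reader would turn 'S into (QUOTE S) rather than keep it as a single token.

```python
def read_flat_list(text, quote_terminates):
    """Split the inside of a flat list into tokens (hypothetical reader).

    Spaces always terminate a symbol. When quote_terminates is true, a
    single quote also terminates the symbol in progress and begins a new
    token that stays attached to whatever follows the quote.
    """
    body = text.strip()[1:-1]          # drop the outer parentheses
    items, current = [], ""
    for ch in body:
        if ch == " ":
            if current:
                items.append(current)
            current = ""
        elif ch == "'" and quote_terminates:
            if current:
                items.append(current)
            current = "'"
        else:
            current += ch
    if current:
        items.append(current)
    return items

print(read_flat_list("(JOHN'S COAT)", quote_terminates=False))
# ["JOHN'S", 'COAT']
print(read_flat_list("(JOHN'S COAT)", quote_terminates=True))
# ['JOHN', "'S", 'COAT']
```

The same text yields lists of different lengths depending on a setting the report calls "optional", so code containing such expressions cannot be portable between dialects that choose differently.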
* I would personally prefer if vertical bar had been defined to be
alphabetic, but I am at least happy that it is "not specified" what
its meaning is rather than that it is "optional".
* I strongly oppose the idea of not specifying an escape char for
symbols. You say there is "widespread agreement that ``slashification''
of characters within symbols is a relic that ought to be abandoned."
I am not party to such agreement.
I strongly oppose the idea of eliminating slashification. The Maclisp
conventions of vertical bars made up for the absence of strings. With
their passing, syntactic quoting of lots of chars is very rare, and
in those few cases, I think slashing works fine. It also is a low-cost
mechanism for the printer, since no lookahead is required. Also, it is
uniform with respect to strings, which already use slash for special
chars anyway. Finally, without slash, there would be no way to get
vbars into symbol names, since vbars cannot adequately quote themselves.
On the other hand, slash can work fine in the absence of vbar. So if
one thing is to go as a readsyntax quoter, it should be vbar.
* Will-- Your meta-syntactic use of "..." in a description of what
the "." character does is very confusing.
* Does anyone mind if (. A) is the same as A? It has a certain elegance
to it if you think about it.
(CONS 3 (CONS 2 (CONS 1 0))) => (3 . (2 . (1 . 0))) => (3 2 1 . 0)
(CONS 2 (CONS 1 0))          => (2 . (1 . 0))       => (  2 1 . 0)
(CONS 1 0)                   => (1 . 0)             => (    1 . 0)
0                            => 0                   => (      . 0)
Just a thought.
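The regularity can be seen in a small printer sketch, written in Python for illustration only. Everything here is an assumption: pairs are modelled as 2-tuples, `cons`, `chain`, and `show_uniform` are hypothetical names, and only flat chains of atoms are handled. The printer always writes "elements, dot, tail", so an object with no leading elements comes out as (. tail).

```python
def cons(car, cdr):
    # Model a pair as a Python 2-tuple (an assumption of this sketch).
    return (car, cdr)

def chain(obj):
    """Return (elements, tail), where tail is the first non-pair cdr."""
    elements = []
    while isinstance(obj, tuple):
        elements.append(obj[0])
        obj = obj[1]
    return elements, obj

def show_uniform(obj):
    """Write obj as (e1 e2 ... . tail), even when there are no elements."""
    elements, tail = chain(obj)
    parts = [str(e) for e in elements] + [".", str(tail)]
    return "(" + " ".join(parts) + ")"

print(show_uniform(cons(3, cons(2, cons(1, 0)))))  # (3 2 1 . 0)
print(show_uniform(cons(1, 0)))                    # (1 . 0)
print(show_uniform(0))                             # (. 0)
```

Read in the other direction, the last line is the suggestion being made: a reader that accepts (. A) and returns A is simply the zero-element case of the same rule.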
* I find #!true and #!false to be ugly and visually confusing
with the popular convention of "!" designating something
destructive. It would make more sense for #! to be saved for
something like #. in Common Lisp. I agree #<something>TRUE
is reasonable. I'd have preferred #: for this.
* What does it mean to say:
"Optionally, binary numbers may be written using the #b notation.
Optionally, octal numbers may be written using the #o notation.
Optionally, decimal numbers may be written using the #d notation.
Optionally, hexadecimal numbers may be written using the #x notation."
Presumably this means that dialects not wanting #B, #O, etc. can
use these to mean other things.
* What does it mean to say:
"Optionally, special characters may be written using the #\
notation. If this feature is supported, then the Common Lisp
names for special characters must be supported."
Presumably this means that if I don't want to be able to write special
characters, I can make #\ do something else. In fact, if I want to
use other names than those used by Common Lisp, I can just "not support"
this feature and then "make any extension whatsoever" to my dialect
such that #\ does something completely different, like understand
a different set of character names.
* Was anything decided about whether #!TRUE and #!FALSE would self-evaluate
or whether they required quoting?
* Was anything decided about whether numbers must self-evaluate or
whether a dialect may require quoting?
* Since "optionally, numbers may be written using decimal points and/or
exponents", does this mean that numbers with decimal points are integers
or floating? Does it mean that if I don't support the feature that
I can take the alternate position?
* I notice that the space of symbol names is highly constrained for
the "required subset". A property, however, that should be required
is that within any given dialect, every interned symbol (no matter what
characters it contains) must have a printed representation which is
read-invertible within that dialect. I suspect that all dialects
do this already anyway, but it should be a guaranteed property of the
language since programmers will tend to depend on such things and should
have a guaranteed semantics backing them up.
* What does it mean to make NIL optionally evaluate to the empty list
or optionally evaluate to false? What happens if I make it false and
then try to run my code in another dialect where it's the empty list
and where the two are not the same thing? The definition of optionality
says that code written in the optional subset will run correctly in
another dialect supporting optional features. It doesn't seem to me
like a good idea to optionally define a symbol as able to take on several
values and then be able to write meaningful code.
* It is specified that "the order of evaluation within an application
is not specified". I would prefer "combination" or "expression" to
"application" as a matter of terminology to avoid confusion with the
application that happens in the APPLY function, which doesn't involve
evaluation at all.
* I don't like the name LETREC; I preferred LABELS. Neither is very
suggestive of anything; the latter is at least a real word.
* Will-- I don't like the use of the term "mistake" throughout the report,
at least without defining it formally. In my dialect, it connotes
an unintentional error and it seems to me that if the user
intentionally did the offending thing, it would not be a mistake.
I would say "error" in its place, or define the term "mistake"
formally early on.
* I disagree with the various forms that claim it to be a "mistake"
to use certain return values, allowing some implementations to
signal an error. I don't agree that such errors can ever be
detected at the language level; I would like a formal description
of exactly when it is believed that such an error could be signalled.
The forms in question are: IF, COND, SET!, DEFINE, DEFINE!, CASE,
SET-CAR!, SET-CDR!, and VECTOR-SET!.
* The semantics of (COND (X => Y)) is messy due to optionality as
described earlier.
* Will-- I would name the ... sequences in definitions of things
like LET, COND, etc which use multiple sequences. I realize you
use them right to left, but that could be made more apparent.
Perhaps ..foo.. instead of ...
* I find the name SET! both ugly and redundant. The "!" convention
as originally created by the T people identifies a destructive
variant of an otherwise-non-side-effecting operation. So, for
example, APPEND and APPEND!, etc. Logically, there could be a
CHANGE-CAR and CHANGE-CAR!, one of which was
(LAMBDA (C V) (CONS V (CDR C)))
and the other which was
(LAMBDA (C V) (SET (CAR C) V) C)
In any case, I strongly think that the primitive for assignment
should be SET and not SET!. In fact, since no one likes assignment
anyway, I don't see any reason why anyone should object to just
leaving this undefined in the standard. It would only discourage
people from writing destructive code. But I would be very unhappy
to see T change the name of SET to SET!. Similarly, I strongly
dislike the names SET-CAR! and SET-CDR!.
* The definition of DEFINE refers to the "top-level" definition of
a variable. I don't believe it's established what "top-level" means,
so this definition is pretty muddy. Further, what is the implication
of this definition upon doing (LAMBDA (X) (DEFINE X X))?
I am very discouraged that the (DEFINE (fn . args) ...) syntax isn't
required. This means that any portable code must be ugly, meaning
no one is likely to ever write truly portable code, meaning this
standard is a farce.
* It is silly to require that there be at least one form in a (BEGIN).
It is easy for macros to come up with situations where there are
no forms to put there and as long as the macro's caller doesn't
depend on the value, it shouldn't matter. The return value of a
BEGIN with no forms should just be undefined.
* The fact that (LET* ...) cannot admit an optional name reveals an
asymmetry which I find very distasteful. I suggest that named LET
be left to implementors as an "arbitrary" extension not to be mentioned
in any common subset.
* I would prefer to have REC be called LABEL. Again, at least it's English.
* I don't see any good reason to have DO not bind RETURN. Can someone
elaborate on that?
* The description of DEFINE inside LAMBDA is inconsistent with the
earlier description of DEFINE as creating a toplevel definition.
I think this should be a non-standard extension. I see no reason to
dignify it with any "optional" status.
* The term "top-level binding" is again completely vague in DEFINE!'s
definition.
* The definition of optionality specified that if an optional feature
was present, the dialect should prefer to call it by the "optional"
name. This is somewhat inconsistent with making SEQUENCE an optional
synonym for BEGIN. Since it is not encouraged for use and is not going
to exist in all dialects, is there any sense to including it here?
* The entire section on datatypes is hopelessly muddled. About the only
useful thing said is that anything which is a first class object must
have unlimited extent.
* In the sentence "There is an object which represents both false and
the empty list", I cannot discern whether that means there may/must
be one/two objects filling that description. Shouldn't we say,
"False and the empty list must be represented as first class objects
and those objects {may,must} [not] be distinct." or some such?
* Since datatypes are not declared to be disjoint, it isn't necessary
to mention that characters may be represented as numbers, except perhaps
as a footnote to remind the forgetful reader. Strings can be represented
as numbers, too, the way things are written.
* Was there really anyone who thought streams shouldn't be first class
objects? Since datatypes aren't disjoint and such objects could be
indistinguishable from numbers or arrays or whatever, is there really
a reason to care?
* The unary procedure NOT should be defined to return "a true value if
its argument is false and a false value if its argument is not false."
... rather than "if its argument is true." for the second part.
* I suggest renaming CALL-WITH-CURRENT-CONTINUATION (or CALL/CC) to just
CONTINUE. eg,
(CONTINUE (LAMBDA (C) (IF (FOO) (F C) (G C))))
Anyone else support this?
* By the way, saying the escape procedure has unlimited extent doesn't say
it can be called more than once. Does everyone agree to either stipulate
that or not?
* If "the unary predicate NUMBER? is true of numbers and false of
everything else" and "the unary predicate INTEGER? is true of
integers and false of everything else", I don't suppose this says
much since types are not disjoint and so strings are not necessarily
not numbers and need not necessarily cause INTEGER? or NUMBER? to
return false. Certainly characters needn't yield false from NUMBER?
or probably from INTEGER?. As such, these predicates are of limited
value.
* What is the point of making claims about what "almost all implementations"
will do for real numbers? Either they're required to or they aren't.
The rest belongs in some other document.
* I don't agree that allowing generalization of +, -, *, and / to arbitrary
arity is a good idea or even well-defined. eg, the proper generalization
of - to arity 1 is (- 3) => 3, not (- 3) => -3. Hence, specifying that unary
negation is optional is in conflict with specifying that - may be generalized.
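One way to read "generalize to arbitrary arity" is as a left fold of the binary operator over the argument list. Under that reading, which is an assumption of this sketch and not something the report states, the one-argument case returns its argument unchanged. The Python below illustrates it; `minus` is a hypothetical name.

```python
from functools import reduce
import operator

def minus(*args):
    """Binary subtraction generalized as a left fold over 1 or more args."""
    # With a single argument and no initial value, reduce returns that
    # argument unchanged, because there is nothing to subtract from it.
    return reduce(operator.sub, args)

print(minus(10, 3, 2))  # 5, computed as (10 - 3) - 2
print(minus(3))         # 3, not -3
```

Unary negation is therefore a separate convention layered on the same name, not the one-argument instance of the fold.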
* Will-- The discussion of QUOTIENT/REMAINDER and of CONS/CAR/CDR should
use the word "respectively" in the appropriate places. When I first read
that QUOTIENT and REMAINDER return the quotient and remainder, I spent
an unduly long time flipping back pages to see if you'd allowed
multiple values before I realized that it would be silly for both functions
to do the same thing, or for that thing to be what they had first seemed
to me to be doing.
* I don't see why MIN/MAX should be restricted from arity 0. They should
just return the smallest and largest representable numbers. I guess as
long as they aren't defined to signal an error in this case, individual
dialects could be extended anyway.
* It should be made explicit whether (= 1 1.0) is defined to work. Note that
this may be tricky since even (= 1.0 1.0) won't necessarily work if the
1.0's were computed rather than read and have different bit patterns that
are too tiny to make a difference on output.
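The effect is easy to reproduce on a present-day implementation. The Python sketch below assumes floats are IEEE 754 double precision (true of CPython on common platforms) and uses a six-significant-digit format as a stand-in for a printer that rounds on output; neither detail comes from the report.

```python
computed = 0.1 + 0.2   # arrived at by arithmetic
literal = 0.3          # arrived at by reading a constant

# The bit patterns differ, so numeric equality fails...
print(computed == literal)                                 # False

# ...although both print identically at six significant digits.
print(format(computed, ".6g"), format(literal, ".6g"))     # 0.3 0.3
```

Any definition of = on non-integers has to say whether it compares exact representations, as here, or something looser.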
* It is silly to specify that implementations may "optionally" support
numbers that are non-integers. Why not just define that (NUMBER? x)
doesn't imply (INTEGER? x)? That definition wouldn't mean that
no number was an integer; it would only mean that a number
wasn't necessarily an integer.
Specifying that "almost all implementations" will support this option
is again silly and might in pathological situations be misleading.
* Is the definition of (TRUNCATE x) really correct? It looks like it must
be screwed up on the negative side near 0. eg, (TRUNCATE -0.5) doesn't
have the same sign as 0.5 does it? Or is there a negative 0?
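The report's wording for TRUNCATE is not quoted here, so the following Python sketch only illustrates the question being asked. It assumes Python's `math.trunc` as a stand-in for truncation toward zero and IEEE 754 floats for the signed-zero part.

```python
import math

truncated = math.trunc(-0.5)   # truncation toward zero gives the integer 0
print(truncated)               # 0

# An integer zero carries no sign, so it cannot share the sign of -0.5.
# A floating-point zero can, since IEEE 754 has a distinct negative zero:
negative_zero = -0.0
print(math.copysign(1.0, negative_zero))   # -1.0, the sign is preserved
print(negative_zero == 0.0)                # True, yet it compares equal to 0.0
```

A definition that requires the result to have the sign of the argument is satisfiable near zero only if the result type has a negative zero.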
* The meaning of "interning" a symbol should be specified.
* It should be stated explicitly that CAR and CDR of the empty list
are not defined.
* What's this nonsense about pairs being maybe indistinguishable from
vectors of length 2? Is there a good reason for that? It doesn't really
matter since numbers haven't been defined as distinguishable from
strings either, but it somehow offends my sense of aesthetics to see
this note here. Is this due to some problem with Maclisp HUNK2's or
something unrelated?
* In "The following descriptions use the notion of a proper list. The
set of proper lists is the smallest set satisfying:
the empty list is a proper list
a pair x whose CDR is a proper list is a proper list,
provided (MEMQ x (CDR x)) is false."
I think MEMQ isn't the function that you want, but I find it amusing
to see the language defined meta-circularly in this way (since MEMQ
is almost certainly defined to terminate only on proper lists and may
even want to type-check proper-list-ness).
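A check for proper-list-ness that does not lean on MEMQ can be sketched as follows. This is Python for illustration only: the `Pair` class, the use of `None` as the empty list, and the name `is_proper_list` are all assumptions of the sketch. It walks the CDR chain and remembers each pair by identity, so it terminates on circular structure instead of assuming the very property it is testing.

```python
class Pair:
    """A mutable cons cell (hypothetical stand-in for a Scheme pair)."""
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

EMPTY = None  # stand-in for the empty list (an assumption of this sketch)

def is_proper_list(obj):
    seen = set()                  # identities of pairs already visited
    while isinstance(obj, Pair):
        if id(obj) in seen:
            return False          # came back to an earlier pair: circular
        seen.add(id(obj))
        obj = obj.cdr
    return obj is EMPTY           # proper only if the chain ends in EMPTY

proper = Pair(1, Pair(2, EMPTY))
dotted = Pair(1, 2)
circular = Pair(1, EMPTY)
circular.cdr = circular           # CDR points back at the pair itself

print(is_proper_list(proper))     # True
print(is_proper_list(dotted))     # False
print(is_proper_list(circular))   # False
```

The identity set does the job the (MEMQ x (CDR x)) clause was reaching for, without calling a function that is itself only guaranteed to terminate on proper lists.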
* Is the function LENGTH defined to err or to not return when given
a circular list? What about an otherwise improper (ie, dotted) list?
* The definition of APPEND is poor. It should be defined with NAMED-LAMBDA
for safety in situations where APPEND gets redefined. Also, its text
description is too windy.
* I see no reason for APPEND! to be defined to possibly side-effect
either arg. This may force lots of needless copying in order to write
provably correct programs. I can't imagine a definition of APPEND!
which would want to destructively modify its last argument.
* All these definitions (APPEND, REVERSE, ...) are ugly due to the
silly restrictive version of DEFINE. I certainly wouldn't want my
students programming like that.
* It should be stated in English what happens if LIST-REF and LIST-TAIL
fall off the end. I assume it follows from the definitions of CAR/CDR
that such is a signallable error.
* There should be MEMQ?, MEMV?, and MEMBER? to match MEMQ, MEMV, and
MEMBER. This enhances garbage collection since if these functions
are only being used for truth value, you don't want to hold pointers
to potentially large list structures. Also, it enhances debugging since
if F is a function on booleans, (F (MEMQ X Y)) will receive true/false
rather than a list or false. Ditto for ASSQ?, ASSV?, and ASSOC?.
* You specify no order of evaluation for MAPCAR. I think you mean no
order of "application".
* I dislike the asymmetry between MAPCAR and MAPC.
MAPC has no defined return value, MAPCAR does.
MAPC has defined order of application, MAPCAR does not.
In short, they have nothing really in common other than the type
of their args. I think they should not be named so similarly.
* I oppose the names MAPCAR and MAPC.
T calls these MAP and WALK, respectively.
The generic form of MAPCAR, which is the only thing for which
arbitrary order of application would make sense (since lists are
only sequentially accessible anyway), has no business being called
MAPCAR.
* With respect to the questions about VECTOR->LIST, I think the
right thing to say is that the conses it returns are mutable,
not that the result is necessarily a "new object", since if the
result is the empty list (eg, from an empty vector), I wouldn't
want the implication to be that
(NOT (EQ (VECTOR->LIST #()) (VECTOR->LIST #())))
since it follows from that that more than one false value must
(rather than "may") be possible.
* What happens if VECTOR-REF is out of range?
* VECTOR-SET! is ugly. It should at least be called SET-VECTOR-REF!
for symmetry with the other SET- things. Personally, I hate the
! and would strongly prefer just SET-VECTOR-REF.
* The relation between OBJECT-HASH and the GC should be specified.
Do things get GC'd if no other pointers exist to them? Also, it
might help to distinguish this kind of "hash" from the number that
comes from SXHASH in Maclisp. It took me a second to realize you
weren't talking about that.