[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
RE: lisp conf scheme agenda
I cannot make it to the the RRRS meeting on Sunday, 24 July. I am
teaching this summer and cannot take the time off.
I am a graduate student at IU, I have written my own Scheme System with
native code compiler on the VAX.
Here are my comments on Will's proposed agenda found in RRRS-AUTHORS.
Some of my comments are somewhat mundane. Some of them begin to be
profound.
I do not feel that now is the time for a standardized Scheme. There are
too many open issues: continuations, concurrency, macros, environments, etc.
I apologize ahead of time for any typos or opaque writing. I have made a
good deal of effort to remove all such, but I am now feeling rushed for time
to get this out before everybody starts flying to Snowbird Utah.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> RESERVED WORDS AND PORTABLE CODE. Require that implementations
> provide some way for programmers to work in a syntactic environment
> containing no reserved words other than those found in the report,
> without requiring that this be the default syntactic environment.
I'm not sure this leads to a desirable situation. If the R*RS is so yucky
that other environments are needed to actually do real programming, why even
have the R*RS environment? If nobody's using the R*RS environment for
serious work, then it's just a cute toy.
> DISJOINTNESS OF TYPES. Require that the following sets of objects
> be disjoint: booleans, pairs, symbols, numbers, characters, strings,
> vectors, procedures. Issue: add the singleton set containing the
> empty list to the list. Issue: remove characters from the list.
> Issue: add promises to the list, or flush force and delay altogether
> (see **OTHER PROPOSALS**).
Lacking a reason to make life more verbose by removing all vestiges of
shorthand/overloading ("punning"), I support dividing ALL types/values into
two classes: the class of TRUE types/values, and the class of FALSE
types/values. With this, there should be at least one true value,
designated #t, and one false value, designated #f. I don't care whether
these values are disjoint from other values of the same class or not.
I don't support a distinct boolean type. If there IS a boolean type, then
there should be exactly two boolean values (values interpretable as
boolean), #t and #f, and those should be the ONLY values allowed from the
first sub-expression of an IF. Having a distinct boolean type, and then not
really using it is ugly! Enforcing it adds a run-time error condition check
to IF, which is undesirable.
> ELIMINATE NIL, T. Remove nil and t from the report.
Human engineering is wierd. When I first learned LISP in 1975, NIL
offended me because it didn't look right (like a symbol, not a list). Then
I started actually writing code, and discovered that "nil" was a LOT easier
to see than "()" or "'()". And I've used NIL ever since. When I learned
SCHEME, I (obviously) had to bow to its insistance on calling "nil" a
symbol, but I still use it when I can, simply because its still easier for
me to see!
I propose (half jokingly) that the syntax "nil" denote the empty list and
not a symbol. That is, the single special case of the #! syntax.
> EQV?. Clean up the specification of eqv?, removing the requirement
> that it be true of two empty strings or empty vectors.
My impression of Will's EQV? axioms when I scanned over them is that they
looked an awful lot like EQ? and I didn't see the point (of having something
just barely different than EQ?).
I've always thought EQV? should return TRUE when EQ? "should" (due to
hashing) be TRUE, but isn't (bignums, strings, etc, but NOT lists).
I don't think EQV? has the right to be unspecified on anything computable.
EQ? does have the right since it ONLY compares pointers (without EVER
dereferencing them!).
> DUPLICATE FORMALS ARE IN ERROR. The list of formal variables in
> lambda, let, letrec, and do (but not let*) should not contain
> duplications. E.g. (lambda (x y x) ...) is an error. In the case
> of named let, the formals in the initial bindings list should be
> distinct, but it is ok (albeit useless) for the "name" to duplicate
> a formal. E.g. (let foo ((x 1) (foo 2)) ...) is legal.
Okay. (Pssst! What's the problem?)
> CHANGE WORDING OF LETREC RESTRICTION. From "...without referring
> to the value of any <variable>" to "...without referring to the
> value or location of any variable". Example:
>
> (letrec ((wanna-be-a-doctor 'doctor)
> (imagine-my-surprise! (begin
> (set! wanna-be-a-doctor 'nurse)
> 'zowie)))
> wanna-be-a-doctor)
Closure creation refers to the locations of its free variables. Therefore
this wording bugs me. Why not just come out and say "without STORING or
FETCHING from the location of any LHS-variable". The vague wording above
confuses me, and I know what it means!
> CLARIFY WORDING OF LETREC. Add the word "ALL" as follows:
> "...in a letrec expression, ALL the bindings are in effect while
> their initial values are being computed..."
I'd be happier with a wording of something more like: "All the
LHS-variables exist but are considered unbound while the RHS-values are
being computed. Only when all the RHS-values have been computed are the
LHS-variables considered to the bound to their corresponding RHS-values.".
> CLARIFY MEANING OF QUASIQUOTE. As in Pavel Curtis's proposal.
I missed Pavel Curtis's proposal (sorry Pavel!). However, my I already
feel that quasiquote should produce all fresh pairs. That is,
`(1 2 3) expands to (cons 1 (cons 2 (cons 3 '())))
And I will (and have) rewrite(n) quasiquote to have this behavior in scheme
systems that don't have this behavior.
> CLARIFY STATUS OF CI. Though CI is described as a suffix,
> it is generally just embedded.
No opinion.
> CLARIFY THE SPECIFICATION OF TRUNCATE.
> IMPROVE THE DISCUSSION OF EXACTNESS AND INEXACTNESS.
> CHANGE THE SYNTAX OF NUMBERS.
> CLARIFY THE STATUS OF EXPONENTS.
> EXPONENTS ILLEGAL IN FRACTIONS.
> EXACTNESS AND INEXACTNESS OF CONSTANTS.
No opinion. I haven't worried about numbers. Yet.
> **PART 2: POLICY ISSUES**
> FLUSH OPTIONAL FEATURES. Do away with the distinction between
> essential and optional features. In effect, make everything
> essential. Issue: some optional features should be dropped rather
> than made essential; which?
My Scheme system only has integer fixnum's. I feel that for my system to
be a "real-life scheme", it needs at least integer bignums. On the other
hand, I don't feel in any hurry to add inexact real bignums. Though if I
did, it would be nice if I had a standard to match up with. A library of
standard algorithms would be nice too.
> MAKE PROCEDURES MORE REGULAR. Add vector-copy, list-copy,
> list-fill!, list-set!, (make-list k); or remove procedures
> to make them more regular across the different kinds of
> sequences. Issue: what's the policy? Issue: generic
> copiers, fillers, etc. Issue: optional or essential?
It would would be nice to have a consistant set of fundamental operations.
However, though a few years back I was much in favor of generic copiers,
length measurers, etc, I am now strongly in favor of type-specific ones.
(LIST-LENGTH, STRING-LENGTH, etc). COPY is a little stranger, in that you
might desire to copy heterogenous data structures, and in that case a generic
COPY would make sense. Otherwise, a pair-specific copier still is useful.
> UNDERSPECIFICATION. What kinds of underspecification are
> desirable? What kinds are undesirable?
No unspecification is desirable. Optimizers should be completely
invisible. Scheme programs should have the same behavior on all scheme
systems. I should never have to consider "the optimizer" when I am
reasoning about my program (unless I EXPLICITLY set an optimizer level, and
in this way become implementation dependent).
*************************************************************************
* *
* Let me restate that as a general principle for emphasis *
* and clarity: My Scheme program should only become *
* implementation dependent if I explicitly say so, *
* otherwise the scheme program should be completely *
* portable across all scheme systems in syntax and behavior. *
* *
*************************************************************************
The only reasons for underspecification I've heard are to allow
concurrency and to allow stupid optimizers. Concurrency is out of the
question since then is no protection mechanism for critical sections. The
concept of condoning stupid optimizers horrifies me since I can imagine with
horror one that started inlining code, and then interleaving the evaluation
of sub-expressions "for efficiency". PERMUTE is completely undefined, you
know. No axioms, nothing.
Speaking of PERMUTE: you do realize that the lack of a definition of
PERMUTE makes this touted denotational semantics incomplete to the point of
uselessness? If non-determinism is so easy to reason about, how come the
denotational semantics can't? How am I supposed to do my proofs with a
missing fundamental and much used definition?
Query: What does:
(begin
(set! x 1)
(list (set! x (* x 3)) (set! x (+ x 4)))
x )
mean? (3, 5, 7, or 15?) What does:
(call/cc
(lambda (k)
(list (k 1) (k 2) (k 3)) ))
mean? (1, 2, or 3?)
> **PART 3: MAJOR PROPOSALS**
> MACROS. No complete proposal is ready for consideration. The
> macro and extend-syntax syntaxes have been proposed as least
> common denominator(s) to tide us over until we have a real macro
> proposal.
I strongly oppose any system that won't allow me to write naive macros.
Renaming my symbols without my express permission and not telling me is not
human engineering (macro substitution is not necessarily beta-reduction!).
It simply does not scale to tractable proofs of large macros. Other than
that extend-syntax suits me fine if it remains disjoint from the macro
expansion facility itself.
> MODULES.
> REPLACE LOAD WITH INCLUDE. The load procedure is convenient for
> interactive program development, but its dependence on an object
> with state (the external file system) makes it less satisfactory
> than include would be for describing complete programs.
> Presumably implementations would retain load as part of their
> programming environment even if it were replaced by include in
> the report. Issue: An alternative is to change the meaning of
> load in the report. Issue: Include also depends on an external
> file system; the only difference is that the dependency is removed
> at compile time rather than run time.
I oppose making *interactive* program development harder in favor of batch
processing. Both LOAD and INCLUDE have their uses. Both should be allowed.
Neither can provide the functionality of the other.
The top-level expressions in a file being LOADed should have exactly the
same semantics as typing the expressions in, one at a time, from a
read-eval-print loop. INCLUDE doesn't have to have this semantics.
In fact, I'll claim that LOAD starts up a fresh read-eval-print loop,
reading from a file instead of a keyset, and writing to oblivion instead
of a screen. It terminates upon EOF.
I conceive of INCLUDE more like a macro that returns that contents of a
file (probably wrapped with a begin).
> FIRST CLASS ENVIRONMENTS. In my opinion, no proposal for
> environments remains on the table. A proposal could be built
> around make-environment, the-environment, build-environment,
> Guzowski's with, or Pavel's recently withdrawn proposal.
> ADD EVAL. Add eval to section 6.10.4. Issue: one or two arguments?
> If two, how do you get an appropriate second argument?
> ADD SECOND ARGUMENT TO LOAD. If eval is added and takes a second
> argument, then shouldn't a second argument should be added to load
> also?
The fundamental abstraction is a read-eval-print loop (REP loop). A REP
loop reads from a stream of expressions (usually a keyset or a file (with
LOAD)) and writes a stream of values (usually a screen/window, or a datasink
(with LOAD)). A REP loop reads expressions, evaluates them for value AND
effect, outputs their values, then starts over again. This is a very
iterative, side-effecty approach, and I class it as one of the activities of
Meta-Programming, of which debugging is also a member. EVAL means nothing
without a REP loop, or something like it.
In particular, I view "define" as a command to a REP loop to add or modify
a symbol to its list of definitions (its extendable "base environment").
So called "local defines" as they are presently defined are therefore a
complete misnomer. They have a completely different meaning and use from
"top level defines". I would much prefer to refer to local defines with the
keyword "declare" and NOT "define". I don't care what "declare" means at
top level, I don't use "declare". I do use "define" as described above.
> ADD LAMBDA*. Adopt the Hieb/Dybvig proposal for procedures that
> dispatch on the number of arguments they are given, for allowing
> multiple values to be stored in variables, and for the & syntax.
> This proposal is an alternative to the next two proposals.
I used Bruce Duba's "match-lambda" to write the variable argument
procedures in my scheme system. I used the:
(define! write-char
(match-lambda
[ (char) (write-char char (default-output-port)) ]
[ (char port)
(die-if-not-port port)
(die-if-not-char char)
(syscall #!writec port char) ] ))
style to not replicate code. However, I did observe that this programming
approach uses MULAMBDA and thus eats PAIR's and causes garbage collections
(and my version 0 garbage collector is SLOW).
APPLY forces its list argument into the form of an actuals-list (which is
isomorphic to a finite (acyclic) proper list). MULAMBDA (lambda w/ #!rest)
converts actuals-lists to pair-lists. So there is already an internal
actuals-list representation, which causes all this gyration with APPLY and
MULAMBDA. Punning the actuals-list with multivalues-lists seems natural.
HAVING to convert back and forth from pair-lists is inefficient and
unnecessary.
I like the "&" proposal, but I haven't formally worked out the semantics
nor implemented it yet.
> MULTIPLE RETURN VALUES. Add optional procedures values and
> with-values such that (values x1 ...) returns values x1 ... and
> (with-values thunk proc) calls proc on the values returned by
> thunk. Issue: what do continuations that currently expect one
> return value do when given zero values or more than one value?
> The most popular answers are:
>
> a. It is an error.
> b. Ignore extra values; it is an error if there are no return
> values.
> c. Ignore extra values; the continuation gets an unspecified
> value if there are no return values.
> d. Ignore extra values; the continuation gets #f if there are
> no return values.
Unless the continuation takes variable numbers of arguments, it is an
error for the wrong number of values to be given to it. General principle
holds that adding and ignoring values is bad. Examples might convince me
otherwise. Annotation to specify which of many values is desired is easy.
> OPTIONAL ARGUMENTS. Add an optional #!optional syntax to lambda
> expressions to support optional arguments:
>
> (lambda (x #!optional y z . w) ...)
>
> If not supplied, y and z are bound to new locations holding a
> special default object. Add the procedure default-object? so that
> (default-object? y) is true if y is not supplied. It would be
> possible to pass the default object as an actual argument, thereby
> making it possible to obtain the default for y while specifying z
> explicitly.
I tried using a default-object approach for my variable argument
procedures. I dropped it in favor of the multiple formals list approach
described above, as it was clearer and shorter.
> RECORD OBJECTS. Jonathan's proposal:
No opinion.
(Later) I lied. I think RECORD OBJECTS should be a user-library module.
The need of RECORD OBJECTS to create objects of TYPE disjoint from all other
TYPEs is a very strong argument in favor of providing user defined types.
(Proposals anyone?)
> **PART 4: OTHER PROPOSALS**
> CHANGE WORDING OF SET! RESTRICTION. From "<Variable> must be
> bound in some region enclosing the set! expression or at top
> level" to "<Variable> must be bound in an enclosing lexical
> environment". Issue: need examples of code that illustrate
> the intended differences between the two wordings.
If you distinguish between "define" and "declare"... (see my discussion
elsewhere) I want to define globals while embedded in LETRECs.
Also, this doesn't seem too friendly to the interactive user (who your
already dumping on anyway, but...).
> FORMAL SEMANTICS FOR NUMERIC CONSTANTS. As proposed by Clinger,
> who doesn't think it is necessary.
With fifty different types of numbers, something formal might be nice.
> ELIMINATE DO. Remove do from the language.
Won't bother me...
> ELIMINATE NAMED LET. Remove named let from the language.
I use REC and RECUR.
> ELIMINATE =>. Remove the => notation from the language.
Never used it. (later:) Eeek. Never will either.
> ELIMINATE CASE. Remove case from the language.
HUH?!?
> ELIMINATE FORCE AND DELAY. Or make promises disjoint from other
> types.
I've decided to make suspensions (ala Friedman&Wise, 1976) disjoint from
other types in my Scheme System as I like to make streams of closures, and I
can't tell the difference between a closure and a suspension realized as a
closure. And this is a problem.
> ELIMINATE LAST-PAIR. Remove last-pair from the language.
I've used it. I can write one myself.
> ELIMINATE NUMERIC FORMATS. Eliminate (or redesign or simplify)
> the format arguments to number->string.
No opinion.
> ELIMINATE CALL-WITH-CURRENT-CONTINUATION. Remove
> call-with-current-continuation because it makes programs hard to
> understand.
I don't use CALL/CC (well I haven't for quite a while). I DO use
Continuation Passing Style, and my programs are probably hard to read. But
previously intractable programs are easy to write. I DO support functional
continuations and prompts, and have already (ab)used engines a number of
times to partially achieve this functionality (when I couldn't convert to
Continuation Passing Style).
> DEFINE WITH NO EXPRESSION. Optionally allow (define x) for top level
> definitions only. Issue: why not for internal definitions? Issue:
> If it becomes available for internal definitions, then it should also
> be available for letrec, let, let*, named let (?), and do (??).
Do NOT allow "define" with no expression. Do allow "declare" with no
expression. See my earlier discussion. No such thing as local defines.
Allow local "declare"s with no expression.
Symmetry arguments indicate that LETREC et al should allow empty
expressions. I wouldn't (and have never needed to) use them.
> ADD CALL. Add essential syntax: (call proc arg1 ...).
> ADD STATIC. Add essential syntax: (static id).
My compiler converts to this form. Almost. It also puts all constants
inside of QUOTE expressions, I'm contemplating recognizing tail recursion at
this point in the compilation process by splitting CALL into CALL and GOTO,
and I already split lexical variables from globals with GLOBAL and LEXICAL
(instead of static). I also make all variable locations explicit, but I am
probably going to retract that later on, unless I get a good algorithm for
detecting possibilities of stack-and-not-heap allocation.
So I might be against it because its not the way my compiler would want it.
Supporting code traversers this explicitly might be a bad idea anyway.
> PEEK-CHAR. Add a peek-char procedure. Issue: essential?
Would be cleaner and more robust than writing it yourself. I'd rather
worry about the I/O package in it entirety, though.
> ADD EQ-HASH, EQV-HASH.
No opinion.
> ADD DAYS-AFTER-J2000.0.
I favor a full fledged date treatment system/library.
> VALUE RETURNED BY ONE-ARMED IF, COND. Specify that (if E0 E1) returns
> #f if E0 is false, and that (cond ...) returns #f if no clauses apply.
I don't like one-armed if's (I use WHEN and UNLESS). I dislike
unspecified values. #f is fine. Just be sure to maintain tail-recursion
(Hi Bob!(:-)(:-)(:-)).
> VALUES RETURNED BY SIDE EFFECTS. Change the semantics of
> (set! x e), (set-car! x e), (write e p), et cetera, so that
> they return the value of e. Issue: why not return #!unspecified?
My Scheme System returns "e". #!unspecified is stupid. If you are
"lucky" it induces a run-time error right away. If you aren't, and didn't
throw it away, then your program is a time-bomb. Either using the value of
a side-effecting expression causes a COMPILE-TIME error, or it returns a
legitimate value -- anything else is bogus.
Anyone who doesn't use the value (ie throws it away) doesn't care what the
value is. #!unspecified doesn't even help him, really, in the event that he
slips. Anyone who wants to use a value needs something reasonable, and
#!unspecified isn't. The situation is not symmetrical.
> LEFT TO RIGHT EVALUATION OF ARGUMENTS. Change the semantics of
> procedure calls so that expressions are evaluated from left to
> right.
It would make program proofs easier. Combinatorial explosions tends to
make things intractable. See my previous arguments.
> LEFT TO RIGHT EVALUATION OF DEFINITIONS. Change the semantics
> of internal definitions so that they are evaluated from left to
> right. Issue: do the same for letrec?
Evaluation of RHS expressions should occur Left to Right. Binding of RHS
values to LHS variables should occur in no particular order, though it would
be nice if the binding occurred after ALL of the RHS expressions were
evaluated.
> CHANGE THE SPECIFICATION OF AND. Change and so it always returns
> #t or #f. Issue: the last position of an and becomes non-tail-
> recursive. Issue: If this proposal is not adopted, must and
> return the first false value it sees, or can it simply return #f?
Why brain-damage AND? Last position should be tail-recursive. Non-#t
values are okay, as are non-#f values. I don't believe in a distinct
boolean type anyway.
> RENAME CHARACTER COMPARISON PREDICATES. Change char=? etc to
> char= etc.
I use "char-eq?", "char-ne?", "char-ge?", etc. Its easier for me to see,
but its non-standard, which causes boot-strapping problems.
My feeling is that "=", "<>", "<=", etc don't need the "?" because they
already consist of special characters, and culturally are understood to
return true and false values. This argument presumably extends to relations
with alphabetic prefixes.
> RENAME SET! TO SET.
No. Why?
(Later:) LET is the non-destructive version of SET!. Seriously!
(That's the way I think about it sometimes).
Chez Scheme's Fluid Bindings are another non-destructive version
of SET!.
If I had first-class locations, I might have a third version.
> RENAME SET-CAR! TO SET-CAR, SET-CDR! TO SET-CDR. Issue:
> non-destructive versions make sense even though they're not in the
> report.
Not for destructive versions.
Keep the bang ("!") notation for all. Pronounce it as "bang" and not as
"destructively". Easier pronunciation might make you feel better.
> RENAME CHAR-READY? TO READ-CHAR-READY?. Issue: name:
> READY-TO-READ-CHAR? Issue: replace instead
> with read-char? (TYI-NO-HANG)?
READ-CHAR-NO-HANG. For some reason I'm concerned about non-atomicity.
> RELAX THE LETREC RESTRICTION. Require implementations to find
> an order that works, if any does. Issue: not computable?
Topological sorts of dependency graphs is certainly computable. I think
is even linear in time. The problem is what to do with cycles.
I'm happy with LETREC the way it is. If somebody wants to topsort, and
solve cycles, let them write a library macro called something else, and make
it publicly available, I won't mind though...
> DON'T SPECIFY WHETHER THE EMPTY LIST COUNTS AS TRUE OR FALSE.
I don't believe in booleans. So I believe the empty list is a false value.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> LOOSE ENDS not in Will's note, but in typeset copy handed to be just now:
> ADD TYPE-OF: (from Pavel)
In my implementation, there is an object of each type that doesn't exist.
This is an artifact of the linker, but rather than dismiss it as a hack, and
be rid of it, I kept them around in wait for a use for them. (The singleton
objects: (), #t, #f, etc are all of different primitive types, but are
POINTER-EQ? to the ghost-object of their types).
I was thinking that these ghost-objects would be ideal first-class types,
which implies that a first class type is of its own type:
(pair? (type-of (cons 1 2))) ==> #t
I even defined a predicate TYPE-EQ? that compares types of two objects,
which implies:
(define ghost-pair (type-of (cons 1 2)))
(define pair? (lambda (x) (type-eq? x ghost-pair)))
This seems rather implementation specific, but thought I'd throw it out
to see if anyone could abstract it to something interesting.
> CHAR renamed to CHARACTER (pavel):
I actually wouldn't mind INT, BOOL, and RAT. But then thats largely
because my screen is only 80 characters wide. I also find very long
identifiers hard to read.
I guess it comes down to: if you renamed CHAR to CHARACTER, I'd probably
rename them back.
OH! The real reason is that INTEGER, RATIONAL, and BOOLEAN are special.
That is, we don't have INTEGER+, INTEGER>=, RATIONAL*, RATIONAL<>,
BOOLEAN-AND, BOOLEAN-NOT, etc. In this sense CHAR is not special, and hence
is worthy and needful of the shorter name.
> CHANGE EXPONENT MARKER from E to ^.
It would be confusing with exponentiation in other languages...
I like the thought, though.
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> LOOSE ENDS not found in proposal anywhere:
> CASE STUPIDITY. Is case-insensitivity while reading symbols sane? Is it
> human engineering? Does it open whole new realms of error making? Is
> ASCII too powerful? Wouldn't Morse Code be simpler? Do you miss the
> grand old days of FORTRAN where even spaces weren't real?
At the same time I was chafing against the limitations of only two cases
of letters in ASCII, case-insensitity is foisted upon me. To me,
case-insensitivity is a hallmark of ANCIENT, not modern programming
languages. But be that as it may, case-insensitivity CAUSES errors.
I cannot believe that someone can think that "FOObaz", "FoObAz", "fooBAZ",
"foobaz", "FooBaz", "FOOBAZ", and "fOOBAz" are all the same identifier.
I've had to deal with programs that spelt keywords and variables a multitude
of ways, and it was just one more indication of the overall downright SHODDY
design. It was just another way to be sloppy, and the compiler allowed it.
I have two proposed solutions:
(1) Case Consistancy: the identifier must be refered to with the same case
signature across all usages. To refer to both "FOObaz" and "fooBAZ" is
a compile time error (NOT a read-time error!).
(2) (EQV? 'FOObaz 'fooBAZ) ==> #t, (EQ? 'FOObaz 'fooBAZ) ==> #f
have the compiler use EQV?, and let the macro expander use EQ?,