Syntactic extensions to Scheme

I would like to point out that I believe that MIT Scheme has more
power in its syntactic extension mechanism than is necessary or
desirable.  In particular, I have found that there are only two
general scenarios for macro use (please correct or expand on this):


The first case is the local use of a macro to eliminate syntactic
repetition.  For example, suppose one wanted to say something like:

(define <var1> (lambda (x) (vector-ref x 1)))
(define <var2> (lambda (x) (vector-ref x 2)))
(define <varn> (lambda (x) (vector-ref x n)))

In this case a local macro could be used profitably, as in (MIT
Scheme syntax):

(let-syntax ((make-ref
	      (macro (var index)
		`(DEFINE ,var
		   (LAMBDA (X)
		     (VECTOR-REF X ,index))))))
  (make-ref <var1> 1)
  (make-ref <var2> 2)
  (make-ref <varn> n))

The combination of LET-SYNTAX and MACRO used here gives sufficient
power to do local definitions like this, and also to define the macro
somewhere else and refer to it later by name, as in

(let-syntax ((make-ref the-make-ref-macro)) ...)

Note that this makes no explicit use of the concept of syntax table,
despite the fact that it is implemented using that mechanism.

Which environment the binding value of the LET-SYNTAX form is
evaluated in is an important issue here; in fact it is specified
externally, by passing an explicit environment to the syntax expander
program, as Jinx described earlier.  In practice, though, local macros
such as these require only the normal global environment, and any
local procedures can be defined using something like:

(let-syntax ((make-ref
	      (let ()
		(define ...)
		(define ...)
		(macro ...))))

A particular difference between this mechanism and the DEFINE-SYNTAX
mechanisms described by JAR and Jinx is that, psychologically, it is
less likely that someone would be confused about what environment and
at what time the binding value of the LET-SYNTAX would be evaluated.
Such confusion is still possible, but I believe it is easier to think
of the binding value as being separate from the rest of the code than
in the case where the macro definition appears as just another top
level definition.


The second case is the definition of a whole language of syntactic
extensions, which will be used over a large body of code.  An example
of this usage is my recent implementation of the Edwin editor, in
which a number of macros for defining editor commands, variables, and
modes were used throughout much of the source code.  I believe that
this case is the one that normally provides much of the trouble.

I have taken the following conservative approach (I suspect that this
will draw some good criticism; my only response is that it has proven
adequate to date):

* I have assumed that each type of syntactic "language" will
correspond directly to a particular syntax-table.  The
single-inheritance mechanism used by syntax tables is sufficient for
many needs.

* I have assumed that definition of the syntactic extensions is
syntactically separate from their use.  This implies that the syntax
defined by the extensions is NOT available for use at the time that
the definitions of the extensions are processed by the syntactic
expander.  (I hope that this wording is sufficiently general that it
does not restrict my comments to MIT Scheme's implementation.)

In practice, this means that the syntax definitions reside in one or
more files separate from the source code that refers to them, and that
the definitions do not become effective until those files are loaded.

Here is an example of how one would define some syntax:

(define edwin-syntax-table
  ;; This means the parent of this syntax table is the "normal" one.
  (make-syntax-table system-global-syntax-table))

(syntax-table-define edwin-syntax-table 'define-command
  (macro ...))

(syntax-table-define edwin-syntax-table 'define-variable
  (macro ...))

et cetera.  Supposing the above expressions have been evaluated in
some environment, the following code could then be evaluated (or
compiled without evaluation):

(using-syntax edwin-syntax-table

  (define-command ("^R Forward Character" argument)
    ...)

  (define-variable "Comment Multi Line"
    ...)

  ...)


The major advantage of using this approach is that the scoping
problems associated with macros are sidestepped, since the
environments in which everything happens are eval-time environments,
rather than syntax-time environments.  Also, the confusion caused by
mixing syntactic and semantic definitions in the code is eliminated.

It can (easily) be argued that this separation is a drawback as well.
I think that this is more a problem of presentation than anything
else.  In particular, given the standard file-oriented methods of
maintaining systems, there is only one ordering of definitions, which
applies for both editing and evaluation.  More sophisticated systems,
which I believe many people are now developing, would allow the
editing presentations to be separate from the evaluation order.


In summary, I think that the DEFINE-SYNTAX and DEFINE-MACRO mechanisms
in both T and MIT Scheme are not particularly useful, and in fact are
confusing.  Their convenience I believe to be dubious, although I will
understand if there is disagreement on this point.  I do not see that
they add any real power that is not available in the cases I have
outlined above through the other mechanisms I describe.

In passing, I would like to note that I DO NOT believe that scoping of
syntactic keywords should be related to scoping of environment
variables.  I think that the only reason that this has ever been
considered a reasonable idea is because of the (perhaps regrettable)
choice to make the notation used for syntactic forms and combinations
identical.  However, I would be curious to see if KMP's semantics
sheds new light on this issue.

The other issue, about the use of SCode, is not so clear.  In general,
I have tried to assume that syntactic extensions map from one domain
to another.  Assuming that the two domains are separate is a
generality that turns out to have advantages in implementation, as it
allows the use of SCode, while not precluding purely source-to-source
macroexpansion, nor T's method of "absolute special forms".  One of
the examples that Jinx presented violated this principle (see the use
of WHILE-XFORM in the second example).  I am not really sure how to
handle this problem, but the separation of the domain and range of the
syntax mapping seems like a good principle to start with.