[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

recursive macro definitions




Hi everybody!

In the process of trying to understand various problems related to
lexically scoped hygienic macros I came across a seemingly interesting 
question.  I don't know if it has been addressed elsewhere already.

I'm currently working on designing an alternative set of rules for
scoping, particularly at the top level.  This all has to be integrated
with lexically scoped hygienic macros (together with a low-level
escape to non-hygienic macros), a separate compilation facility, and
eventually with a module system similar in spirit to the one I
described a year ago or so.

The problem I want to gather some opinions on has to do with plain old
hygienic macro definitions -- for the case that transformers can be
specified using other macro invocations.

Even R4RS hints that with a low-level macro mechanism it would be
possible to implement SYNTAX-RULES _as a macro_.  This raises some
interesting questions about scoping rules in recursive macro
definitions.

[I stumbled accross this problem when thinking about eliminating
DEFINE-SYNTAX, LET-SYNTAX, as well as LETREC-SYNTAX.  The right-hand
side of a definition would have to suffice for determining the role of
the identifier being defined.  It would also bring back the close
relationship between internal definitions and LETRECs -- to the point
where internal definitions are mere syntactic sugar.  But this is not
the point, since the problem arises even in the case of good old
LETREC-SYNTAX.]

To state it simply:  Are the right-hand sides of a macro definition in 
LETREC-SYNTAX allowed to use (invoke) any of the macros defined by the 
very same LETREC-SYNTAX expression?  -- We have two conflicting
intuitions here:

	1. Yes -- because the definition is recursive!
	2. No -- how can I invoke a macro whose transformer I don't
	   know yet?

Point 2 clearly is quite strong.  Consider:

	(letrec-syntax
	    ((m (m (syntax-rules ()
		     ((_ x) x)))))
	  (m 1))

If M turns out to be defined by the inner SYNTAX-RULES (perhaps
through some fixed-point magic), then, indeed, everything would be
consistent.  This is reminiscent of

	(letrec ((f (f (lambda (x) x))))
	  (f 1))

which, by the same magic could (perhaps) be made to work, but it is
deemed illegal right now.  (I see no practical reason not to rule it
illegal.  In fact, we would need to do something like not using the
least fixed point (which for F is `bottom') but some `higher' fixed
point.  That certainly _sounds_ quite messy...:)

For values we have the notion of a `suspended execution', and uses of a 
LETREC-bound variable within the right-hand sides of the binding
LETREC must only occur suspended.  Unfortunately, this property cannot 
be checked reliably at compile-time:

	(letrec ((ones (f 1 (delay ones))))
	  ...)

is legal only if F is bound to something like CONS and not to something like

	(lambda (x y) (force y))

Other languages (e.g. SML) adopt the policy of not permitting anything
but LAMBDA expressions in the binding initializers, which effectively
eliminates the problem.

Back to macros:  the `suspension' equivalent of macro expansion would
be to only allow invocations of the LETREC-SYNTAX-bound macros _in the 
output_ of the corresponding transformers.

This seems like a sensible solution, but it is somewhat
counterintuitive.

	(letrec-syntax
	    ((m1 (syntax-rules ...))
	     (m2 (m1 ...)))
	  ...)

would now look for M1's denotation in the environment the
LETREC-SYNTAX expression occurs in.  This `feels' more like a LET
rule, even though ultimately M2 can `refer to' M1 by outputting a
macro expansion that contains an invocation of M1.

The other possibility would be to disallow the above construction
altogether.  Fortunately, the condition under which this would have to 
happen is easily decidable.

But this brings us a step closer to a more relaxed (but also more
complex) condition: the definition of M2 could be allowed to use M1 --
as long as it doesn't also use M2 itself and as long as M1's
transformer is not specified using an invocation of either M1 or M2.
To formalize this we could say that it must be possible to order the
bindings by `numbering' them:

	0. There are n, n >= 0, bindings in the LETREC-SYNTAX.
	1. Every binding is given a unique ordinal i, 0 <= i <= n.
	2. The transformer specifications for binding i cannot
	   use invocations of macros defined by the same LETREC-SYNTAX
	   while bearing ordinals j, j >= i.
	3. All of the macros transformers bound by the same
	   LETREC-SYNTAX can use any of the MACROS so defined in their
	   output.

Since the last 4-part rule looks overly complex (even though I think
it isn't too hard to understand conceptually) we might want to not use 
it.

Example:

	(letrec-syntax
	  ((m1 (syntax-rules () ((_ x) x)))
	   (m2 (m1 (syntax-rules () ((_ x) x)))))
	  ...)

is legal (regardless of the context it appears in), while

	(letrec-syntax
	    ((m1 (syntax-rules () ((_ x) x)))
	     (m2 (m2 (syntax-rules () ((_ x) x)))))
	  ...)

is illegal (regardless of the context it appears in).  The
parethetical remark refers to a situation like:

	(letrec-syntax
	    ((m2 (syntax-rules () ((_ x) x))))
	  (letrec-syntax
	      ((m1 (syntax-rules () ((_ x) x)))
	       (m2 (m2 (syntax-rules () ((_ x) x)))))
	     ...))

which is _still_ illegal.  Effectively the LETREC-SYNTAX-introduced
bindings are all visible in all the right-hand sides, but some of them
are illegal to be used in some places.  The 4-point rule tells us
exactly when this is the case.

-------

I am more interested in getting rid of LETREC-SYNTAX and using LETREC
for all kinds of recursive bindings.  This is slightly tricky, since
it requires determining the nature of the right-hand side of a
definition.   [In my model of a low-level macro system all macros are
ultimately specified by some `primitive' transformer, for which there
is a special form.]  In order to do this we might have to expand a few
macros.  But for ordinary value definitions it must be possible to
invoke macros defined within the same LETREC, because otherwise it
would be pointless to mix the two together in the first place.

My last, complicated, 4-point rule could be extended by labeling value
defintions with `infinity' or such, therefore allowing invocations of
all macros there.

Note, that I am not asking you how one would implement any of this --
I think I have the algorithms to do it.  I'm trying to gather some
outside opinion and perhaps some pointers to related _design_
questions (preferably with answers :).

Opinions?  Suggestions?  Questions?  Hints?  Flames?

Best regards
-Matthias