[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Pattern matching, not optional arguments



   Date: Wed, 15 Apr 87 12:41:00 EDT
   From: Jonathan A Rees <JAR@AI.AI.MIT.EDU>

   Phil Wadler recently complained in SIGPLAN that Scheme has no built-in
   pattern matching like ML and its affiliates do.  I think there may be
   something to what he says.  

Yes indeed.  Phil's article was one of my motivations to push for
pattern matching (the other was what I saw happening to lambda).  Some
of the ways in which Phil finds Miranda superior to Scheme I disagree
with, feeling they are simply different; for example quote.  I'd like
to see strong type checking widely used in Scheme, but that is a very
difficult matter.  The one respect in which I feel Miranda is clearly
superior, and that we can address with relatively little effort, is
pattern matching.

   Yours treats lists and vectors
   asymmetrically, and doesn't allow for additional constructors without
   redefining the macro.  You should consider going even closer to ML,
   which you could do by changing the last two alternatives to be

	       | (list <pattern> ...)
	       | (list* <pattern> <pattern> ... <pattern>)

Did you mean | (list* <pattern> <pattern> . <pattern>) ?

   (In ML you'd write <pattern>,<pattern>,... instead of
   (list <pattern> <pattern> ...), and <pattern>::<pattern> instead of
   (cons <pattern> <pattern>).)

So let's also add | (cons <pattern> <pattern>), which is the same as
		  | (list* <pattern> . <pattern>).

   This eliminates the reserved word problem; it is possible to take a list
   apart and name its car VECTOR, and it is possible to make MATCH
   understand new kinds of constructors without causing old code to break
   due to name conflicts.  

Another way to eliminate the reserved word VECTOR would be to replace
the vector line with 

		| #(<pattern> ...)

This is consistant with the rest of the approach I took, which is to
have patterns match the print syntax, rather than the construction
expression syntax, as you would have it.  (I recall originally wishing
I could use the vector syntax above, but being forced to the VECTOR
keyword syntax because I was working in Scheme84 at the time and it
does not support the standard vector notation.  Sorry I neglected to
correct this damage before making the proposal public.)

The use of construction pattern syntax certainly is an alternative worth
consideration.  It also has the big advantage that quote becomes
completely consistent with the pattern style; e.g. (list 'x var) is
more natural than ('x var).  

The ability to add new constructors in a smooth way if constructor
syntax is used is potentially a big win.  But to do it right
(something like the clarification of your proposal below) complicates
the match mechanism substantially.  It also makes it clunkier to use
when only lists and vectors are involved, which I expect will be the
vast majority of the time.  It's much like the difference between
making things with list and cons vs. using quasiquote.  (Actually,
quasiquote might be used to make constructor syntax patterns.  Argh!)
If matching is to be much used for such things as optional arguments
(where only lists are involved), the simplicity of the print syntax is
a real plus.  It's a hard choice.

   In fact, if you changed
     (? <predicate> <pattern>)
   to
     ((? <predicate>) <pattern>)
   then you could change the syntax to be simply

       <pattern> ::= <variable>
	       | <number> | <string> | (quote <object>)
	       | (<expression> <pattern> ...)

Then what would be the meaning of the second and subsequent
subpatterns in a pattern of the form ((? <predicate>) <pattern> ...)?
The predicate should see an arbitrary value that can then be
destructured with a single pattern.

   This would permit you to say things like

	 (let ((widget list)) ...
	   (match w
	     ((widget wrench connector) ...)))

   to take a widget (implemented as a list) apart into its wrench and
   connector components, and we have taken another step closer to being
   like ML.  Giving a semantics for this in Scheme is harder than in ML
   since Scheme is a dynamically-typed language with only one namespace,
   and ML is statically typed and has four namespaces.  The feat can be
   accomplished by making MATCH expand into a call to some procedure, call
   it MATCH-INTERNAL:

	   (match x ((exp pat ...) body ...) clause ...) ==>

	   (let ((z x))
	     (match-internal exp number-of-pats-following-exp z
		(lambda (val ...) (match val ...) ...)
		(lambda () (match z clause ...))))

The (lambda (val ...) ...) success continuation above doesn't quite
work.  The the success continuation should be passed a fail
continuations (the same as the one passed to match-internal), and body
... should be within the scope of the pattern variables in each of the
subpatterns.  Match can also be implemented by expanding the tests
inline, which results in considerable code expansion and slower
compilation, but runs faster.  We've used both approaches at Indiana.

To provide for user defined constructors, some mechanism is needed for
associating a predicate and destructuring function with each
constructor.  This could be done in several ways, such as a global
(hash?)table or using constructor objects that responded to messages
asking for their associated predicate and destructuring functions.  If
constructors are regularly created in dynamic ways, the global table
presents garbage collection problems.  But let's not get carried away
with implementation concerns at this stage.

In any event, I suggest that the destructuring functions take two
arguments: a function to receive the component values and the value to
be destructured.  Thus APPLY would be the list destructuring function.

   (lambda (f v) (apply f (vector->list v))) 

could be used for vectors, but some implementations might provide a
more efficient version that didn't cons up a list.

If match is to support user supplied constructors, we had better be
convinced that they will be regularly used and that it is important
for them to be abstract.  If abstraction isn't critical, one can
always use lists or vectors that begin with certain flags to
distinguish objects created by new constructors.  For example, using
this old trick the widget constructor might be

   (define widget
     (lambda (wrench connector)
       (list 'widget wrench connector)))

A widget pattern (in print syntax) would then be

   ('widget wrench connector)

and no widget destructuring function is needed.  Our biggest use of
match so far has been doing this sort of thing to represent abstract
syntax.

   By the way, if numbers and strings can be patterns, then booleans and
   characters ought to be patterns too.

Yes.  This oversight was also a throwback to the original Scheme84
implementation of match.

   Jonathan

Chris