
Re: n-ary functions rfc



On Fri, 12 Jun 1992 14:56:19 PDT, Pavel Curtis <Pavel@parc.xerox.com> said:

> Before I respond to this suggestion, let me point out that there are
> several reasons why it is better to represent n-ary function
> arguments with a function rather than a list, even aside from the
> potential for optimization:

> 	* Procedures are a more fundamental data type in Scheme than lists are;
> 	  we have special expressions for creating and using them.  Aside from
> 	  history, the choice of lists for the representation of extra
> 	  arguments seems not just arbitrary, but foolish: vectors are clearly
> 	  more efficient to allocate, initialize, and access, regardless of the
> 	  order of the accesses.

I find it a lot easier to program using lists than vectors, and
studies seem to show that I'm not alone in that.  I don't have to keep
track of all of these extra indices floating around, I don't have to
keep the length of the vector around somewhere, and so forth.  And I
don't agree that procedures are more fundamental than lists, either,
though I don't feel like arguing that here.

> 	* Procedures are immutable, making them easier to reason about than
> 	  lists; even if the procedure encapsulating the extra arguments
> 	  `escapes' to an unknown context, the reader of the program can rest
> 	  assured that no change in its `contents' will take place.

Are they really easier to reason about?  You can't even tell whether
or not two of them are equal; if you have a random procedure floating
around, you can't tell what sort of arguments it takes; and so forth.
Lists, by contrast, are very simple - they're either empty or they
have a car, which is the head of the list, and a cdr, which is the
tail of the list.  A very simple data object, at least compared to
procedures.  Immutability is great, but being a black box isn't.
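
To illustrate the black-box point (a quick sketch; F, G, and L are
just names for the example):

```scheme
;; Two procedures that compute the same thing; Scheme gives you no
;; reliable way to discover that, or even to ask what arguments a
;; given procedure accepts.
(define f (lambda (x) (+ x 1)))
(define g (lambda (x) (+ 1 x)))
;; (eqv? f g)  =>  #f - procedure equality is not usefully defined.

;; A list, by contrast, is completely transparent:
(define l (list 1 2 3))
;; (car l)   =>  1
;; (cdr l)   =>  (2 3)
;; (null? l) =>  #f
```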

> 	* Procedures are a more flexible representation of a sequence of
> 	  values; they can, for example, be composed with one another to
> 	  compute a new, modified version of a given sequence in constant time,
> 	  regardless of the number of elements.  Also, the computation of the
> 	  elements can be done lazily, regardless of the order in which they're
> 	  requested.  Having the extra arguments represented as a procedure
> 	  makes it very convenient to use the expressive power of the FAPPLY
> 	  function included in my proposal.

I agree that they are more flexible.  On the other hand, if you are
going to start representing sequences as functions, you have to write
all sorts of conversion functions and remember to convert them back
before printing them out, and so forth.  If you want that kind of
flexibility, I would think that it would be better to add lazy lists
to the language, or to use DELAY and FORCE, or something.
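
For instance, just to print one of these things you need something
like the following (a sketch - SEQ->LIST is a name I'm making up, and
the length has to be carried around separately, since the procedure
itself is opaque):

```scheme
;; Convert a sequence-as-procedure back into a list so that it can
;; be printed, compared with EQUAL?, and so on.  SEQ maps an index
;; to an element; LEN must be supplied separately.
(define (seq->list seq len)
  (let loop ((i (- len 1))
             (result '()))
    (if (< i 0)
        result
        (loop (- i 1) (cons (seq i) result)))))

;; (seq->list (lambda (i) (* i i)) 3)  =>  (0 1 4)
```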

> Now, as for optimizing the use of a list holding the extra
> arguments, I have a couple of points to make.

> First (and I realize that this isn't what you're suggesting, David),
> many brash Lisp compiler hackers have for years claimed that it
> should be `straightforward' to optimize the use of CAR, CDR, and
> PAIR?/NULL? in dealing with rest lists.  `All' you have to do is a
> little data flow analysis to maintain extra integer index variables
> along with the various sublist values floating around.  Then a use
> of CAR can, in theory, simply be transformed into an operation using
> the corresponding index to access the value on the stack.  As far as
> I know, however, not a single one of these brash compiler hackers
> (myself included, since I also used to make this claim) has ever
> succeeded in implementing such an optimization.

> Even if they had, though, my second point would stand.  (I do
> understand that performing the optimization for the LENGTH/LIST-REF
> case is much easier.)  That is, it isn't portable for me to write
> code that counts on such an optimization being made, since surely
> not all compilers will do so.  But how, I hear you ask, can my code
> `count' on an optimization?

> Well, clearly the presence or absence of an optimization can't
> change the final outcome of my program, but it certainly *can* have
> so great an effect on the performance of my program as to render it
> useless in non-optimizing implementations.

> For example, consider this implementation of n-ary addition:

> 	(define (+ . args)
> 	  (let ((len (length args)))
> 	    (case len
> 	      ((0) 0)
> 	      ((1) (list-ref args 0))
> 	      (else
> 	       (do ((i 1 (bin+ i 1))
> 		    (sum (list-ref args 0)
> 			 (bin+ sum (list-ref args i))))
> 		   ((= i len)
> 		    sum))))))

> Now, in an implementation that supported the optimization you're
> suggesting, David, this code would probably run quite nicely and
> might even be worth considering for the `official' implementation of
> + in that system.

> If this code is ported to a different implementation, though, in
> which the optimization is *not* provided, then the function performs
> abominably: it conses wildly (and uselessly) and, much worse, it is
> no longer even *linear* in the number of arguments (because list-ref
> is not a constant-time operation)!

> This unavoidable kind of performance non-portability (changing not
> just the constant factors in my program's running time, but the
> asymptotic bound) is, I believe, unacceptable.

This is of course the point - while it is relatively easy to do the
optimization on (LENGTH x) and (LIST-REF x i), you can't guarantee
that compilers will do it.  However, I think that your proposal, at
least in situations where the index function isn't passed elsewhere,
boils down to adding syntax that mandates compilers do something
better than they would with unoptimized LENGTH and LIST-REF - and
that, if a compiler does essentially the same optimization in either
case, gets identical results.  I don't like the idea of adding new
syntax in order to guarantee an optimization - if that's what you
want, then leave the syntax as it is and mandate the optimization, or
rather document the LENGTH/LIST-REF idiom and encourage compilers to
optimize it.

Of course, the above paragraph is unfair, because it ignores the
increased generality that your proposal provides.  I agree that the
generality certainly is increased, but I'm not convinced that it would
be used in practice or that it wouldn't be better provided by some
other method.

When do people use rest-lists?  I use them in two situations.  One of
them is to add optional arguments - so for example if I were writing
DISPLAY, I would do something like:

(define DISPLAY
  (lambda (object . port)
    (let ((real-port (if (null? port)
                         (current-output-port)
                         (car port))))
      ....)))

In situations like this, there is only a small, fixed number of
optional arguments, so time considerations aren't an issue - no
matter how you implement things, as rest-lists or functions or
rest-vectors, access time is O(1), and which method is faster in
practice can vary from implementation to implementation.
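
To make that concrete, here is the same DISPLAY pattern with a small
helper (OPTIONAL-ARG and MY-DISPLAY are just names I'm using for
illustration); pulling out a defaulted optional is a constant number
of operations no matter what:

```scheme
;; OPTIONAL-ARG: return the first extra argument if one was
;; supplied, else a default.  For a bounded number of optionals
;; this is O(1) whatever the underlying representation.
(define (optional-arg rest default)
  (if (null? rest)
      default
      (car rest)))

(define my-display              ; stand-in for DISPLAY
  (lambda (object . port)
    (let ((real-port (optional-arg port (current-output-port))))
      (write object real-port))))
```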

The other situation in which I use rest lists is when I really do want
to be able to handle an arbitrary number of arguments.  Your '+'
function is an excellent example of that kind of function.  But I
wouldn't consider implementing it in the way that you did above (and
I'm not saying that you would either as anything other than an
illustration) - I would say

(define +
  (lambda numbers
    (let loop ((numbers numbers)
               (val 0))
      (if (null? numbers)
          val
          (loop (cdr numbers) (bin+ (car numbers) val))))))

This works in linear time, is a very simple definition, and is the
kind of thing that anybody who programs much in languages with lists
will come up with easily.  So here too I see no benefit in adding
"arbitrary" to the language.

So the questions that I have are:

1) Is the above breakdown of uses of rest-lists into either optional
arguments or arbitrary numbers of arguments a reasonable one?

2) Is '+' a typical member of the latter class of uses?  I suspect
that it is - I have a hard time imagining wanting to write a function
that would do the sort of random access on its arguments that would
make some other representation more desirable.

3) Does this really come down to either adding syntax to make
something more efficient or documenting an equivalent idiom and
encouraging programmers to use it and implementors to optimize it, or
are there really situations where the increased generality of
"arbitrary" comes in handy?

And the most important:

4) Could you please send me examples of the code that made you want to
add this to the language?  I could well believe that I could be
convinced by seeing a single function definition that strikes me as a
natural example and that clearly benefits from this.

david carlton
carlton@husc.harvard.edu

       I'm changing the CHANNEL..  But all I get is commercials for
       ``RONCO MIRACLE BAMBOO STEAMERS''!