[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Scheme's DO construct



Hmm, I seem to have generated some controversy.  I'll attempt to
provide clarifications and responses separately on each topic to
help keep the discussions distinct.  I'll start with "DO."
------------------------------------------------------------------------
	I cannot refrain from observing that DO is truly ugly.... [asc]

	I strongly disagree.  DO is a construct that emphasizes the 
	notion that an iteration can proceed by initializing some state 
	variables and then repeatedly transforming them while maintaining 
	some invariant until a condition is reached. ... [gls]


I think Guy misguessed the source of my displeasure with DO. So I'll
clarify. (Warning: not a short message.)

I have no prima facie objection to the use of iterative constructs,
and in particular, none to supporting iteration in Scheme.   Indeed, I think 
that it is often very effective to think procedurally and iteratively rather
than functionally and recursively.  (I understand these are apples and 
oranges, but the styles are usually associated with each other.)  

There may be interesting extremist arguments in favor of relying solely on 
recursion for repetition,  presumably based on strict minimalism and some 
arguments concerning the elegance of recursion.  But I probably wouldn't buy
them, and in any case that's not what my complaint is about.  If we were going
to be minimalism extremists, we wouldn't provide IF, COND, and CASE in the same
language.  Instead, I think we are trying to design an effective programming
environment while remaining "reasonably" elegant.  The touchstone for the
appropriate degree of minimalism is that we provide the constructs that
a programmer will need, separating functionality out when the programmer's
abstraction of the control process is different (as opposed to merely the
underlying computation being different; hence CASE, IF, and COND).

So given the assumption that we are going to have one or more iteration
constructs, what should they look like and how should we select them?
My claim is that the DO we are using is a poor choice, and moreover,
that it probably reflects a correspondingly poor choice in selection criteria.

In a nutshell, DO does too much.  It creates variables, binds them,
provides iteration, rebinds variables at tactically pleasing times,
tests a conditional expression, applies an implicit SEQUENCE to most
of its arguments, returns the result of evaluating an optionally present
expression... whew!

I submit that we have CommonLisp's DO (which is in use in other LISPS too,
of course) principally because it was being used by other LISPs and no one
really stopped to think what a good iteration construct for Scheme would
look like starting from first principles.  We just grabbed what (some) (LISP)
programmers currently are using without treating this as an important
design problem.

I also submit that DO is too complex and that this complexity violates
two fundamental principles of good programming language construct design:

1. Keep constructs simple so people can learn, understand, read, debug, 
   and maintain them easily.
2. Each function should do one thing, and do it well.  (No pun intended.)

Just to take a few pot shots at DO: we don't need variable binding and
creation, since we already have the LET family to do that nicely for us.
We don't need to confuse conditional exit from an iteration (i.e. using
a predicate expression) with FOR-class iteration that occurs, e.g.,
a predictable number of times and/or over a predictable set of values.
Certainly it's nice to have all these things around when you want them;
my point is simply that an iteration construct doesn't need to provide
all of them, that we can make intelligent choices about which ones
a given construct is to provide, and that if DO needs to contain all of
them, then the burden of proof is on those who would make that claim.

To reframe the challenge, the question is not whether you can come up
with times when you need conditional exit, and times when you need
FOR-style iteration, and... etc.; but rather, whether you can convincingly
demonstrate that all are needed in one iteration construct that appears in
the language definition.

Let me pose a few alternatives.  Bear in mind that these are strawman
options, for illustrative purposes. If you don't like keywords in your
control constructs, imagine the examples without them.

(FOR i IN <list> DO <forms>) -- Evaluates <forms>, to which an implicit
SEQUENCE is applied, with i bound to successive values of <list>.

[This is like FOR-EACH except that you have the current value bound to
a lexically apparent locally-created locally-bound variable i.  This is about
as similar to FOR-EACH as, say, COND and CASE are to each other.]


(VARYING i FROM n TO m BY s DO <forms>) -- Binds i to n, n+s, n+2s, ...
and evaluates <forms>, to which an implicit SEQUENCE is applied, for
each such binding, until i>m.  Returns the result of the last evaluation
of <forms>.

[This is the classic FOR-loop construct used when you know the range
of values in advance, e.g. when you are stepping through a vector.]


(REPEAT n TIMES <forms>) -- Evaluates <forms>, to which an implicit
SEQUENCE is applied, n times. Returns the result of the last evaluation
of <forms>.

[Not frequently used, but highly perspicuous on those occasions when
this is exactly what you want to do.]


(WHILE <test> DO <forms>) -- Repeatedly evaluates <forms>, to which an
implicit SEQUENCE is applied, until <test> evaluates to #F, like
Pascal's WHILE.

[There's some interesting research by Jeff Bonar, among others,
with empirical results suggesting that Pascal's WHILE loop is actually
a dangerous, hard-to-learn construct.  This is because (a) the classic
phase problem that occurs when you use a WHILE loop in a PROCESS,READ
paradigm is harder to learn than the more intuitive READ,PROCESS model,
and (b) beginners often think that WHILE really means "while," i.e.
"whenever <test> is true, do <forms>" or "the moment <test> would become
false, exit the loop."  I also dislike the cooption of this word for this
iterative processing application, since its more intuitive meanings
actually might be useful in concurrent processing applications.  In any
case, when not used in a PROCESS,READ paradigm, the construct (whatever
its name) is often quite useful.]


Hmm, that's a lot of extra stuff in a language to take the place of
one DO construct, isn't it?  But that's precisely the point.  DO is
a microcosm of MacLisp's and, by inheritance, Common Lisp's invasive 
featurism.  In the extreme, we could create a function F(x) and specify
which entire program we wanted simply by feeding F the proper number.
Many people would regard such a Goedlization of programs as a positive
step, if only they could determine which number to feed F. But: we wouldn't
call it programming; we wouldn't call F a "control construct" in the
sense that we normally use the word; and perhaps most importantly, we
wouldn't know how to meaningfully specify the x.  Forgiving the hyperbole,
that is also true of DO: it is an expert's construct with options sticking 
out all over and no coherent design principles underneath, hard to learn 
and remember and even harder to read and understand quickly on sight.

I have performed the following informal experiment.  I had one or
two of the people working with me who were true devotees of LOOP
and DO go back and recode fairly large (5000-15000 lines) LISP programs,
ripping out LOOP and DO and putting in DOLIST, DOTIMES, and mapping
functions.  I should say that they did it "kicking and screaming," not
because of the work involved but because they so loved all the features
that LOOP offered.  But a couple weeks later, they came back and said:

"You know, LOOP is really evil!"
"My code is infinitely easier to read and maintain now."
"I had no idea how ugly and impenetrable all that DO and LOOP code was
 from all those `features' of DO and LOOP I was using."
"You have to have something seriously wrong with your head to be able to
 use DO fluently."
"I'm not going to use LOOP again!"

I should say that the principal guy I have in mind here got his degree at
MIT.  We're talking the hardest of sells, and when the dust had cleared,
he came to the conclusion that DO and LOOP are morally reprehensible.

We need to remember that programs are written not only to be executed,
but also to be read; and that well over 50% of programmer time is
spent maintaining existing code.  Clear, clean, well-designed control
constructs can go a long way towards easing this burden.   DO does not
seem to qualify as such a construct; rather, it seems like the incidental
union of numerous such constructs.  We might be able to make a major
contribution to the utility of LISP in "practical" programming applications
by updating Scheme's iteration constructs so they provide better, cleaner,
more direct support for the variety of iteration models that are already
implicit in the code we write.
						asc