[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

EQ?, again



Something that has not been mentioned in the debate over EQ? and EQV? as
applied to procedures is that one can use procedures as data
abstractions, including mutable data abstractions.  Thus if it's useful
to be able to distinguish two cons cells or vectors using EQ? and/or
EQV?, so that one can determine whether the effects of an operation on
one will necessarily be visible in the other, then why isn't it equally
reasonable to use EQ? and/or EQV? to distinguish items returned by the
following procedure?

(define (procedural-cons a b)
  (define (the-cons op . rest)
    (cond ((eq? op 'car) a)
	  ((eq? op 'cdr) b)
	  ((eq? op 'set-car!)
	   (set! a (car rest))
	   the-cons)
	  ...))
  the-cons)

Stating that it is an error or implementation-dependent to use EQ? and
EQV? on such procedures will complicate using procedures to do
object-oriented programming where you want to be able to check the
identity of objects (it doesn't rule it out altogether because you could
still wrap the procedure representing the real object in a cons cell
whose purpose was to satisfy identity checks!).  Are the kinds of
optimizations we want to do on procedures really important enough to
warrant throwing away this convenience?  Alternatively, would specifying
a tighter semantics for EQ? and EQV? make the optimizations effectively
impossible, or would it just require a little more data flow analysis
during compilation to determine that the procedures in question would
never find their way to be operands of EQ?  Perhaps somebody who has
thought about such optimizations and the kinds of programs in which we
would like to make them could comment.

In the absence of such input, I think my answers to questions 1-4 are
therefore

F    1. (EQ?  (LAMBDA (X) X) (LAMBDA (Y) Y))		;"Coalescing"
F    2. (EQV? (LAMBDA (X) X) (LAMBDA (Y) Y))
T    3. (LET ((X (LAMBDA (Z) Z))) (EQ?  X X))	;"Splitting"
T    4. (LET ((X (LAMBDA (Z) Z))) (EQV? X X))

Questions 5 and 6 get to the question of whether certain immutable data
types, notably numbers and characters, are subject to splitting when
viewed by EQ? or not.  I am coming around to the viewpoint expressed by
others that the distinction between EQ? and EQV? is a historical
artifact that should probably be removed.  The major expense associated
with that would be that, in most implementations, EQ? could no longer be
a simple pointer comparison, but would now have to investigate the type
of the objects being compared.  Thus I can see the argument for keeping
EQ? for those cases where especially efficient execution is desired and
the programmer knows that none of the data types that can "split" when
viewed by EQ? are involved.  But the price of this is the kind of
proliferation of other functions exemplified by the MEMQ/MEMV/MEMBER
family, which I think is unfortunate.  Anyhow, I think my first vote
would be to flush EQ? and just keep EQV? (or else rename EQV? to EQ?).
But if we keep EQ? as a separate function, then in line with the
rationale expressed above, I think it should be an error to use EQ? on
things that may appear to split when viewed by EQ?.  Therefore:

E    5. (LET ((X ... any expression evaluating to an exact number ...))
	 (EQ? X X))

For (6) the question is whether we want to require all implementations
to treat characters in a way that is never subject to splitting.  I
don't really care much about this, though it seems that keeping
characters from splitting probably isn't too hard.  Therefore:

X    6. (EQ? #\X #\X)

where the X really represents either E or T.

I realize that my answers regarding EQ? on functions may make it
substantially more difficult to do obvious, simple optimizations that
avoid consing up a closure when, e.g., a function with no free variables
appears inside another function.  If this is too distasteful, perhaps we
should think about having two flavors of LAMBDA:  ordinary
lambda-expressions such as (LAMBDA (X) X) would produce values which
would be an error to feed to EQ? or EQV?, but (CONSED-LAMBDA (X) X)
would produce values whose behavior under EQ? and EQV? would be
dependable.  I'm not exactly wild about this idea, but it might offer a
way out and I thought I'd pass it along.  I think I prefer it to the EQ?
alternatives that have been more frequently suggested in response to
JAR's survey.

Regarding NAMED-LAMBDA, I too would be happy to flush it in favor of the
REC or LETREC forms.						-Bert