[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Regularization of Procedures in Scheme
For those of you who don't remember, I was assigned the task at L&FP to
regularize the procedures involving lists, strings, and vectors for R4RS.
Initially, this looked like a trivial, Noncontroversial task; but, it turns out
that there are actually some important semantic issues to be discussed. I have
therefore decided to pass my ideas before the entire community, rather than
just sending them straight to the editors.
I will begin this discussion with what I believe are the Noncontroversial
changes and then proceed to the Controversial ones. In order to better
understand the issues I have included the chart below which enumerates the
relevant procedures currently in R3RS.
LIST STRING VECTOR
---- ------ ------
list vector
make-string make-vector
string-fill vector-fill
list-ref string-ref vector-ref
string-set! vector-set!
string-copy
length string-length vector-length
append string-append
append!
list->string string->list vector->list
list->vector
reverse
substring
string=?
string? vector?
apply
map, foreach
memq, memv, member
assq, assv, assoc
Noncontroversial:
1) Add string, make-list, list-fill, list-set!, list-copy, vector-copy,
vector-append, string->vector, and vector->string with the obvious
semantics.
2) Create aliases list-length and list-append for length and append,
respectively.
3) Append! should NOT be generalized since most implementations (for good
reasons) represent strings and vectors in a manner that is incompatible
with append! semantics.
4) Reverse should NOT be generalized since the extensions would be pretty much
useless.
5) Sublist and subvector should be added because they are potentially useful
and the implementations in scheme are much less efficient than ones which
could be supplied by an implementor.
6) String=? is only present for symetry with string-ci=? and should NOT be
generalized to list=? and vector=? since we already have =?.
Controversial:
1) Create list? which checks for a null terminated list. This CAN be written
by a user and is only marginally useful, but I believe it is important that
it exist for symetry reasons.
2) If we keep both syntaxes for apply
(apply proc args)
(apply proc arg1 arg2 ... args)
then this can't be generalized. However, I believe that we should restrict
ourselves to the first syntax ONLY. Then args could be any of a list,
string, or vector whose elements would be spread before the actual
application was performed.
3) There are several possibilities for map and foreach:
a) Status quo
b) Introduce list-map, string-map, vector-map, list-foreach,
string-foreach, and vector-foreach. List-map and list-foreach would be
identical to the current map and foreach, respectively. The others
would have the following syntaxes and semantics
(string-map proc string1 string2 ...)
(vector-map proc vector1 vector2 ...)
(string-foreach proc string1 string2 ...)
(vector-foreach proc vector1 vector2 ...)
The string version would take strings as arguments and in the case of
string-map would return a string. The vector versions would do
similarly. It would be a type error to pass a string to list-map, a
list to vector-map, etc.
c) The same 6 procedures would be introduced as in b). In this case the
name would specify the type of the return value. The procedures would
be polymorphic over arguments of type list, string, and vector. All
that would be required would be that the arguments all have the same
length.
I am personally in favor of either b) or c). I like the semantics of c),
but realize that some implementors may find its semantics objectionable.
4) Memq, memv, and member are quite problematic. They could be extended to
allow the second argument to be a vector in the obvious way, but then what
should they return on a match. Returning a subvector is unsatisfactory
because the aliasing one gets with a sublist would undoubtedly not be
present for a subvector, and thus the original structure could not be
modified based on the result. In my experience, this is one of the most
common uses for these procedures and it would be negated. Another
alternative is just to return #t and #f. However, these seems to be
unintuitively incompatible with the original versions. Extensions for
strings are even worse, because the equivalence distinctions between eq,
eqv, and equal are not present and have instead been replaced by case
sensativity and insensitivity. MY RECOMMENDATION IS TO MAKE NO EXTENSION.
5) Any extensions to assq, assv, and assoc have similar problems to those of
memq, etc. and I therefore again RECOMMEND NO EXTENSION.
6) As a result of 5) and 6), I suggest adding two new families of procedures
which for the sake of exposition I will call member? and match. (I know
these are horrible names, but if people like the idea, we'll worry about
what to call them later.) Their syntaxes would be as follows:
(list-member? val pred list)
(string-member? val pred string)
(vector-member? val pred vector)
(list-match val pred list)
(string-match val pred string)
(vector-match val pred string)
The member procedures would return #f if (pred val (foo-ref foo i))
returned #f for all i, where foo is either list, string, or vector.
Otherwise, they would return #t. The match procedures would perform
similarly except that they would the first element for which pred returned
#t, #f otherwise. A decision would have to be made for match as to whether
each element of a vector would have to itself be a vector. Obviously, for
strings each element would have to be a character and so this version would
be somewhat useless.
As A RESULT, I propose that instead we introduce just the procedures
member? and match which are polymorphic over all the reasonable types
(i.e., strings, vectors, and lists). For match, this polymorphism would
extend to 2 levels into the data structure.
Morry Katz
katz@polya.stanford.edu