[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Regularization of Procedures in Scheme



For those of you who don't remember, I was assigned the task at L&FP to
regularize the procedures involving lists, strings, and vectors for R4RS.
Initially, this looked like a trivial, Noncontroversial task; but, it turns out
that there are actually some important semantic issues to be discussed.  I have
therefore decided to pass my ideas before the entire community, rather than
just sending them straight to the editors.  

I will begin this discussion with what I believe are the Noncontroversial
changes and then proceed to the Controversial ones.  In order to better
understand the issues I have included the chart below which enumerates the
relevant procedures currently in R3RS.

LIST                STRING              VECTOR
----                ------              ------
list                                    vector
                    make-string         make-vector
                    string-fill         vector-fill
list-ref            string-ref          vector-ref
                    string-set!         vector-set!
                    string-copy
length              string-length       vector-length
append              string-append
append!
list->string        string->list        vector->list
list->vector
reverse
                    substring
                    string=?
                    string?             vector?
apply
map, foreach
memq, memv, member
assq, assv, assoc

Noncontroversial:
1)  Add string, make-list, list-fill, list-set!, list-copy, vector-copy,
    vector-append, string->vector, and vector->string with the obvious
    semantics. 
2)  Create aliases list-length and list-append for length and append,
    respectively. 
3)  Append! should NOT be generalized since most implementations (for good
    reasons) represent strings and vectors in a manner that is incompatible
    with append! semantics.
4)  Reverse should NOT be generalized since the extensions would be pretty much
    useless.  
5)  Sublist and subvector should be added because they are potentially useful
    and the implementations in scheme are much less efficient than ones which
    could be supplied by an implementor.
6)  String=? is only present for symetry with string-ci=? and should NOT be
    generalized to list=? and vector=? since we already have =?.

Controversial:
1)  Create list? which checks for a null terminated list.  This CAN be written
    by a user and is only marginally useful, but I believe it is important that
    it exist for symetry reasons.
2)  If we keep both syntaxes for apply
       (apply proc args)
       (apply proc arg1 arg2 ...  args)
    then this can't be generalized.  However, I believe that we should restrict
    ourselves to the first syntax ONLY.  Then args could be any of a list,
    string, or vector whose elements would be spread before the actual
    application was performed.

3)  There are several possibilities for map and foreach:
    a)  Status quo
    b)  Introduce list-map, string-map, vector-map, list-foreach,
        string-foreach, and vector-foreach.  List-map and list-foreach would be
        identical to the current map and foreach, respectively. The others
        would have the following syntaxes and semantics
    	    (string-map proc string1 string2 ...)
	    (vector-map proc vector1 vector2 ...)
   	    (string-foreach proc string1 string2 ...)
	    (vector-foreach proc vector1 vector2 ...)
	The string version would take strings as arguments and in the case of
        string-map would return a string.  The vector versions would do
        similarly.  It would be a type error to pass a string to list-map, a
        list to vector-map, etc.
    c)  The same 6 procedures would be introduced as in b).  In this case the
        name would specify the type of the return value.  The procedures would
        be polymorphic over arguments of type list, string, and vector.  All
        that would be required would be that the arguments all have the same
        length. 
    I am personally in favor of either b) or c).  I like the semantics of c),
    but realize that some implementors may find its semantics objectionable.
4)  Memq, memv, and member are quite problematic.  They could be extended to
    allow the second argument to be a vector in the obvious way, but then what
    should they return on a match.  Returning a subvector is unsatisfactory
    because the aliasing one gets with a sublist would undoubtedly not be
    present for a subvector, and thus the original structure could not be
    modified based on the result.  In my experience, this is one of the most
    common uses for these procedures and it would be negated.  Another
    alternative is just to return #t and #f.  However, these seems to be
    unintuitively incompatible with the original versions.  Extensions for
    strings are even worse, because the equivalence distinctions between eq,
    eqv, and equal are not present and have instead been replaced by case
    sensativity and insensitivity.  MY RECOMMENDATION IS TO MAKE NO EXTENSION.
5)  Any extensions to assq, assv, and assoc have similar problems to those of
    memq, etc. and I therefore again RECOMMEND NO EXTENSION.
6)  As a result of 5) and 6), I suggest adding two new families of procedures
    which for the sake of exposition I will call member? and match.  (I know
    these are horrible names, but if people like the idea, we'll worry about
    what to call them later.)  Their syntaxes would be as follows:
       (list-member? val pred list)
       (string-member? val pred string)
       (vector-member? val pred vector)
       (list-match val pred list)
       (string-match val pred string)
       (vector-match val pred string)
    The member procedures would return #f if (pred val (foo-ref foo i))
    returned #f for all i, where foo is either list, string, or vector.
    Otherwise, they would return #t.  The match procedures would perform
    similarly except that they would the first element for which pred returned
    #t, #f otherwise.  A decision would have to be made for match as to whether
    each element of a vector would have to itself be a vector.  Obviously, for
    strings each element would have to be a character and so this version would
    be somewhat useless.  
    As A RESULT, I propose that instead we introduce just the procedures
    member? and match which are polymorphic over all the reasonable types
    (i.e., strings, vectors, and lists).  For match, this polymorphism would
    extend to 2 levels into the data structure.

    					Morry Katz
					katz@polya.stanford.edu