Scheme String Operations: the Report

To: David Bartley <Bartley%ti-csl.csnet@CSNET-RELAY.ARPA>
Subject: Scheme String Operations: the Report
From: CPH%MIT-OZ@MIT-MC.ARPA
Date: Wed, 16 Jan 1985 02:40 EST
Cc: Jensen%ti-csl.csnet@CSNET-RELAY.ARPA, Oxley%ti-csl.csnet@CSNET-RELAY.ARPA, Scheme@MIT-MC.ARPA
In-reply-to: Msg of 15 Jan 1985 15:45-EST from David Bartley <Bartley%ti-csl.csnet at csnet-relay.arpa>

    Date: Tuesday, 15 January 1985  15:45-EST
    From: David Bartley <Bartley%ti-csl.csnet at csnet-relay.arpa>

      -- We probably need a (CHAR? obj) predicate.  The following are useful;
    can we agree on their meaning?

    	(CHAR->INTEGER char)	; CL's CHAR-CODE ?
    	(INTEGER->CHAR n)	; CL's CODE-CHAR ?

      -- How is a CHAR-SET represented?  Created?  Modified?

      -- Is the character datatype in MIT Scheme extended the same way CL has
    gone?  Do you support font and/or bit info?  Have you added operations to
    MIT Scheme beyond those you've proposed in your report?

In answer to all of these:  I purposefully limited my description of
characters (and character sets) to that minimum required to describe
the string abstraction.  I did this because I felt that the string
abstraction was very largely independent of the character abstraction;
this method shows the dependence pretty clearly.

Further, I recall that we agreed to do characters "the CL way" at the
workshop, at least so far as syntax.  My character datatype is heavily
influenced by CL, and implements bits but not fonts (I don't need them
now).  I would be willing to discuss both this and the character set
abstraction if there is interest.

      -- You were clearly influenced by Common LISP (CL)...

Not so!  The string operations were mostly designed from scratch; I
was unaware that they were so similar.

    ...CL uses the suffixes =, <, etc. for case sensitive
    comparisons of characters and strings and the suffixes EQUAL, LESSP, etc
    for case-insensitive comparisons.  I prefer to adopt that convention...

Mumble.  I don't particularly like those names, but then, I don't
particularly care about the names anyway.  I'm not terribly happy with
the names I chose, either.  Whatever folks like is fine with me.

      -- May I assume that string and character values are printed (and read)
    as defined by CL?

Yes; I think that we agreed upon this at the workshop.

      -- May I assume that all of the operations you defined are procedures,
    not special forms?

Yes!!! (with feeling)

      -- May I assume that SUBSTRING and other names written without ! always
    return a copy rather than sharing structure?

Yes.  Furthermore, operations written with a "!" return undefined values.
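As an illustration of the copy semantics, here is a minimal sketch using
only procedures found in standard Scheme.  It is not taken from the report;
the variable names ORIGINAL and PIECE are made up for the example.

```scheme
;; Sketch only: shows that SUBSTRING yields a fresh string, so mutating
;; the result leaves the source string untouched.
(define original (string-copy "hello world"))  ; fresh, mutable string
(define piece (substring original 0 5))        ; copy of characters 0..4

;; Mutate the copy.  The value returned by STRING-SET! is undefined,
;; so it is called for effect only.
(string-set! piece 0 #\J)

;; ORIGINAL still holds "hello world"; PIECE now holds "Jello".
```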

      -- Why distinguish STRING-ALLOCATE from MAKE-STRING?  By analogy with
    MAKE-VECTOR, shouldn't MAKE-STRING take the second argument optionally?

      -- One way to reduce the number of operations is to combine the STRING-
    and SUBSTRING- operations by accepting optional substring operands.

Gee, I guess that I don't feel too good about that.  While it is true
that this cuts down on the number of names, I have an irrational fear
of widespread optionalogy.  Perhaps it is from overexposure to
flagrant misuse, but I prefer not to use optionals at the "lower
levels" of my code.

Also, I was particularly thinking about the workshop, and I recall
there were very mixed feelings about optional arguments, rest arguments,
etc... I felt that avoiding optionals would be more acceptable to the
community as a whole.

      -- The order of arguments to CHAR-SET-MEMBER? is reversed from that of
    MEMBER, MEMV, and MEMQ.

Hmm... This was chosen to match other stuff... in particular,
VECTOR-REF, LIST-REF, etc.  The convention is: the compound structure
first, the key second.  In hindsight, it clearly should match MEMfoo.

      -- What should happen when indexes cross (start>end) or go outside the
    proper range?

Sorry, I should have spelled that out.  It will signal an error, in
all cases (although maybe not at the most reasonable place, in my
implementation).  Specifically: wherever I have described the
arguments as "a string", "a character", "a substring", etc., those can
be construed as requirements.  Errors will happen if those things are
not true.
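
A rough sketch of the kind of argument check this implies, written in
portable Scheme.  The names VALID-SUBSTRING-RANGE? and CHECKED-SUBSTRING
are hypothetical, not part of the report, and the ERROR procedure is
assumed to be provided by the implementation (it is not in every dialect).

```scheme
;; Hypothetical helper: true only when S is a string and
;; 0 <= START <= END <= (string-length S), with exact integer indexes.
(define (valid-substring-range? s start end)
  (and (string? s)
       (integer? start) (exact? start)
       (integer? end) (exact? end)
       (<= 0 start end (string-length s))))

;; Hypothetical wrapper: signals an error on crossed or out-of-range
;; indexes instead of returning a value.  ERROR is an assumption here.
(define (checked-substring s start end)
  (if (valid-substring-range? s start end)
      (substring s start end)
      (error "Bad substring arguments:" s start end)))
```

With these definitions, (checked-substring "abcdef" 4 2) would signal an
error because the indexes cross, as would an END greater than the length
of the string.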

      -- Are you suggesting that the Basic Character Operations, Basic String
    Operations, and Standard Operations be 'required' and the rest be
    'optional'?  If not, where would you draw the line?  Do you use all of
    these operations yourself?

No, I was making no such suggestion; I would be hesitant to do so
without some discussion.  I do feel that the 'Basic String Operations'
and 'Standard Operations', with the exception of STRING-ALLOCATE,
STRING-SET!, and STRING-FILL!, are very useful, and perhaps should be
required.  But then we have already agreed upon those in the 'Basic'
category, and I think that there would be little disagreement about
the 'Standard' ones.

But I can't really say where one should draw the line; all of these
procedures are useful if one ever does anything even moderately hairy.
Personally, I have used most of these operations pretty freely in the
editor.

I have found, though, that a particular pattern has emerged.  The
mutating operations are used (in conjunction with STRING-ALLOCATE)
almost exclusively for the construction of higher level, non-mutating
operations like SUBSTRING and STRING-APPEND.  I think that such things
might safely be relegated to the realm of 'system programming', for
those who prefer to make such a distinction.
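
The pattern can be sketched as follows.  This is an illustration only, not
the MIT Scheme source: the names MY-SUBSTRING and MY-STRING-APPEND are made
up, the append takes exactly two strings, and the standard MAKE-STRING
stands in for STRING-ALLOCATE on the assumption that both return a fresh
string of the given length.

```scheme
;; Non-mutating SUBSTRING built from allocation plus STRING-SET!.
(define (my-substring s start end)
  (let ((result (make-string (- end start))))  ; stand-in for STRING-ALLOCATE
    (let loop ((from start) (to 0))
      (if (< from end)
          (begin
            (string-set! result to (string-ref s from))
            (loop (+ from 1) (+ to 1)))
          result))))

;; Copy all of SOURCE into TARGET starting at index AT (mutates TARGET).
(define (copy-into! target at source)
  (let ((n (string-length source)))
    (let loop ((i 0))
      (if (< i n)
          (begin
            (string-set! target (+ at i) (string-ref source i))
            (loop (+ i 1)))))))

;; Non-mutating two-argument append built the same way.
(define (my-string-append s1 s2)
  (let ((result (make-string (+ (string-length s1) (string-length s2)))))
    (copy-into! result 0 s1)
    (copy-into! result (string-length s1) s2)
    result))
```

All of the mutation is confined to a string that was freshly allocated
inside the procedure, so callers see only non-mutating behavior.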

Anyway, if there is interest in working out the boundaries of
'required' vs. 'optional' here, I am perfectly willing to add more of
my flame to the conflagration.
