[Pavel.pa: A Proposal for Environments in Scheme]



A number of people seem not to have received this, so I am re-sending it.
- Jonathan

Date: Fri, 20 May 88 11:49:26 PDT
From: Pavel.pa at Xerox.COM
To:   rrrs-authors at mc.lcs.mit.edu
Re:   A Proposal for Environments in Scheme

I'd like to offer the following proposal for a facility for first-class
environments in Scheme.  We have found it to work quite cleanly and powerfully
in Cedar Scheme and thus believe that it is worthy of consideration for
inclusion in the standard language.

	Pavel Curtis
	Xerox PARC/CSL

INTRODUCTION

If Scheme is to grow and be used by more people, one of the problems it must
solve is the robust isolation of different pieces of the system and user code
from each other.  The system described in this section is an attempt to cleanly
and simply solve this problem, with an eye toward making the addition of a true
file-compiler a simple extension with clean semantics.  It is fully implemented
in Cedar Scheme.

Note that this system does not attempt to address the questions of interfaces
vs. implementations, interface-version compatibility checking, etc.  We simply
want it to be possible for normal users to write code that is insulated
reasonably well from the system and from other users.

The presented solution owes a debt to the designers of T (in many obvious ways)
and probably those of other Scheme dialects.  As well, it is influenced by our
extensive experience programming in the Cedar language and environment, with its
heavy emphasis on well-defined interfaces.  The proposal given here is not,
however, truly the same as any other system of which we're aware; in particular,
it departs strongly from T in some fundamental ways, so draw no premature
conclusions.

The proposal begins by specifying a Scheme interface to first-class
environments, a cornerstone of the new facility.  It then describes the initial
set of environments in the system; this arrangement is in place >before< the
first file of code is LOADed.  Next, it suggests a syntax and semantics for
files of Scheme code that allows fine-grained but convenient control over the
environment structure.  Finally, it presents an example of a file using the new
facility.

The proposal proper is followed by some answers to questions that we anticipate
concerning the proposal.


FIRST-CLASS ENVIRONMENTS

We propose the addition to Scheme of first-class environment values.  These are
precisely the environments currently used by the Scheme interpreter with one
cosmetic addition, a human-readable >identifier< for the environment.  The
procedures defined on environments are as follows:

  MAKE-ENVIRONMENT <id> <parent> ...				[Procedure]

    Creates and returns a new environment with the given <parents> (if any) and
    the given identifying value, <id>, usually a string.  <Id> is strictly for
    debugging purposes; it is output as a part of the printed representation of
    the new environment.  The <parents>, on the other hand, are of semantic
    interest, since variable-lookup in the resulting environment will continue
    into the parents if the requested symbol has no binding in the child
    environment.  Variable lookup is depth-first and left-to-right among
    multiple parents.

  ENVIRONMENT? <object>						[Procedure]

    Returns true if and only if <object> is an environment.

  ENVIRONMENT-ID <env>						[Procedure]

    Return the value specified for the <id> parameter to the call to
    MAKE-ENVIRONMENT that created <env>, or #F if none was specified.

  ENVIRONMENT-PARENTS <env>					[Procedure]

    Return a list of the values specified for the <parent> parameters to the
    call to MAKE-ENVIRONMENT that created <env>.  It is an error to perform
    destructive operations on this list.

  ENVIRONMENT-REF <env> <symbol>				[Procedure]

    Return the value bound to <symbol> in <env> (or its ancestors), signalling
    an error if no such binding exists.

  ENVIRONMENT-SET! <env> <symbol> <value>			[Procedure]

    Change the value bound to <symbol> in <env> (or its ancestors) to <value>,
    signalling an error if no such binding exists.

  ENVIRONMENT-DEFINE! <env> <symbol> <value>			[Procedure]

    Change the value bound to <symbol> in <env> (NOT its ancestors) to <value>,
    adding such a binding to <env> if none exists.  Note that
    ENVIRONMENT-DEFINE! never affects any ancestor of <env>, only <env> itself.

  ENVIRONMENT-BOUND? <env> <symbol>				[Procedure]

    Return true if and only if there exists a binding for <symbol> in <env>
    (or its ancestors).

  WALK-ENVIRONMENT <fn> <env>					[Procedure]

    <Fn> should be a procedure of two arguments, a symbol and a value.  It is
    applied to every binding in <env> itself, NOT including bindings in its
    ancestors.
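
As an illustration only, the lookup and mutation rules above can be modelled in
a few lines of portable Scheme.  This is a toy sketch, not the Cedar Scheme
implementation: every TOY- name is hypothetical, environments are represented
as plain vectors, and the ERROR procedure is assumed to be available (it is in
R7RS systems).

```scheme
;; Toy model: an environment is a vector #(id parents bindings), where
;; bindings is an association list of (symbol . value) pairs.
(define (toy-make-environment id . parents)
  (vector id parents '()))

(define (toy-parents env) (vector-ref env 1))
(define (toy-bindings env) (vector-ref env 2))

;; Find the (symbol . value) pair for SYM: ENV itself first, then each
;; parent in turn, depth-first and left-to-right.  Returns #f if none.
(define (toy-lookup env sym)
  (or (assq sym (toy-bindings env))
      (let loop ((ps (toy-parents env)))
        (cond ((null? ps) #f)
              ((toy-lookup (car ps) sym))
              (else (loop (cdr ps)))))))

(define (toy-environment-bound? env sym)
  (if (toy-lookup env sym) #t #f))

(define (toy-environment-ref env sym)
  (let ((b (toy-lookup env sym)))
    (if b (cdr b) (error "unbound variable:" sym))))

;; SET! changes whichever binding lookup finds, possibly in an ancestor.
(define (toy-environment-set! env sym val)
  (let ((b (toy-lookup env sym)))
    (if b (set-cdr! b val) (error "unbound variable:" sym))))

;; DEFINE! only ever touches ENV's own bindings.
(define (toy-environment-define! env sym val)
  (let ((b (assq sym (toy-bindings env))))
    (if b
        (set-cdr! b val)
        (vector-set! env 2 (cons (cons sym val) (toy-bindings env))))))

;; D has parents C and B, in that order; C has parent A.
(define a (toy-make-environment "A"))
(define b (toy-make-environment "B"))
(define c (toy-make-environment "C" a))
(define d (toy-make-environment "D" c b))
(toy-environment-define! a 'x 'from-a)
(toy-environment-define! b 'x 'from-b)
(toy-environment-define! b 'y 'from-b)

(define first-x (toy-environment-ref d 'x))  ; from-a: A is reached via C
(define first-y (toy-environment-ref d 'y))  ; from-b: B is tried after C
(toy-environment-set! d 'x 'changed)         ; mutates the binding in A
(toy-environment-define! d 'x 'local)        ; adds a binding to D only
```

After the last two calls, looking up X in D yields LOCAL, while A still holds
CHANGED and B still holds FROM-B.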

In addition to these procedures, we propose a change to the meanings of
identifiers whose names include colons:

  -- No identifier may begin or end with a colon.
     (Alternatively, such identifiers behave as they do now.)
  -- An identifier of the form a:b1:...:bk (k >= 1) is entirely equivalent to
     the expression
     		(ENVIRONMENT-REF a:b1:...:bk-1 'bk)

This change allows convenient reference to the values of bindings in an
environment that is bound to some variable in the current environment.  For
example, the identifier STRINGS:COPY is equivalent to the expression

		(ENVIRONMENT-REF STRINGS 'COPY)

and the identifier CEDAR:IO:RESET is equivalent to the expression

		(ENVIRONMENT-REF (ENVIRONMENT-REF CEDAR 'IO) 'RESET)
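
The rewriting rule can be sketched as an ordinary procedure over symbols.  The
names SPLIT-AT-COLONS and EXPAND-COLON-IDENTIFIER are hypothetical; a real
implementation would presumably do this in its reader or syntaxer.  The sketch
assumes its input obeys the first rule above (no leading or trailing colon).

```scheme
;; Split string S at each colon, returning a list of substrings.
(define (split-at-colons s)
  (let loop ((i 0) (start 0) (parts '()))
    (cond ((= i (string-length s))
           (reverse (cons (substring s start i) parts)))
          ((char=? (string-ref s i) #\:)
           (loop (+ i 1) (+ i 1) (cons (substring s start i) parts)))
          (else (loop (+ i 1) start parts)))))

;; Rewrite a:b1:...:bk as left-nested ENVIRONMENT-REF forms.  A symbol
;; without colons is returned unchanged.
(define (expand-colon-identifier sym)
  (let ((parts (map string->symbol
                    (split-at-colons (symbol->string sym)))))
    (let loop ((expr (car parts)) (rest (cdr parts)))
      (if (null? rest)
          expr
          (loop (list 'environment-ref expr (list 'quote (car rest)))
                (cdr rest))))))

;; STRING->SYMBOL is used so the example does not depend on how the host
;; reader treats colons inside identifiers.
(define one-level
  (expand-colon-identifier (string->symbol "strings:copy")))
;; one-level is the list (environment-ref strings 'copy)

(define two-level
  (expand-colon-identifier (string->symbol "cedar:io:reset")))
;; two-level is (environment-ref (environment-ref cedar 'io) 'reset)
```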


THE INITIAL ENVIRONMENT STRUCTURE

The initial structure of environments, as seen by the first file LOADed into the
system, consists of at least the following four environments:

  "SCHEME-ESSENTIALS"
  
    Contains exactly those bindings described as "essential" in the Scheme
    specification, with the semantics given there.  Thus, in this environment,
    APPEND takes exactly two arguments, LET does not allow the "named-LET"
    variant, internal DEFINEs are not allowed, etc.
    
    In addition, the name SCHEME-ESSENTIALS is bound to the environment
    itself.
    
    This environment is intended for use by those laboring under severe
    portability constraints.

  "SCHEME"
  
    Contains exactly those bindings described in the Scheme specification,
    except that a given implementation may not provide all of the optional
    features.  Thus, in this environment, APPEND may or may not accept an
    arbitrary number of arguments, LET may or may not allow the named variant,
    etc.  The strongest statement that can be made about this environment is
    that it certainly contains no more than what is described in R3RS and no
    less than the "SCHEME-ESSENTIALS" environment.
    
    In addition, the names SCHEME-ESSENTIALS and SCHEME are bound to the
    "SCHEME-ESSENTIALS" environment and this environment itself, respectively.
    
    This environment is intended for those who want to work in a reasonably
    portable setting with no extras.
  
  Some implementation-dependent name
  
    Contains whatever the implementation considers the bindings of the
    language it implements.  It should contain everything found in the
    "SCHEME" environment, though some may have been extended upward-
    compatibly.  In addition, other facilities, not described in R3RS,
    may be visible here.
    
    Some name should be bound to the environment itself.  In addition, the names
    SCHEME-ESSENTIALS and SCHEME are bound to the "SCHEME-ESSENTIALS" and
    "SCHEME" environments, respectively.
    
    It is assumed that most code written in the implementation will be
    evaluated in an environment descended from this one.
    
    In the rest of this proposal, this environment will be called the "full-
    language" environment.

  "GLOBAL"
    
    Has as its only parent the full-language environment.  In addition, the name
    GLOBAL is bound to this environment itself.
    
    The GLOBAL environment is distinguished in the definition of the semantics
    of code files, below.
    
    By convention, most files of code are evaluated in environments descended
    from this one and most other "public" environments will be accessible
    through names in the GLOBAL environment.

Implementation Note: In Cedar Scheme, for example, the initial environment
structure is created by code like the following:

       (let* ((scheme-essentials (make-environment "SCHEME-ESSENTIALS"))
              (scheme (make-environment "SCHEME" scheme-essentials))
              (cedar-scheme (make-environment "CEDAR-SCHEME" scheme))
              (global (make-environment "GLOBAL" cedar-scheme)))
   
          (environment-define! scheme-essentials
                               'scheme-essentials
                               scheme-essentials)
          (environment-define! scheme       'scheme       scheme)
          (environment-define! cedar-scheme 'cedar-scheme cedar-scheme)
          (environment-define! global       'global       global)
          
          ...)

Though it is not required by this proposal, all four environments in the initial
structure in Cedar Scheme form a chain through the parent relation.  Only the
GLOBAL and full-language environments are required to be so linked.
[End note]


THE SYNTAX AND SEMANTICS OF FILES

In the usual case, a file of Scheme code contains the definitions of a few
variables (most frequently bound to procedures) intended for use by clients of
the software and several more ``helper'' variables intended only for use by the
package itself.  The variables to be externally available usually belong in a
separate space of names from those in any other package.  The following syntax
and semantics of files, which are upward-compatible with what is currently in
use in most dialects of Scheme, are intended to make the usual case easy while
allowing more complex cases to be handled smoothly as well.

A file of Scheme code, under this proposal, may optionally begin with a piece of
syntax, called a >herald<, that arranges for special treatment of the
environmental context of the code in the file.  Heralds obey the following
syntax:

		(HERALD <option>+)

where each <option> is one of the following:

  (ENV <expression>)
  
    The given <expression> is evaluated in the GLOBAL environment described
    above before loading the rest of the file.  <Expression> should yield an
    environment in which the rest of the file will be evaluated.  This option
    may only appear once, if at all.  If it does not appear, LOAD behaves as if
    the option
    
    		(ENV (MAKE-ENVIRONMENT <file-name> GLOBAL))
    
    had been given, where <file-name> is a string naming the file from which the
    code is being loaded.  This default is proper for most single-file packages.
    Larger applications might have an initial file that creates a new
    environment and binds it to some name in the GLOBAL environment; later files
    would specify an ENV option naming that environment.  In this way, a multi-
    file application can share a single environment among many files.

  (EXPORT <expression> <export-spec>+)
  
    An <export-spec> is either a simple identifier (one without colons) or a
    list of two simple identifiers; an <export-spec> that is a simple identifier
    <id> is exactly equivalent to the <export-spec> (<id> <id>).  The given
    <expression> is evaluated in the same environment as the rest of the file
    (see the description of the ENV herald option, above) after the rest of the
    file has been loaded.  It should yield an environment, say <env>.  For each
    <export-spec> (<id>1 <id>2), the environment <env> is side-effected such
    that the binding for <id>1 in <env> is identical to that for <id>2 in the
    environment for this file.  Two bindings are ``identical'' if a change to
    the value of one produces the same change to the value of the other.  In
    denotational terms, the two environments map the corresponding identifiers
    to the very same location.
    
    This option may appear as many times as desired.  The intent is that during
    the course of loading a file, it will define a variable, local to the file,
    bound to a fresh environment.  An EXPORT option will be used to bind this
    environment to some name, either in the GLOBAL environment or in some other
    environment accessible from the GLOBAL environment.  Further EXPORT options
    will be used to share certain of the bindings in the file with that
    now-public environment.  Naturally, those bindings that are not mentioned in
    EXPORT options will remain entirely local to the environment of the file.
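
The notion of ``identical'' bindings used by EXPORT can be illustrated with a
toy model in which each environment maps symbols to locations and an export
installs the same location under a second name.  This is a sketch under that
assumption only; the TOY- procedures and the variable names LIMIT and
MAX-LENGTH are hypothetical, parent environments are left out, and ERROR is
assumed to be available.

```scheme
;; Toy model: an environment is a one-element vector holding an
;; association list from symbols to locations; a location is a
;; one-element vector holding the current value.
(define (toy-make-env) (vector '()))

(define (toy-location env sym)
  (let ((entry (assq sym (vector-ref env 0))))
    (if entry (cdr entry) (error "unbound variable:" sym))))

(define (toy-ref env sym)
  (vector-ref (toy-location env sym) 0))

(define (toy-set! env sym val)
  (vector-set! (toy-location env sym) 0 val))

(define (toy-define! env sym val)
  (let ((entry (assq sym (vector-ref env 0))))
    (if entry
        (vector-set! (cdr entry) 0 val)
        (vector-set! env 0
                     (cons (cons sym (vector val)) (vector-ref env 0))))))

;; Model of the export-spec (PUBLIC-NAME LOCAL-NAME): make PUBLIC-NAME in
;; TARGET denote the very location that LOCAL-NAME denotes in FILE-ENV.
(define (toy-export! target file-env public-name local-name)
  (vector-set! target 0
               (cons (cons public-name (toy-location file-env local-name))
                     (vector-ref target 0))))

(define file-env (toy-make-env))     ; stands for the file's environment
(define public-env (toy-make-env))   ; stands for the now-public environment

(toy-define! file-env 'limit 80)
(toy-export! public-env file-env 'max-length 'limit)

(toy-set! public-env 'max-length 100)
(define seen-in-file (toy-ref file-env 'limit))          ; 100

(toy-set! file-env 'limit 120)
(define seen-in-public (toy-ref public-env 'max-length)) ; 120
```

Because both names denote one location, an assignment through either
environment is visible through the other, which is the behaviour a scheme
that merely copied values at export time would not give.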

Further herald options might be added in the future, in particular to aid in the
specification of a compiler's early-evaluation environment.

If no herald appears at the beginning of a file of code, the LOAD procedure will
behave as if this herald had been given:

		(HERALD
		   (ENV GLOBAL))

That is, such files are evaluated in the normal global environment, as all files
are now.
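
To make the multi-file arrangement described under the ENV option concrete,
here is a sketch of two files from a hypothetical application.  It is written
in the syntax proposed above and so is meaningful only in an implementation of
this proposal; the name EDITOR, the procedure BANNER, and the division into
two files are all invented for illustration.

```scheme
;;; ---- File one ----
;;; No ENV option, so this file gets a fresh environment whose parent is
;;; GLOBAL.  After the file is loaded, the EXPORT option shares the
;;; file's binding for EDITOR with the GLOBAL environment.
(herald
   (export global
      editor))

(define editor (make-environment "EDITOR" global))

;;; ---- File two ----
;;; The ENV expression is evaluated in GLOBAL, where EDITOR is now bound,
;;; so the definitions below go directly into the shared environment.
(herald
   (env editor))

(define (banner) "EDITOR loaded")
```

Any later file that begins with the same ENV option would see BANNER directly,
and code elsewhere could reach it as EDITOR:BANNER.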


AN EXAMPLE FILE UNDER THE PROPOSAL

Here is a skeletal example of a file that might be part of a Scheme
implementation of the standard string-handling functions.  In this fictitious
Scheme implementation, the full-language environment is named "FOO-SCHEME" and
is bound to that name as well.

Since it implements a low-level structure like strings, the code in the file
uses certain "sub-primitives" from an environment bound to the name PRIVATE in
the FOO-SCHEME environment.

After the file is loaded, the various procedures defined in the
Revised^3 Report will be bound both to their normal names in the
SCHEME-ESSENTIALS and SCHEME environments and also to more concise names
in a new STRINGS environment, itself bound to the name STRINGS in the
GLOBAL environment.

Thus, a client of the string-handling procedures can either use the
names described in the Revised^3 Report (e.g., STRING-REF, STRING-SET!,
STRING-COPY) or the more ``generic'' names from the STRINGS environment
(e.g., STRINGS:REF, STRINGS:SET!, STRINGS:COPY).  The latter might fit
in well with analogous procedures from environments named LISTS,
VECTORS, TABLES, ENVIRONMENTS, etc.

       (herald
          (env (make-environment "StringsImpl" global foo-scheme:private))
          (export scheme-essentials
             (string-ref ref) (string-set! set) ...)
          (export scheme
             (string-copy copy) ...)
          (export strings
             ref copy (set! set) ...)
          (export global
             strings))
       
       (define strings (make-environment "STRINGS"))
       
       (define (ref string index)
          ...)
       
       (define (copy string)
          ...)
       
       (define (set string index value)
          ...)
       
       ...


Q&A ON THE PROPOSAL

Q: Why is the GLOBAL environment separate from the full-language environment?
   Why not just load files into the full-language environment?

A: There are two answers to this, one based upon modularity and aesthetic
   considerations and the other on future plans for the facility.
   
   The modularity argument is that the full-language environment, like the
   SCHEME or SCHEME-ESSENTIALS environments, should present an unchanging
   interface.  Applications should be able to count on the documented behavior
   of the variables in these environments and should not be affected by the
   presence or absence of other applications except where explicitly arranged
   for.
   
   The future plans argument relates to an idea for eliminating the dependence,
   found in many Lisp implementations, of the semantics of code being compiled
   on the presence or absence of other applications in the system running the 
   compiler.  For example, it is frequently the case that macros defined
   globally in the running system are available to code files compiled in that
   system.  This can (and does) lead to confusing, hidden dependencies.  Within
   this proposal, a compiler could create a new GLOBAL environment (again with
   the full-language environment as its sole parent) and use some new herald
   option as a description of what files to load into that environment to
   establish the proper "early-evaluation" environment for the compilation of
   the file.  When the compilation is done, the ersatz GLOBAL environment can be
   discarded, taking with it any side-effects that might have occurred in the
   process of compilation.  Under such a scheme, we would expect that the
   various language environments would be "locked" against redefinition of any
   of their bindings before the time that any user code is loaded, thus ensuring
   that these environments do, indeed, present an unchanging interface.

Q: In what environment should an implementation's read-eval-print loop evaluate
   the forms typed to it?

A: Certainly any such facility should provide a mechanism whereby any
   environment can be used, but our advice for the choice of an initial or
   default environment would be a "scratch" environment (as it is called in T)
   whose sole parent is the GLOBAL environment.  The use of a separate
   environment here is intended to make it easier for a user to avoid
   accidentally redefining names in the GLOBAL environment, thus possibly
   breaking running applications.  Of course, the Scheme report (properly)
   avoids mention of read-eval-print loops, so no implementation is required to
   have one at all.

Q: Why does the EXPORT herald option share actual bindings, rather than simply
   sharing values?

A: This enables the use of shared, set!-able variables.  Code wishing to take
   advantage of this would simply specify an environment containing the binding
   as an ancestor of the file environment.

Q: Why isn't there an IMPORT herald option that shares bindings in the other
   direction?  It would make the use of such shared variables more controllable
   since one would not have to share all of the variables in the given
   environment.

A: Such an option would not be antithetical to this proposal.