[Pavel.pa: A Proposal for Environments in Scheme]
A number of people seem not to have received this, so I am re-sending it.
- Jonathan
Date: Fri, 20 May 88 11:49:26 PDT
From: Pavel.pa at Xerox.COM
To: rrrs-authors at mc.lcs.mit.edu
Re: A Proposal for Environments in Scheme
I'd like to offer the following proposal for a facility for first-class
environments in Scheme. We have found it to work quite cleanly and powerfully
in Cedar Scheme and thus believe that it is worthy of consideration for
inclusion in the standard language.
Pavel Curtis
Xerox PARC/CSL
INTRODUCTION
If Scheme is to grow and be used by more people, one of the problems it must
solve is the robust isolation of different pieces of the system and user code
from each other. The system described in this section is an attempt to cleanly
and simply solve this problem, with an eye toward making the addition of a true
file-compiler a simple extension with clean semantics. It is fully implemented
in Cedar Scheme.
Note that this system does not attempt to address the questions of interfaces
vs. implementations, interface-version compatibility checking, etc. We simply
want it to be possible for normal users to write code that is insulated
reasonably well from the system and from other users.
The presented solution owes a debt to the designers of T (in many obvious ways)
and probably those of other Scheme dialects. As well, it is influenced by our
extensive experience programming in the Cedar language and environment, with its
heavy emphasis on well-defined interfaces. The proposal given here is not,
however, truly the same as any other system of which we're aware; in particular,
it departs strongly from T in some fundamental ways, so draw no premature
conclusions.
The proposal begins by specifying a Scheme interface to first-class
environments, a cornerstone of the new facility. It then describes the initial
set of environments in the system; this arrangement is in place >before< the
first file of code is LOADed. Next, it suggests a syntax and semantics for
files of Scheme code that allows fine-grained but convenient control over the
environment structure. Finally, it presents an example of a file using the new
facility.
The proposal proper is followed by some answers to questions that we anticipate
concerning the proposal.
FIRST-CLASS ENVIRONMENTS
We propose the addition to Scheme of first-class environment values. These are
precisely the environments currently used by the Scheme interpreter with one
cosmetic addition, a human-readable >identifier< for the environment. The
procedures defined on environments are as follows:
MAKE-ENVIRONMENT <id> <parent> ... [Procedure]
Creates and returns a new environment with the given <parents> (if any) and
the given identifying value, <id>, usually a string. <Id> is strictly for
debugging purposes; it is output as a part of the printed representation of
the new environment. The <parents>, on the other hand, are of semantic
interest, since variable-lookup in the resulting environment will continue
into the parents if the requested symbol has no binding in the child
environment. Variable lookup is depth-first and left-to-right among
multiple parents.
ENVIRONMENT? <object> [Procedure]
Returns true if and only if <object> is an environment.
ENVIRONMENT-ID <env> [Procedure]
Return the value specified for the <id> parameter to the call to
MAKE-ENVIRONMENT that created <env>, or #F if none was specified.
ENVIRONMENT-PARENTS <env> [Procedure]
Return a list of the values specified for the <parent> parameters to the
call to MAKE-ENVIRONMENT that created <env>. It is an error to perform
destructive operations on this list.
ENVIRONMENT-REF <env> <symbol> [Procedure]
Return the value bound to <symbol> in <env> (or its ancestors), signalling
an error if no such binding exists.
ENVIRONMENT-SET! <env> <symbol> <value> [Procedure]
Change the value bound to <symbol> in <env> (or its ancestors) to <value>,
signalling an error if no such binding exists.
ENVIRONMENT-DEFINE! <env> <symbol> <value> [Procedure]
Change the value bound to <symbol> in <env> (NOT its ancestors) to <value>,
adding such a binding to <env> if none exists. Note that
ENVIRONMENT-DEFINE! never affects any ancestor of <env>, only <env> itself.
ENVIRONMENT-BOUND? <env> <symbol> [Procedure]
Return true if and only if there exists a binding for <symbol> in <env>
(or its ancestors).
WALK-ENVIRONMENT <fn> <env> [Procedure]
<Fn> should be a procedure of two arguments, a symbol and a value. It is
applied to every binding in <env> itself, NOT including bindings in its
ancestors.
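Illustrative Note: The lookup rules above can be sketched as a small model in portable Scheme. This is only a sketch under stated assumptions, not the Cedar Scheme implementation: the names MAKE-ENV, ENV-LOOKUP, ENV-REF, ENV-SET!, ENV-DEFINE! and ENV-BOUND? are hypothetical stand-ins for the proposed procedures, the vector-and-association-list representation is invented for illustration, and ERROR is assumed to be provided by the host Scheme.

```scheme
;; Hypothetical model: an environment is a vector #(id parents bindings),
;; where bindings is an association list of (symbol . value) pairs.
(define (make-env id . parents) (vector id parents '()))
(define (env-id env) (vector-ref env 0))
(define (env-parents env) (vector-ref env 1))
(define (env-bindings env) (vector-ref env 2))

;; Find the (symbol . value) pair for SYMBOL, searching ENV first and then
;; its parents depth-first, left to right.  Returns #f if there is none.
(define (env-lookup env symbol)
  (or (assq symbol (env-bindings env))
      (let loop ((parents (env-parents env)))
        (and (pair? parents)
             (or (env-lookup (car parents) symbol)
                 (loop (cdr parents)))))))

(define (env-bound? env symbol)
  (if (env-lookup env symbol) #t #f))

(define (env-ref env symbol)
  (let ((binding (env-lookup env symbol)))
    (if binding
        (cdr binding)
        (error "Unbound variable:" symbol))))

;; Changes the nearest existing binding, which may live in an ancestor.
(define (env-set! env symbol value)
  (let ((binding (env-lookup env symbol)))
    (if binding
        (set-cdr! binding value)
        (error "Unbound variable:" symbol))))

;; Only ENV itself is examined or extended, never its ancestors.
(define (env-define! env symbol value)
  (let ((binding (assq symbol (env-bindings env))))
    (if binding
        (set-cdr! binding value)
        (vector-set! env 2 (cons (cons symbol value) (env-bindings env))))))

;; Usage: C has two parents, A (left) and B (right).
(define a (make-env "A"))
(define b (make-env "B"))
(define c (make-env "C" a b))
(env-define! a 'x 1)
(env-define! b 'x 2)
(env-define! b 'y 3)
(env-ref c 'x)          ; => 1, A is searched before B
(env-set! c 'y 30)      ; changes the binding held in B
(env-define! c 'x 10)   ; adds a binding to C; A's x is untouched
```

After the last three forms, looking up X in C yields 10, X in A still yields 1, and Y in B yields 30, which is the difference between the SET! and DEFINE! operations described above.
[End note]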
In addition to these procedures, we propose a change to the meanings of
identifiers whose names include colons:
-- No identifier may begin or end with a colon.
(Alternatively, such identifiers behave as they do now.)
-- An identifier of the form a:b1:...:bk (k >= 1) is entirely equivalent to
the expression
(ENVIRONMENT-REF a:b1:...:bk-1 'bk)
This change allows convenient reference to the values of bindings in an
environment that is bound to some variable in the current environment. For
example, the identifier STRINGS:COPY is equivalent to the expression
(ENVIRONMENT-REF STRINGS 'COPY)
and the identifier CEDAR:IO:RESET is equivalent to the expression
(ENVIRONMENT-REF (ENVIRONMENT-REF CEDAR 'IO) 'RESET)
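Illustrative Note: The rewriting rule for colon identifiers can be sketched as a procedure from symbols to expressions. The names SPLIT-AT-COLONS and EXPAND-COLON-IDENTIFIER are hypothetical and are not part of the proposal; the sketch also assumes the host reader accepts colons inside symbols, and it does not check for the forbidden leading or trailing colon.

```scheme
;; Split STR at each colon, returning a list of substrings.
(define (split-at-colons str)
  (let loop ((i 0) (start 0) (parts '()))
    (cond ((= i (string-length str))
           (reverse (cons (substring str start i) parts)))
          ((char=? (string-ref str i) #\:)
           (loop (+ i 1) (+ i 1) (cons (substring str start i) parts)))
          (else (loop (+ i 1) start parts)))))

;; Rewrite a:b1:...:bk into nested ENVIRONMENT-REF forms.  A simple
;; identifier (one without colons) is returned unchanged.
(define (expand-colon-identifier symbol)
  (let ((names (map string->symbol
                    (split-at-colons (symbol->string symbol)))))
    (let loop ((expr (car names)) (rest (cdr names)))
      (if (null? rest)
          expr
          (loop (list 'environment-ref expr (list 'quote (car rest)))
                (cdr rest))))))

(expand-colon-identifier 'strings:copy)
;; => (environment-ref strings (quote copy))
(expand-colon-identifier 'cedar:io:reset)
;; => (environment-ref (environment-ref cedar (quote io)) (quote reset))
```

Note that only the leftmost component is looked up as an ordinary variable; every later component is quoted and looked up in the environment produced so far.
[End note]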
THE INITIAL ENVIRONMENT STRUCTURE
The initial structure of environments, as seen by the first file LOADed into the
system, consists of at least the following four environments:
"SCHEME-ESSENTIALS"
Contains exactly those bindings described as "essential" in the Scheme
specification, with the semantics given there. Thus, in this environment,
APPEND takes exactly two arguments, LET does not allow the "named-LET"
variant, internal DEFINEs are not allowed, etc.
In addition, the name SCHEME-ESSENTIALS is bound to the environment
itself.
This environment is intended for use by those laboring under severe
portability constraints.
"SCHEME"
Contains exactly those bindings described in the Scheme specification,
except that a given implementation may not provide all of the optional
features. Thus, in this environment, APPEND may or may not accept an
arbitrary number of arguments, LET may or may not allow the named variant,
etc. The strongest statement that can be made about this environment is
that it certainly contains no more than what is described in R3RS and no
less than the "SCHEME-ESSENTIALS" environment.
In addition, the names SCHEME-ESSENTIALS and SCHEME are bound to the
"SCHEME-ESSENTIALS" environment and this environment itself, respectively.
This environment is intended for those who want to work in a reasonably
portable setting with no extras.
Some implementation-dependent name
Contains whatever the implementation considers the bindings of the
language it implements. It should contain everything found in the
"SCHEME" environment, though some may have been extended upward-
compatibly. In addition, other facilities, not described in R3RS,
may be visible here.
Some name should be bound to the environment itself. In addition, the names
SCHEME-ESSENTIALS and SCHEME are bound to the "SCHEME-ESSENTIALS" and
"SCHEME" environments, respectively.
It is assumed that most code written in the implementation will be
evaluated in an environment descended from this one.
In the rest of this proposal, this environment will be called the "full-
language" environment.
"GLOBAL"
Has as its only parent the full-language environment. In addition, the name
GLOBAL is bound to this environment itself.
The GLOBAL environment is distinguished in the definition of the semantics
of code files, below.
By convention, most files of code are evaluated in environments descended
from this one and most other "public" environments will be accessible
through names in the GLOBAL environment.
Implementation Note: In Cedar Scheme, for example, the initial environment
structure is created by code like the following:
  (let* ((scheme-essentials (make-environment "SCHEME-ESSENTIALS"))
         (scheme (make-environment "SCHEME" scheme-essentials))
         (cedar-scheme (make-environment "CEDAR-SCHEME" scheme))
         (global (make-environment "GLOBAL" cedar-scheme)))
    (environment-define! scheme-essentials
                         'scheme-essentials
                         scheme-essentials)
    (environment-define! scheme 'scheme scheme)
    (environment-define! cedar-scheme 'cedar-scheme cedar-scheme)
    (environment-define! global 'global global)
    ...)
Though it is not required by this proposal, all four environments in the initial
structure in Cedar Scheme form a chain through the parent relation. Only the
GLOBAL and full-language environments are required to be so linked.
[End note]
THE SYNTAX AND SEMANTICS OF FILES
In the usual case, a file of Scheme code contains the definitions of a few
variables (most frequently bound to procedures) intended for use by clients of
the software and several more ``helper'' variables intended only for use by the
package itself. The variables to be externally available usually belong in a
separate space of names from those in any other package. The following syntax
and semantics of files, which is upwardly-compatible with what is currently in
use in most dialects of Scheme, is intended to make the usual case easy while
allowing more complex cases to be handled smoothly as well.
A file of Scheme code, under this proposal, may optionally begin with a piece of
syntax, called a >herald<, that arranges for special treatment of the
environmental context of the code in the file. Heralds obey the following
syntax:
(HERALD <option>+)
where each <option> is one of the following:
(ENV <expression>)
The given <expression> is evaluated in the GLOBAL environment described
above before loading the rest of the file. <Expression> should yield an
environment in which the rest of the file will be evaluated. This option
may only appear once, if at all. If it does not appear, LOAD behaves as if
the option
(ENV (MAKE-ENVIRONMENT <file-name> GLOBAL))
had been given, where <file-name> is a string naming the file from which the
code is being loaded. This default is proper for most single-file packages.
Larger applications might have an initial file that creates a new
environment and binds it to some name in the GLOBAL environment; later files
would specify an ENV option naming that environment. In this way, a multi-
file application can share a single environment among many files.
(EXPORT <expression> <export-spec>+)
An <export-spec> is either a simple identifier (one without colons) or a
list of two simple identifiers; an <export-spec> that is a simple identifier
<id> is exactly equivalent to the <export-spec> (<id> <id>). The given
<expression> is evaluated in the same environment as the rest of the file
(see the description of the ENV herald option, above) after the rest of the
file has been loaded. It should yield an environment, say <env>. For each
<export-spec> (<id>1 <id>2), the environment <env> is side-effected such
that the binding for <id>1 in <env> is identical to that for <id>2 in the
environment for this file. Two bindings are ``identical'' if a change to
the value of one produces the same change to the value of the other. In
denotational terms, the two environments map the corresponding identifiers
to the very same location.
This option may appear as many times as desired. The intent is that during
the course of loading a file, it will define a variable, local to the file,
bound to a fresh environment. An EXPORT option will be used to bind this
environment to some name, either in the GLOBAL environment or in some other
environment accessible from the GLOBAL environment. Further EXPORT options
will be used to share certain of the bindings in the file with that
now-public environment. Naturally, those bindings that are not mentioned in
EXPORT options will remain entirely local to the environment of the file.
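Illustrative Note: The notion of ``identical'' bindings used by EXPORT can be sketched by modelling each binding as a shared location. Everything named here (MAKE-FLAT-ENV, ENV-LOCATION, FLAT-DEFINE!, FLAT-REF, FLAT-SET!, EXPORT-BINDING!) is hypothetical, parents are omitted for brevity, and ERROR is assumed to be provided by the host Scheme; this is one possible reading of the semantics, not the Cedar Scheme implementation.

```scheme
;; Hypothetical model: an environment is a one-element vector holding an
;; association list from symbols to locations.  A location is itself a
;; one-element vector holding the current value.
(define (make-flat-env) (vector '()))

(define (env-location env symbol)
  (let ((entry (assq symbol (vector-ref env 0))))
    (if entry
        (cdr entry)
        (error "Unbound variable:" symbol))))

;; Always adds a fresh location at the front of the list, so it shadows
;; any earlier entry for the same symbol.
(define (flat-define! env symbol value)
  (vector-set! env 0 (cons (cons symbol (vector value))
                           (vector-ref env 0))))

(define (flat-ref env symbol)
  (vector-ref (env-location env symbol) 0))

(define (flat-set! env symbol value)
  (vector-set! (env-location env symbol) 0 value))

;; The effect of the export-spec (ID1 ID2): TARGET's binding for ID1
;; becomes the very location that FILE-ENV uses for ID2.
(define (export-binding! target id1 file-env id2)
  (vector-set! target 0
               (cons (cons id1 (env-location file-env id2))
                     (vector-ref target 0))))

;; Usage: share the file's SET under the public name SET!.
(define file-env (make-flat-env))
(define strings (make-flat-env))
(flat-define! file-env 'set 'original-procedure)
(export-binding! strings 'set! file-env 'set)
(flat-set! file-env 'set 'replacement)
(flat-ref strings 'set!)   ; => replacement
```

Because both environments hold the same location, an assignment made through either name is visible through the other, which is what distinguishes sharing bindings from merely copying values.
[End note]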
Further herald options might be added in the future, in particular to aid in the
specification of a compiler's early-evaluation environment.
If no herald appears at the beginning of a file of code, the LOAD procedure will
behave as if this herald had been given:
(HERALD
(ENV GLOBAL))
That is, such files are evaluated in the normal global environment, as all files
are now.
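Illustrative Note: The three cases for choosing a file's environment (an explicit ENV option, a herald without one, and no herald at all) can be sketched as a procedure over the first form read from a file. The names HERALD? and HERALD-ENV-EXPRESSION are hypothetical, and the sketch assumes a well-formed herald; it only computes the expression that LOAD would then evaluate in the GLOBAL environment.

```scheme
;; Is FORM a herald, i.e. a list whose first element is HERALD?
(define (herald? form)
  (and (pair? form) (eq? (car form) 'herald)))

;; Return the expression whose value is the environment in which the
;; rest of the file should be evaluated.
(define (herald-env-expression first-form file-name)
  (if (herald? first-form)
      (let ((option (assq 'env (cdr first-form))))
        (if option
            (cadr option)                                  ; explicit ENV option
            (list 'make-environment file-name 'global)))   ; default for heralds
      'global))                                            ; no herald at all

(herald-env-expression '(herald (env my-application)) "part2.scm")
;; => my-application
(herald-env-expression '(herald (export global strings)) "strings.scm")
;; => (make-environment "strings.scm" global)
(herald-env-expression '(define x 1) "old-style.scm")
;; => global
```

The file names and the variable MY-APPLICATION in the usage lines are made up for the example.
[End note]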
AN EXAMPLE FILE UNDER THE PROPOSAL
Here is a skeletal example of a file that might be part of a Scheme
implementation of the standard string-handling functions. In this fictitious
Scheme implementation, the full-language environment is named "FOO-SCHEME" and
is bound to that name as well.
Since it implements a low-level structure like strings, the code in the file
uses certain "sub-primitives" from an environment bound to the name PRIVATE in
the FOO-SCHEME environment.
After the file is loaded, the various procedures defined in the
Revised^3 Report will be bound both to their normal names in the
SCHEME-ESSENTIALS and SCHEME environments and also to more concise names
in a new STRINGS environment, itself bound to the name STRINGS in the
GLOBAL environment.
Thus, a client of the string-handling procedures can either use the
names described in the Revised^3 Report (e.g., STRING-REF, STRING-SET!,
STRING-COPY) or the more ``generic'' names from the STRINGS environment
(e.g., STRINGS:REF, STRINGS:SET!, STRINGS:COPY). The latter might fit
in well with analogous procedures from environments named LISTS,
VECTORS, TABLES, ENVIRONMENTS, etc.
  (herald
    (env (make-environment "StringsImpl" global foo-scheme:private))
    (export scheme-essentials
            (string-ref ref) (string-set! set) ...)
    (export scheme
            (string-copy copy) ...)
    (export strings
            ref copy (set! set) ...)
    (export global
            strings))

  (define strings (make-environment "STRINGS"))

  (define (ref string index)
    ...)

  (define (copy string)
    ...)

  (define (set string index value)
    ...)

  ...
Q&A ON THE PROPOSAL
Q: Why is the GLOBAL environment separate from the full-language environment?
Why not just load files into the full-language environment?
A: There are two answers to this, one based upon modularity and aesthetic
considerations and the other on future plans for the facility.
The modularity argument is that the full-language environment, like the
SCHEME or SCHEME-ESSENTIALS environments, should present an unchanging
interface. Applications should be able to count on the documented behavior
of the variables in these environments and should not be affected by the
presence or absence of other applications except where explicitly arranged
for.
The future plans argument relates to an idea for eliminating the dependence,
found in many Lisp implementations, of the semantics of code being compiled
on the presence or absence of other applications in the system running the
compiler. For example, it is frequently the case that macros defined
globally in the running system are available to code files compiled in that
system. This can (and does) lead to confusing, hidden dependencies. Within
this proposal, a compiler could create a new GLOBAL environment (again with
the full-language environment as its sole parent) and use some new herald
option as a description of what files to load into that environment to
establish the proper "early-evaluation" environment for the compilation of
the file. When the compilation is done, the ersatz GLOBAL environment can be
discarded, taking with it any side-effects that might have occurred in the
process of compilation. Under such a scheme, we would expect that the
various language environments would be "locked" against redefinition of any
of their bindings before the time that any user code is loaded, thus ensuring
that these environments do, indeed, present an unchanging interface.
Q: In what environment should an implementation's read-eval-print loop evaluate
the forms typed to it?
A: Certainly any such facility should provide a mechanism whereby any
environment can be used, but our advice for the choice of an initial or
default environment would be a "scratch" environment (as it is called in T)
whose sole parent is the GLOBAL environment. The use of a separate
environment here is intended to make it easier for a user to avoid
accidentally redefining names in the GLOBAL environment, thus possibly
breaking running applications. Of course, the Scheme report (properly)
avoids mention of read-eval-print loops, so no implementation is required to
have one at all.
Q: Why does the EXPORT herald option share actual bindings, rather than simply
sharing values?
A: This enables the use of shared, set!-able variables. Code wishing to take
advantage of this would simply specify an environment containing the binding
as an ancestor of the file environment.
Q: Why isn't there an IMPORT herald option that shares bindings in the other
direction? It would make the use of such shared variables more controllable
since one would not have to share all of the variables in the given
environment.
A: Such an option would not be antithetical to this proposal.