
Re: A Proposal for Environments in Scheme



I've received four responses to the environments proposal.  Let me try to answer
the questions posed in those responses.

    From: David Bartley <bartley@mips.csc.ti.com>

    -- Suppose we had (HERALD (EXPORT ENV (FOO BAR))).  What is the effect
    if FOO already has a binding in ENV?

In our current implementation of Cedar Scheme, the effect is to lose the old
binding for FOO.  This doesn't quite feel right, but I don't see a way around it
if exporting is really going to share bindings.  It may be that exported
SET!-able variables are frowned upon enough that sharing bindings is not
considered worthwhile; in that case, we could just set the current binding of
FOO.  This has the pleasant implication that the binding of a symbol in an
environment is constant; thus code can look up that binding exactly once at
load-time and count on seeing changes to the name-value mapping that happen
later, such as by redefinition.

    What is the effect if BAR is
    bound in an ancestor of the current (file's) environment but not
    directly in the current environment?

There is no difference between this and the usual case; the binding is shared.

    -- What is the effect of (HERALD (EXPORT ENV (A B) (C B)))?  Is this
    an error or are all three identifiers bound to the same location?

All three identifiers are bound to the same location.
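
A minimal sketch of the sharing described in the answers above, written in
portable Scheme.  It is a hypothetical model, not the Cedar Scheme
implementation: the names MAKE-ENV, ENV-INSTALL! and ENV-EXPORT! are invented
for illustration, environments here have no parents, and a binding is modelled
as a one-slot vector.

```scheme
;; Hypothetical model of shared bindings (all names here are assumptions).
;; A binding is a mutable one-slot cell; an environment is a vector whose
;; slot 0 holds an association list from symbols to bindings.
(define (make-binding value) (vector value))
(define (binding-value b) (vector-ref b 0))
(define (set-binding-value! b v) (vector-set! b 0 v))

(define (make-env) (vector '()))
(define (env-table env) (vector-ref env 0))

;; Return the binding of NAME in ENV, or #f if there is none.
(define (env-binding env name)
  (let ((entry (assq name (env-table env))))
    (if entry (cdr entry) #f)))

;; Install BINDING under NAME, dropping any binding NAME had before.
(define (env-install! env name binding)
  (let ((entry (assq name (env-table env))))
    (if entry
        (set-cdr! entry binding)
        (vector-set! env 0 (cons (cons name binding) (env-table env))))))

(define (env-define! env name value)
  (env-install! env name (make-binding value)))

;; Model of one (NEW-NAME LOCAL-NAME) pair in an EXPORT clause:
;; NEW-NAME in TARGET shares the binding LOCAL-NAME has in SOURCE.
(define (env-export! source local-name target new-name)
  (env-install! target new-name (env-binding source local-name)))

;; Usage, modelling (EXPORT ENV (A B) (C B)) where A is already bound.
(define file-env (make-env))
(define target-env (make-env))
(env-define! file-env 'b 1)
(env-define! target-env 'a 'old)
(define old-a (env-binding target-env 'a)) ; a reference captured earlier
(env-export! file-env 'b target-env 'a)    ; the old binding of A is lost
(env-export! file-env 'b target-env 'c)
(set-binding-value! (env-binding file-env 'b) 2)
```

After the last line, A and C in TARGET-ENV and B in FILE-ENV all see the new
value because they share one cell, while OLD-A still holds the value it had
before the export.  That stale cell is the "lost" binding discussed above.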

    -- In your Q&A section, you mention "locking" of environments to
    protect them against change.  This is an intriguing idea.  Would this
    be done at the environment level or a binding at a time?

My intent was that the locking would happen at the environment level.  That is,
all bindings in the environment would be rendered un-SET!-able and no new
bindings would be allowed.

    Can one
    export a binding from an unlocked environment into a locked
    environment?

Since this would involve the addition of a new binding to the locked
environment, the answer is no.

    Likewise, what
    happens if you export a binding from a locked environment into an
    unlocked one?

Hmm.  I suppose that this would imply that that binding was un-SET!-able,
regardless of what environment you found it in.  This seems to imply that
locking is happening both at the environment level and at the level of the
individual bindings.
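
One way to picture locking at both levels is the sketch below.  It is only an
illustration under stated assumptions: the proposal does not define these
procedures, the names MAKE-LENV, LENV-LOCK!, LENV-INSTALL! and LENV-SET! are
invented, and failure is reported by returning #f rather than by signalling an
error.

```scheme
;; Hypothetical model of locking (all names here are assumptions).
;; A binding is #(value locked?); an environment is #(table locked?).
(define (make-lbinding value) (vector value #f))
(define (lbinding-value b) (vector-ref b 0))
(define (lbinding-locked? b) (vector-ref b 1))

(define (make-lenv) (vector '() #f))
(define (lenv-table e) (vector-ref e 0))
(define (lenv-locked? e) (vector-ref e 1))

(define (lenv-binding e name)
  (let ((entry (assq name (lenv-table e))))
    (if entry (cdr entry) #f)))

;; Add a binding; refused (returns #f) when the environment is locked.
(define (lenv-install! e name binding)
  (if (lenv-locked? e)
      #f
      (begin
        (vector-set! e 0 (cons (cons name binding) (lenv-table e)))
        #t)))

;; Assign to NAME; refused when NAME is unbound or its binding is locked.
(define (lenv-set! e name value)
  (let ((b (lenv-binding e name)))
    (if (or (not b) (lbinding-locked? b))
        #f
        (begin (vector-set! b 0 value) #t))))

;; Lock the environment and every binding currently in it.
(define (lenv-lock! e)
  (for-each (lambda (entry) (vector-set! (cdr entry) 1 #t))
            (lenv-table e))
  (vector-set! e 1 #t))

;; Usage: X lives in a locked environment and is exported as Y.
(define locked-env (make-lenv))
(define open-env (make-lenv))
(lenv-install! locked-env 'x (make-lbinding 1))
(lenv-lock! locked-env)
(define exported? (lenv-install! open-env 'y (lenv-binding locked-env 'x)))
(define set-ok? (lenv-set! open-env 'y 5))
(define added? (lenv-install! locked-env 'z (make-lbinding 0)))
```

In this model the export into OPEN-ENV succeeds, the assignment to Y is
refused because the shared binding carries its own lock, and the attempt to
add Z to LOCKED-ENV is refused because the environment itself is locked.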

    -- The EXPORT mechanism doesn't require any hierarchical or other
    relationship between the specified environment and the
    current environment, so it is possible to inject a shared binding into
    any environment you can get your hands on.  Such a binding could be a
    trojan horse.  Perhaps it should be possible to "lock" an environment
    against being exported into (?!) as well?

This is precisely the question you asked earlier about exporting from an
unlocked environment into a locked one.

    -- Have you found it to be important to have the `:' notation as a
    short-hand for ENVIRONMENT-REF ?  My previous proposals to reserve the
    `:' character for CL-style package notation have been soundly drubbed.

We like to think about environments like STRINGS in the example as
``interfaces'' in the sense of languages like Cedar (surprise...) and Modula-2.
Thus, in general, we believe that programs will not have environments that
inherit from environments like STRINGS but will explicitly ``qualify'' all
references to names in such environments.  We have found that when systems like
this get going and there are a large number of environments/interfaces around,
inheriting from an interface environment (it's called OPENing the interface in
Cedar) is very confusing in general.  There are cases where it makes sense to
OPEN exactly one interface (for example, all of the Cedar code implementing the
Scheme interpreter and primitives OPENs the Scheme interface, which provides the
type declarations and useful procedures like LookupVariableValue), but the vast
majority of references to names in other environments are fully-qualified.  If
one had to use an explicit call to ENVIRONMENT-REF (or even ACCESS in C-Scheme,
which is shorter), then the typing/reading penalty for separating environments
would likely outweigh the modularity benefits.  Thus, we feel quite strongly
that some low-overhead notation for ``structured names'' is necessary.

    Just for grins, how would you react to a notation using curly braces, so
    your A:B1:B2 becomes {{A}B1}B2, and STRINGS:SET! becomes {STRINGS}SET! ?

Hmm.  Well, it's not entirely grotesque, but I think that the colons work
better.  Cedar uses periods where we use colons, but I think too many people use
periods to separate words in identifiers for this to go over well.  Besides, the
colons here have a purpose not entirely unlike colons in Common Lisp, though
they have much cleaner semantics here.

    From: Morris Katz <MKATZ@A.ISI.EDU>

    1)  I believe it is a big mistake to require that all environments have a
    name.  There was a good reason(s) for deciding that not all procedures in
    Lisp should have to be named, and I believe that those arguments apply
    equally well to this case.

I think that you misunderstood the purpose of the <id> argument to
MAKE-ENVIRONMENT.  It is not a name in the sense that a given name maps to a
single environment.  Its only purpose is to be some nice thing to show as a part
of the printed representation of the environment.  It also helps humans have a
more reasonable handle for talking about environments.  The <id> has absolutely
no semantic content except that it can be extracted from the environment with
the procedure ENVIRONMENT-ID.  There are no guarantees that <id>'s are unique
and no functions for mapping an <id> into the environments having it.

    2)  I have been interested in the semantics and use of multiple environment
    inheritance for some time; but, I am not sure that it is well enough
    understood to become part of R3RS.  (Maybe someone else understands it
    better than I do and can convince me otherwise.)  In particular, I am
    concerned about efficient implementations for interpreted code, difficulty
    in understanding code which utilizes this feature, and the development of
    suitable browser technology to make debugging of programs using it
    tractable.

It is easy to implement multiple parents efficiently in the interpreter.  In
Cedar Scheme, environments have two fields for parent information; one of the
fields holds the first parent and the other holds a list of the other parents.
In almost all cases, single inheritance is used and the otherParents field is
NIL.  Thus, no consing of environment lists occurs in the interpreter.  The
variable lookup procedure is straightforward and, in the case where otherParents
is NIL, is precisely the same as the single inheritance case.
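
The two-field layout can be sketched in portable Scheme as follows.  This is
an illustrative model rather than the Cedar code: the names MAKE-PENV,
PENV-DEFINE! and PENV-LOOKUP are invented, and the search order (the
environment itself, then the first parent's chain, then each of the other
parents in turn) is an assumption, since the description above does not fix
one.

```scheme
;; Hypothetical model of multiple parents (all names here are assumptions).
;; An environment is #(table first-parent other-parents); FIRST-PARENT is
;; an environment or #f, and OTHER-PARENTS is a possibly empty list.
(define (make-penv first-parent other-parents)
  (vector '() first-parent other-parents))
(define (penv-table env) (vector-ref env 0))
(define (penv-first-parent env) (vector-ref env 1))
(define (penv-other-parents env) (vector-ref env 2))

(define (penv-define! env name value)
  (vector-set! env 0 (cons (cons name value) (penv-table env))))

;; Return the (name . value) pair for NAME, or #f if NAME is unbound.
;; When OTHER-PARENTS is empty this is the ordinary single-inheritance
;; walk up the chain of first parents, and nothing is consed.
(define (penv-lookup env name)
  (cond ((not env) #f)
        ((assq name (penv-table env)))
        ((penv-lookup (penv-first-parent env) name))
        (else (penv-lookup-in-list (penv-other-parents env) name))))

(define (penv-lookup-in-list envs name)
  (cond ((null? envs) #f)
        ((penv-lookup (car envs) name))
        (else (penv-lookup-in-list (cdr envs) name))))

;; Usage: a program environment with GLOBAL as its first parent and an
;; ITERATE-style environment as an additional parent.
(define global-env (make-penv #f '()))
(define iterate-env (make-penv #f '()))
(define program-env (make-penv global-env (list iterate-env)))
(penv-define! global-env 'width 80)
(penv-define! iterate-env 'step 1)
(penv-define! program-env 'width 132)
```

With these definitions, looking up WIDTH from PROGRAM-ENV finds the local
binding, STEP is found through the additional parent, and an unbound name
yields #f.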

As I pointed out above, extensive use of multiple parents can indeed make code
hard to read.  On the other hand, judicious and controlled use works quite well
to improve readability.  The best use is to include certain optional language
features that are kept in separate environments for just this purpose.  For
example, in Cedar Scheme, there is an environment called ITERATE that provides
the entry points for a fancy iteration facility.  Not everyone likes the
facility, so it is not a part of the full language we export.  However, programs
wishing to use the facility can include ITERATE as a parent of their
environments and thus get seamless access to it.

I have not found fancy browser technology necessary for debugging such code.
The fact that environments print with a useful name (like the name of the file
for most code) and the fact that our debugger prints out the environment of the
current expression has made it very reasonable to work with these environments.

    3)  It is my belief that the loader should load the contents of files into
    the environment of the REP LOOP from which the load was initiated, unless
    specified otherwise in the load command.  (e.g., (load "foo" bar) would load
    the file "foo" into the environment bar.)  Making all files load into the
    global environment unless specified otherwise in a herald seems artificial
    and constraining.  What if one wants to load two slightly different copies
    of a system into two sets of environments and compare their execution?
    Using the herald approach I believe this would require copying all of the
    files and changing the heralds.  The logical extension of this approach
    would be that <expression> in (env <expression>) in a herald be evaluated
    in the load environment, rather than the global environment.

Please note that, under our proposal, any file containing a herald is loaded
into a *fresh* environment inheriting from the GLOBAL environment.  Thus,
programs in different files are protected from each other.  I have no objection
to an optional argument to LOAD that provided it with a different idea of what
the GLOBAL environment was.  This would allow you to load the same program into
multiple environments for the effect you ask for above.

We're trying, in this proposal, to make it possible for large numbers of
applications to coexist with some reasonable assurances about namespace
protection.  We envision systems the size of Cedar or the Symbolics Genera
system with many hundreds of software packages all loaded into the same address
space and working together well because of separate namespaces and well-defined
interfaces.  This level of modularity is new to Scheme and the other Lisp-like
languages but is now commonplace in systems like Cedar and Modula-2+.  The
proposal lets this happen while still allowing smaller systems the very same
flexibilities currently allowed in most implementations.

    4) I can't quite explain why, but the entire export section makes me feel a
    little uneasy.  While there are some things that can be done conveniently
    through the unification of two symbols in different environments, I get the
    feeling that this feature can very quickly get one into trouble.  How do we
    build debuggers to help support its use?

In Cedar Scheme, the binding object contains more information than the value of
the binding.  It also contains the original name for the binding in the
environment in which it was created.  If we wished, we could also store that
original environment in there and any other information that one might care to
have around for debugging.  I don't see that this proposal makes debugging any
more difficult.

    What are the costs added to normal lexical lookup due to this feature?

The only cost is that ``top-level'' environments (the kind made by
MAKE-ENVIRONMENT) must use an extra level of indirection to get to the value for
a name.  Local environments, used by the interpreter for normal lambda-binding,
need not do this, since they can never be accessed by a program and can thus
never be named in an export clause.

    Does it interact pathologically with incremental definition, or is this
    just an erroneous gut level feeling?

As I described in answer to David's questions above, there is a funny
interaction here with REdefinition, but not with simple incremental definition.

    From: Andy Freeman <andy@polya.stanford.edu>
    
    In the herald, env is bad for the same reasons that exp would be.

Granted.  I would not object to the use of ``environment'' instead of ``env''.

    Regarding the presentation, it would be nice if you mentioned using (or
    not, as appropriate) the scheme-essentials environment to implement
    the scheme and <implementation> environment.  At first reading, it
    looks like these environments have to have bindings for all of the
    language even if they could inherit it from scheme-essentials but your
    example makes it clear that they can use inheritance.

I wanted the semantics of the environments to be clear.  In particular, I did
not want to make it seem as if the various environments were required to inherit
in the obvious fashion.  The only required inheritance link is between the
GLOBAL and full-language environments.

    I understand why you didn't write much about the default environment
    in various situations, but someone should.

I'm not sure I understand what you mean here.  I think that I described all of
the environments in which evaluation takes place in R3RS Scheme.

    I know that it can be written, but it would be nice if one of the
    optional functions told which environment a symbol was bound in.  <I
    forget>-bound? tells you whether a symbol is bound in an environment
    or one of its ancestors, but knowing which one would be nice.  Argh,
    it might even be worth knowing all of the environments that a symbol
    is bound in.

I took the view of the rest of R3RS Scheme, that those things a user could write
for themself need not be included in the language.

    Should there be an environment analog of call-with-current-continuation?

I've not yet seen a good use for this.

    BTW - Why did you choose to use strings for identifying environments
    instead of say, symbols?

Let me point out that the <id> value can be any Scheme value at all, since it
has no semantics.  I chose strings instead of symbols in order to reduce
confusion of <id>'s with the names to which environments are bound in certain
other environments.  Thus the difference between the "SCHEME" environment and
the value bound to the symbol 'scheme in that environment.

    From: John D. Ramsdell <ramsdell%linus@mitre-bedford.ARPA>
    
    What exactly is an identifier?  Is it a symbol whose print name does
    not end or begin with a colon?  If so, the print name determines the
    interpretation of an identifier.  This mixes up the concept of how to
    print a symbol with the concept of how to use it as an identifier.

    Maybe colon is a read time macro, and no symbol is allowed to contain
    colon.  

    [1] a:b  ==>  (eval (ENVIRONMENT-REF B (QUOTE A)))

    [2] 'a:b  ==>  (ENVIRONMENT-REF B (QUOTE A))

    [3] (string->symbol "A:B")  ==>  ?

    By your description [1] is true, what about [2]?  What is [3]?

An identifier is a token precisely as described in R3RS.  Its internal
representation is less clear.  In the same way that R3RS waffles about what
happens when I type

		''foo

to my read-eval-print loop, I guess I'd better waffle about colon here.  I can
tell you that in Cedar Scheme, when the reader finds an identifier by R3RS
syntax, it looks for colons in the name and, if found, constructs an
ENVIRONMENT-REF expression to return.  Thus, in a sense, colons are treated as a
read macro in Cedar Scheme.

By the way, in your choices above, you've got the arguments to ENVIRONMENT-REF
backwards.  In both cases, the expression should be

		(ENVIRONMENT-REF A (QUOTE B))
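
The reader behaviour described above can be sketched as an ordinary procedure
over symbols.  This is a hypothetical illustration, not the Cedar Scheme
reader: the names SPLIT-ON-COLON and EXPAND-STRUCTURED-NAME are invented, the
nesting of longer names is assumed to be left to right (as the A:B1:B2 example
earlier suggests), and names with a leading or trailing colon are not handled.

```scheme
;; Split STR at each colon, returning a list of strings.
(define (split-on-colon str)
  (let loop ((chars (string->list str)) (current '()) (parts '()))
    (cond ((null? chars)
           (reverse (cons (list->string (reverse current)) parts)))
          ((char=? (car chars) #\:)
           (loop (cdr chars)
                 '()
                 (cons (list->string (reverse current)) parts)))
          (else
           (loop (cdr chars) (cons (car chars) current) parts)))))

;; Turn a symbol such as a:b into (environment-ref a (quote b)).
;; A symbol without colons is returned unchanged.  Longer names nest
;; from the left, so each prefix names the environment for the next part.
(define (expand-structured-name sym)
  (let ((parts (map string->symbol
                    (split-on-colon (symbol->string sym)))))
    (let loop ((expr (car parts)) (rest (cdr parts)))
      (if (null? rest)
          expr
          (loop (list 'environment-ref expr (list 'quote (car rest)))
                (cdr rest))))))

;; Usage; the symbols are built with STRING->SYMBOL so the example does
;; not depend on how a particular reader treats colons.
(define simple (expand-structured-name (string->symbol "a:b")))
(define nested (expand-structured-name (string->symbol "a:b1:b2")))
(define plain  (expand-structured-name (string->symbol "foo")))
```

Here SIMPLE is a list equal to (environment-ref a (quote b)), and NESTED
wraps a second ENVIRONMENT-REF around the expression for a:b1.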

These are all of the responses I've received so far.  Further comment?

	Pavel