
On Modules in Scheme: Principles and Proposals



Well, after carefully reading and considering the comments I've received on the
environment proposal, and after having many, many hours of talks with a variety
of people around here on the general subject of modules (be they in Scheme or
some other language), and after having gotten to the point in the development of
our compiler where I need more help than our proposal was offering, I formally
retract our previous proposal.

In its place, I offer the following ideas for discussion.  There are many
details left out (most of them intentionally), but I think that what's here is
the real core of what I believe about what a module facility can and should
provide and how it should be structured.  I have freely stolen ideas and names
from others, both on and off this list, but I have not always left those ideas
and names entirely intact.  Caveat lector.


Principles: There exist written and (at least partially) machine-understandable
statements of the contracts between different pieces of software.  These
``interfaces'' describe all paths of contact between separate pieces of
software; at the level of normal programming (as opposed to facilities provided
by the debugger, for example), no other access is possible.  In addition to
specifying the names of the values available in the interface, other properties
of the values might be given, such as type information, partial or complete
implementations (for integration by, or providing hints to, early evaluators),
formal or informal assertions of the semantics of the values, etc.

Proposals: There exist entities known as ``signatures'' containing at least a
set of names for values and possibly other information as stated above.  These
are not necessarily Scheme data objects.  Signatures have names representable as
Scheme values.  There is a mechanism by which a given signature name can be
associated with a particular signature.  This mechanism may or may not be a part
of the Scheme language.
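
To make this concrete, here is a minimal sketch of one possible rendering, not
part of the proposal itself.  It assumes a signature is just a tagged vector
holding a name and a list of value names, and that the name-to-signature
mechanism is an association list.  Every procedure name below is hypothetical,
and ERROR is assumed to be supplied by the implementation (R3RS does not
specify one).

```scheme
;; Hypothetical representation: #(signature <name> <value-names>).
(define (make-signature name value-names)
  (vector 'signature name value-names))
(define (signature-name sig) (vector-ref sig 1))
(define (signature-value-names sig) (vector-ref sig 2))

;; The association between signature names and signatures, modelled
;; here as a global association list keyed by symbol.
(define *signature-table* '())

(define (register-signature! sig)
  (set! *signature-table*
        (cons (cons (signature-name sig) sig) *signature-table*)))

(define (lookup-signature name)
  (let ((entry (assq name *signature-table*)))
    (if entry
        (cdr entry)
        (error "No signature registered under name:" name))))

;; Usage: register a signature and ask which names it declares.
(register-signature! (make-signature 'queue '(make-queue enqueue! dequeue!)))
(signature-value-names (lookup-signature 'queue))
```

Types, assertions and the other optional information are left out; they could
ride along as further slots in the same vector.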



Principles: It is recognized that interfaces will change over time as providers
and clients learn more about the desired models and paths of communication.  It
is also recognized that early evaluators must, for performance reasons, be
allowed to make certain assumptions about the interfaces (and the values they
represent).  We desire that there be a consistent, non-temporal model of the
semantics of programs, regardless of whether or not they have, at some point,
been partially evaluated.  Thus, it must be possible for an early evaluator to
``build in'' information from the interfaces and to store in the resulting
output some identification of the information used.  This stored identification
should then be used by later evaluators to guarantee that a consistent view of
the possibly-changed interfaces is being employed.

Proposals: Signatures have ``versions'', representable as Scheme values, drawn
from some partial order.  There is a procedure for comparing two versions
according to the partial order.  There is a ``least'' version according to the
partial order; we refer to this as the ``bottom'' version.
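
As an illustration only, one partial order with a least element is subset
ordering on sets.  The sketch below assumes (purely for the sake of example) that
a version is a list of revision tags, that one version precedes another when
all of its tags appear in the other, and that the empty list is ``bottom''.
The names VERSION<=? and VERSION-COMPARE are hypothetical.

```scheme
;; Assumption: a version is a list of symbols naming revisions; the
;; empty list is the least ("bottom") version.
(define bottom-version '())

;; A precedes-or-equals B when every tag of A also appears in B.
(define (version<=? a b)
  (or (null? a)
      (and (memq (car a) b)
           (version<=? (cdr a) b))))

;; Compare two versions under the partial order.  Unlike a total
;; order, two versions may turn out to be incomparable.
(define (version-compare a b)
  (let ((le (version<=? a b))
        (ge (version<=? b a)))
    (cond ((and le ge) 'equal)
          (le 'less)
          (ge 'greater)
          (else 'incomparable))))

;; Usage:
(version-compare bottom-version '(r1))   ; less
(version-compare '(r1 r2) '(r2 r1))      ; equal
(version-compare '(r1) '(r2))            ; incomparable
```

The last case is the reason for asking for only a partial order: two
independently revised copies of a signature need not be ordered with respect
to each other.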



Principles: There must be some mechanism by which the code of a client of an
interface can refer to the values the interface represents.  Symmetrically,
there must be a mechanism by which the code of an interface provider can specify
values to be associated with names in that interface.  This should be possible
without either party needing to be aware of the identity or quantity of the
other.  It should be possible for client code to efficiently fetch the values
and for lambda-enclosed references to exist before provider code for the values
has been evaluated.  It should be possible for a provider to supply values for
only a part of an interface.

Proposals: There exist Scheme values called ``structures''.  Creating a
structure requires the name of a signature and values for some, none, or all of
the names in the named signature.  The given values must agree with the
information about them in the signature.  Some or all of that information may be
checked during structure creation.  There is an operation for fetching the
signature name and version from a structure.

There exist Scheme values called ``bindings''.  A binding contains a single
``value'' which may be specified at the time of binding creation; if it is not
specified, the value is said to be ``undefined''.  Structures map names into
bindings.  There are operations for getting the value of a binding, for testing
whether or not the value is undefined and for setting the value; the value may
only be set if it is currently undefined.
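
A minimal sketch of bindings and structures follows, under several
assumptions: bindings are two-slot vectors, ``undefined'' is a unique marker
object, fetching the value of an undefined binding signals an error (the
proposal does not say what happens), and the only checking done at structure
creation is that each supplied name belongs to the signature.  All procedure
names are hypothetical, and ERROR is assumed to be supplied by the
implementation.

```scheme
(define undefined-marker (list 'undefined))     ; unique "no value" object

(define (make-binding . maybe-value)
  (vector 'binding
          (if (null? maybe-value) undefined-marker (car maybe-value))))
(define (binding-undefined? b) (eq? (vector-ref b 1) undefined-marker))
(define (binding-value b)
  (if (binding-undefined? b)
      (error "Binding is undefined")
      (vector-ref b 1)))
(define (binding-set! b v)                      ; allowed only once
  (if (binding-undefined? b)
      (vector-set! b 1 v)
      (error "Binding already has a value")))

;; NAMES lists every name in the signature; SUPPLIED is an association
;; list giving values for some, none, or all of them.
(define (make-structure sig-name version names supplied)
  (for-each (lambda (entry)
              (if (not (memq (car entry) names))
                  (error "Name not in signature:" (car entry))))
            supplied)
  (vector 'structure sig-name version
          (map (lambda (name)
                 (let ((entry (assq name supplied)))
                   (cons name
                         (if entry (make-binding (cdr entry)) (make-binding)))))
               names)))
(define (structure-signature-name s) (vector-ref s 1))
(define (structure-version s) (vector-ref s 2))
(define (structure-binding s name)
  (let ((entry (assq name (vector-ref s 3))))
    (if entry (cdr entry) (error "Name not in signature:" name))))

;; Usage: a partial provider, and a client that refers to ENQUEUE!
;; before any provider has supplied it.
(define q (make-structure 'queue '() '(make-queue enqueue!)
                          (list (cons 'make-queue (lambda () '())))))
(define (client x)
  ((binding-value (structure-binding q 'enqueue!))
   ((binding-value (structure-binding q 'make-queue)))
   x))
(binding-set! (structure-binding q 'enqueue!)
              (lambda (queue x) (cons x queue)))
(client 'a)
```

CLIENT is defined while the ENQUEUE! binding is still undefined; the reference
is resolved only when CLIENT is called, by which time a provider has filled
the binding in.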



Principles: Code that is a client of a particular interface should be clearly
marked as such, for both documentation and semantic purposes.  It should be
possible for a given piece of code to be a client of many interfaces as well as
a (partial or complete) provider of many more.

Proposals: There exist Scheme values called ``modules''.  A module contains,
possibly among other things, an ordered list of signature names and versions
(its ``imports'') and a procedure (its ``body'').  The body takes exactly as
many arguments as there are signature name/version pairs in the imports.  Those
arguments should be structures formed from signatures with matching names and
with versions greater than or equal to those listed in the imports; this
assertion is tested by the body and an error is signalled if the condition is
not met.  The body may return any value
or values, as desired.  In particular, the body might return one or more
structures, intending that the invoker will treat them as the provision of
values for some signatures.
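
The import check might be sketched as follows.  This is an illustration under
stated assumptions, not a definitive design: structures are reduced to a
three-element list, versions are lists of revision tags ordered by inclusion
with the empty list as bottom, and the check is wrapped around a user-supplied
procedure when the module is made.  All names here are hypothetical, and ERROR
is assumed to be supplied by the implementation.

```scheme
;; Stand-in version order: A <= B when every tag of A appears in B.
(define (version<=? a b)
  (or (null? a)
      (and (memq (car a) b)
           (version<=? (cdr a) b))))

;; Stand-in structure: (sig-name version bindings).
(define (make-structure sig-name version bindings)
  (list sig-name version bindings))
(define (structure-signature-name s) (car s))
(define (structure-version s) (cadr s))

;; IMPORTS is an ordered list of (sig-name . version) pairs.  The body
;; stored in the module checks its arguments before running RAW-BODY.
(define (make-module imports raw-body)
  (define (checked-body . structures)
    (if (not (= (length structures) (length imports)))
        (error "Wrong number of structures for module body"))
    (for-each
     (lambda (import s)
       (if (not (and (eq? (car import) (structure-signature-name s))
                     (version<=? (cdr import) (structure-version s))))
           (error "Structure does not satisfy import:" import)))
     imports structures)
    (apply raw-body structures))
  (vector 'module imports checked-body))
(define (module-imports m) (vector-ref m 1))
(define (module-body m) (vector-ref m 2))

;; Usage: a module built against version (r1) of QUEUE accepts a
;; structure of the later version (r1 r2).
(define m
  (make-module (list (cons 'queue '(r1)))
               (lambda (q) (structure-signature-name q))))
((module-body m) (make-structure 'queue '(r1 r2) '()))
```

Passing a structure for a different signature, or one whose version is older
than or incomparable with (r1), would signal an error before the user's code
runs.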



Principles: It should be possible for code to refer to the names from some
interfaces without having to qualify them with some identification of the
interface.  For example, the ``built-in'' facilities of the programming language
are a likely choice for such treatment.  However, this set of interfaces should
not be a constant; different programs should be allowed to specify different
sets.  In general, such ``opening'' of an interface can be confusing and
possibly dangerous; accordingly, most imported values should be referred to by
names that indicate the interface from which they are derived.

Proposals: There is a new kind of Scheme expression for the creation of a
module:

	(MODULE (<sig-name> ... (<identifier> <sig-name>) ...)
	   <program>)

where <sig-name> is a signature name.

The first set of names specifies the set of signatures whose names are to be
available without extra qualification.  These are called the ``open (or unnamed)
imports''.  An error is signalled if any two of these signatures contain the
same name for a value; unqualified references should be unambiguous.

The identifier/signature-name pairs specify the other, ``closed (or named)
imports''.  In the resultant module body, these identifiers will be bound to the
structures passed as arguments to the body; an error will be signalled if any of
these identifiers are among the names in the open imports.

The imports of the module will be the given signature names in the given order.
The version information for the imports will be ``bottom'' unless some
early-evaluator has built in some assumptions about the signatures.

The program has the syntax given in R3RS and the following semantics: it is as if all of
the names appearing in DEFINEs were bound in a single LET to unspecified values,
the body of the LET being the sequence of forms resulting from turning all of
the DEFINEs into SET!s.  This expression is evaluated whenever the module body
is invoked; the value(s) of the body invocation are the value(s) of this
expression.  The lexical environment of the body expression is that of the
MODULE expression as a whole, shadowed by bindings of all of the names in the
open imports.
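
The DEFINE-to-SET! reading can be shown in ordinary Scheme.  The sketch below
is only an illustration: it assumes a module with no imports, uses #F as a
placeholder for the ``unspecified values'', and writes out by hand the
expression that a program consisting of two mutually recursive DEFINEs and a
final call would stand for.  The names EV?, OD? and EXAMPLE-BODY are
hypothetical.

```scheme
;; The <program> being modelled:
;;
;;   (define (ev? n) (if (= n 0) #t (od? (- n 1))))
;;   (define (od? n) (if (= n 0) #f (ev? (- n 1))))
;;   (ev? 10)
;;
;; Under the proposed semantics the module body behaves like this
;; procedure: one LET binding every DEFINEd name, with each DEFINE
;; turned into a SET!, evaluated afresh on every invocation.
(define (example-body)
  (let ((ev? #f)                        ; placeholder initial values
        (od? #f))
    (set! ev? (lambda (n) (if (= n 0) #t (od? (- n 1)))))
    (set! od? (lambda (n) (if (= n 0) #f (ev? (- n 1)))))
    (ev? 10)))

(example-body)
```

Because both names are bound before either SET! runs, the two procedures can
refer to each other, and each invocation of the body gets fresh bindings.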



On top of a facility like this, it would be easy to build some useful utilities.
For example, suppose that the LOAD procedure returned a list of the values of
the <command>s in a file.  Then one could imagine a different procedure, call it
FRIENDLY-LOAD, that worked as follows:

-- For each value returned by LOAD, test whether or not it is a module; if not,
discard it.

-- For each module, acquire structures for each of its imports and apply the
body of the module to these structures.  FRIENDLY-LOAD maintains a table mapping
signature names and versions to structures.  For each signature named among the
imports for which there is no entry in the table, it creates a fresh structure,
giving no values for bindings, and adds it to the table.

-- If the values returned from the module body invocation are structures, these
are taken to be the ``exports'' of the module.  The signature-names of these
structures are looked up in the table as before, creating fresh structures if
necessary.  Finally, for those names whose bindings in the export structure are
not undefined, the values are copied into the corresponding bindings in the
table structure.

In this way, FRIENDLY-LOAD acts as a system-wide matchmaker, hooking together
the clients and suppliers of interfaces.  Of course, no one would have to use
this facility; one could put an entirely different one together out of the
primitives provided.
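
The steps above can be sketched as follows.  This is a rough illustration
under several assumptions rather than a definitive implementation: the list
LOAD would return is passed in directly, structures create their bindings on
demand (so no signature registry is needed), every version is the bottom
version, represented as the empty list, and the table is an association list.
All procedure names are hypothetical; ERROR and CALL-WITH-VALUES are assumed
to be supplied by the implementation, since R3RS specifies neither.

```scheme
(define undefined (list 'undefined))            ; unique "no value" marker

;; Bindings: one-slot vectors that may be set only while undefined.
(define (make-binding) (vector undefined))
(define (binding-undefined? b) (eq? (vector-ref b 0) undefined))
(define (binding-value b) (vector-ref b 0))
(define (binding-set! b v)
  (if (binding-undefined? b)
      (vector-set! b 0 v)
      (error "Binding already has a value")))

;; Structures: #(structure sig-name version alist).
(define (make-structure sig-name version)
  (vector 'structure sig-name version '()))
(define (structure? x)
  (and (vector? x) (= (vector-length x) 4) (eq? (vector-ref x 0) 'structure)))
(define (structure-key s) (cons (vector-ref s 1) (vector-ref s 2)))
(define (structure-names s) (map car (vector-ref s 3)))
(define (structure-binding s name)
  (let ((entry (assq name (vector-ref s 3))))
    (if entry
        (cdr entry)
        (let ((b (make-binding)))
          (vector-set! s 3 (cons (cons name b) (vector-ref s 3)))
          b))))

;; Modules: #(module imports body); imports are (sig-name . version).
(define (make-module imports body) (vector 'module imports body))
(define (module? x)
  (and (vector? x) (= (vector-length x) 3) (eq? (vector-ref x 0) 'module)))

(define *table* '())                    ; ((sig-name . version) . structure)
(define (table-structure key)
  (let ((entry (assoc key *table*)))
    (if entry
        (cdr entry)
        (let ((s (make-structure (car key) (cdr key))))
          (set! *table* (cons (cons key s) *table*))
          s))))

;; Copy every defined binding of EXPORT into the table's structure.
(define (copy-exports! export)
  (let ((target (table-structure (structure-key export))))
    (if (not (eq? target export))
        (for-each
         (lambda (name)
           (let ((b (structure-binding export name)))
             (if (not (binding-undefined? b))
                 (binding-set! (structure-binding target name)
                               (binding-value b)))))
         (structure-names export)))))

(define (friendly-load loaded-values)
  (for-each
   (lambda (m)
     (if (module? m)                    ; non-modules are discarded
         (call-with-values
             (lambda ()
               (apply (vector-ref m 2) (map table-structure (vector-ref m 1))))
           (lambda results
             (for-each (lambda (r) (if (structure? r) (copy-exports! r)))
                       results)))))
   loaded-values))

;; Usage: the client is loaded before the provider of GREETER.
(define client-result #f)
(define client
  (make-module (list (cons 'greeter '()))
    (lambda (g)
      (set! client-result
            (lambda ()
              ((binding-value (structure-binding g 'greet)) "world")))
      'no-exports)))
(define provider
  (make-module '()
    (lambda ()
      (let ((s (make-structure 'greeter '())))
        (binding-set! (structure-binding s 'greet)
                      (lambda (who) (string-append "hello, " who)))
        s))))
(friendly-load (list 42 client provider))
(client-result)
```

The client's body runs first and captures a structure whose GREET binding is
still undefined; the provider's export is copied into that same structure
afterwards, so the call at the end finds the value.  A second provider for
GREET would be rejected by BINDING-SET!, in keeping with the set-once rule.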



I've typed enough here.  Comments?

	Pavel