Go to the first, previous, next, last section, table of contents.

Overview

This manual is a detailed description of the MIT Scheme runtime system. It is intended to be a reference document for programmers. It does not describe how to run Scheme or how to interact with it -- that is the subject of the MIT Scheme User's Manual.

This chapter summarizes the semantics of Scheme, briefly describes the MIT Scheme programming environment, and explains the syntactic and lexical conventions of the language. Subsequent chapters describe special forms, numerous data abstractions, and facilities for input and output.

Throughout this manual, we will make frequent references to standard Scheme, which is the language defined by the document Revised^4 Report on the Algorithmic Language Scheme, by William Clinger, Jonathan Rees, et al., or by IEEE Std. 1178-1990, IEEE Standard for the Scheme Programming Language (in fact, several parts of this document are copied from the Revised Report). MIT Scheme is an extension of standard Scheme.

These are the significant semantic characteristics of the Scheme language:

Variables are statically scoped: Scheme is a statically scoped programming language, which means that each use of a variable is associated with a lexically apparent binding of that variable. Algol is another statically scoped language.
Types are latent: Scheme has latent types as opposed to manifest types, which means that Scheme associates types with values (or objects) rather than with variables. Other languages with latent types (also referred to as weakly typed or dynamically typed languages) include APL, Snobol, and other dialects of Lisp. Languages with manifest types (sometimes referred to as strongly typed or statically typed languages) include Algol 60, Pascal, and C.
Objects have unlimited extent: All objects created during a Scheme computation, including procedures and continuations, have unlimited extent; no Scheme object is ever destroyed. The system doesn't run out of memory because the garbage collector reclaims the storage occupied by an object when the object cannot possibly be needed by a future computation. Other languages in which most objects have unlimited extent include APL and other Lisp dialects.
Proper tail recursion: Scheme is properly tail-recursive, which means that iterative computation can occur in constant space, even if the iterative computation is described by a syntactically recursive procedure. With a tail-recursive implementation, you can express iteration using the ordinary procedure-call mechanics; special iteration expressions are provided only for syntactic convenience.
Procedures are objects: Scheme procedures are objects, which means that you can create them dynamically, store them in data structures, return them as the results of other procedures, and so on. Other languages with such procedure objects include Common Lisp and ML.
Continuations are explicit: In most other languages, continuations operate behind the scenes. In Scheme, continuations are objects; you can use continuations for implementing a variety of advanced control constructs, including non-local exits, backtracking, and coroutines.
Arguments are passed by value: Arguments to Scheme procedures are passed by value, which means that Scheme evaluates the argument expressions before the procedure gains control, whether or not the procedure needs the result of the evaluations. ML, C, and APL are three other languages that pass arguments by value. In languages such as SASL and Algol 60, argument expressions are not evaluated unless the values are needed by the procedure.

Scheme uses a parenthesized-list Polish notation to describe programs and (other) data. The syntax of Scheme, like that of most Lisp dialects, provides for great expressive power, largely due to its simplicity. An important consequence of this simplicity is the susceptibility of Scheme programs and data to uniform treatment by other Scheme programs. As with other Lisp dialects, the read primitive parses its input; that is, it performs syntactic as well as lexical decomposition of what it reads.

Notational Conventions

This section details the notational conventions used throughout the rest of this document.

Errors

When this manual uses the phrase "an error will be signalled," it means that Scheme will call error, which normally halts execution of the program and prints an error message.

When this manual uses the phrase "it is an error," it means that the specified action is not valid in Scheme, but the system may or may not signal the error. When this manual says that something "must be," it means that violating the requirement is an error.

Examples

This manual gives many examples showing the evaluation of expressions. The examples have a common format that shows the expression being evaluated on the left hand side, an "arrow" in the middle, and the value of the expression written on the right. For example:

(+ 1 2)          =>  3

Sometimes the arrow and value will be moved under the expression, due to lack of space. Occasionally we will not care what the value is, in which case both the arrow and the value are omitted.

If an example shows an evaluation that results in an error, an error message is shown, prefaced by `error-->':

(+ 1 'foo)                      error--> Illegal datum

An example that shows printed output marks it with `-|':

(begin (write 'foo) 'bar)
     -| foo
     => bar

When this manual indicates that the value returned by some expression is unspecified, it means that the expression will evaluate to some object without signalling an error, but that programs should not depend on the value in any way.

Entry Format

Each description of an MIT Scheme variable, special form, or procedure begins with one or more header lines in this format:

@deffnexample category template

where category specifies the kind of item ("variable", "special form", or "procedure"), and how the item conforms to standard Scheme, as follows:

category: Category, with no extra marking, indicates that the item is described in the Revised^4 Report on the Algorithmic Language Scheme.
category+: A plus sign after category indicates that the item is an MIT Scheme extension.

The form of template is interpreted depending on category.

Variable

Template consists of the variable's name.

Special Form

Template starts with the syntactic keyword of the special form, followed by a description of the special form's syntax. The description is written using the following conventions. Named components are italicized in the printed manual, and uppercase in the Info file. "Noise" keywords, such as the else keyword in the cond special form, are set in a fixed width font in the printed manual; in the Info file they are not distinguished. Parentheses indicate themselves. A horizontal ellipsis (...) is describes repeated components. Specifically,

thing ...

indicates zero or more occurrences of thing, while

thing thing ...

indicates one or more occurrences of thing. Brackets, [ ], enclose optional components. Several special forms (e.g. lambda) have an internal component consisting of a series of expressions; usually these expressions are evaluated sequentially under conditions that are specified in the description of the special form. This sequence of expressions is commonly referred to as the body of the special form.

Procedure

Template starts with the name of the variable to which the procedure is bound, followed by a description of the procedure's arguments. The arguments are described using "lambda list" notation (see section Lambda Expressions), except that brackets are used to denote optional arguments, and ellipses are used to denote "rest" arguments. The names of the procedure's arguments are italicized in the printed manual, and uppercase in the Info file. When an argument names a Scheme data type, it indicates that the argument must be that type of data object. For example, @deffnexample procedure cdr pair indicates that the standard Scheme procedure cdr takes one argument, which must be a pair. Many procedures signal an error when an argument is of the wrong type; usually this error is a condition of type condition-type:wrong-type-argument. In addition to the standard data-type names (pair, list, boolean, string, etc.), the following names as arguments also imply type restrictions:

object: any object
thunk: a procedure of no arguments
x, y: real numbers
q, n: integers
k: an exact non-negative integer

Some examples:

@deffnexample procedure list object ...

indicates that the standard Scheme procedure list takes zero or more arguments, each of which may be any Scheme object.

@deffnexample procedure write-char char [output-port]

indicates that the standard Scheme procedure write-char must be called with a character, char, and may also be called with a character and an output port.

Scheme Concepts

Variable Bindings

Any identifier that is not a syntactic keyword may be used as a variable (see section Identifiers). A variable may name a location where a value can be stored. A variable that does so is said to be bound to the location. The value stored in the location to which a variable is bound is called the variable's value. (The variable is sometimes said to name the value or to be bound to the value.)

A variable may be bound but still not have a value; such a variable is said to be unassigned. Referencing an unassigned variable is an error. When this error is signalled, it is a condition of type condition-type:unassigned-variable; sometimes the compiler does not generate code to signal the error. Unassigned variables are useful only in combination with side effects (see section Assignments).

Environment Concepts

An environment is a set of variable bindings. If an environment has no binding for a variable, that variable is said to be unbound in that environment. Referencing an unbound variable signals a condition of type condition-type:unbound-variable.

A new environment can be created by extending an existing environment with a set of new bindings. Note that "extending an environment" does not modify the environment; rather, it creates a new environment that contains the new bindings and the old ones. The new bindings shadow the old ones; that is, if an environment that contains a binding for x is extended with a new binding for x, then only the new binding is seen when x is looked up in the extended environment. Sometimes we say that the original environment is the parent of the new one, or that the new environment is a child of the old one, or that the new environment inherits the bindings in the old one.

Procedure calls extend an environment, as do let, let*, letrec, and do expressions. Internal definitions (see section Internal Definitions) also extend an environment. (Actually, all the constructs that extend environments can be expressed in terms of procedure calls, so there is really just one fundamental mechanism for environment extension.) A top-level definition (see section Top-Level Definitions) may add a binding to an existing environment.

Initial and Current Environments

MIT Scheme provides an initial environment that contains all of the variable bindings described in this manual. Most environments are ultimately extensions of this initial environment. In Scheme, the environment in which your programs execute is actually a child (extension) of the environment containing the system's bindings. Thus, system names are visible to your programs, but your names do not interfere with system programs.

The environment in effect at some point in a program is called the current environment at that point. In particular, every REP loop has a current environment. (REP stands for "read-eval-print"; the REP loop is the Scheme program that reads your input, evaluates it, and prints the result.) The environment of the top-level REP loop (the one you are in when Scheme starts up) starts as user-initial-environment, although it can be changed by the ge procedure. When a new REP loop is created, its environment is determined by the program that creates it.

Static Scoping

Scheme is a statically scoped language with block structure. In this respect, it is like Algol and Pascal, and unlike most other dialects of Lisp except for Common Lisp.

The fact that Scheme is statically scoped (rather than dynamically bound) means that the environment that is extended (and becomes current) when a procedure is called is the environment in which the procedure was created (i.e. in which the procedure's defining lambda expression was evaluated), not the environment in which the procedure is called. Because all the other Scheme binding expressions can be expressed in terms of procedures, this determines how all bindings behave.

Consider the following definitions, made at the top-level REP loop (in the initial environment):

(define x 1)
(define (f x) (g 2))
(define (g y) (+ x y))
(f 5)                                       =>  3 ; not 7

Here f and g are bound to procedures created in the initial environment. Because Scheme is statically scoped, the call to g from f extends the initial environment (the one in which g was created) with a binding of y to 2. In this extended environment, y is 2 and x is 1. (In a dynamically bound Lisp, the call to g would extend the environment in effect during the call to f, in which x is bound to 5 by the call to f, and the answer would be 7.)

Note that with static scoping, you can tell what binding a variable reference refers to just from looking at the text of the program; the referenced binding cannot depend on how the program is used. That is, the nesting of environments (their parent-child relationship) corresponds to the nesting of binding expressions in program text. (Because of this connection to the text of the program, static scoping is also called lexical scoping.) For each place where a variable is bound in a program there is a corresponding region of the program text within which the binding is effective. For example, the region of a binding established by a lambda expression is the entire body of the lambda expression. The documentation of each binding expression explains what the region of the bindings it makes is. A use of a variable (that is, a reference to or assignment of a variable) refers to the innermost binding of that variable whose region contains the variable use. If there is no such region, the use refers to the binding of the variable in the global environment (which is an ancestor of all other environments, and can be thought of as a region in which all your programs are contained).

True and False

In Scheme, the boolean values true and false are denoted by #t and #f. However, any Scheme value can be treated as a boolean for the purpose of a conditional test. This manual uses the word true to refer to any Scheme value that counts as true, and the word false to refer to any Scheme value that counts as false. In conditional tests, all values count as true except for #f, which counts as false (see section Conditionals).

Implementation note: In MIT Scheme, #f and the empty list are the same object, and the printed representation of #f is always `()'. As this contradicts the Scheme standard, MIT Scheme will soon be changed to make #f and the empty list different objects.

External Representations

An important concept in Scheme is that of the external representation of an object as a sequence of characters. For example, an external representation of the integer 28 is the sequence of characters `28', and an external representation of a list consisting of the integers 8 and 13 is the sequence of characters `(8 13)'.

The external representation of an object is not necessarily unique. The integer 28 also has representations `#e28.000' and `#x1c', and the list in the previous paragraph also has the representations `( 08 13 )' and `(8 . (13 . ( )))'.

Many objects have standard external representations, but some, such as procedures and circular data structures, do not have standard representations (although particular implementations may define representations for them).

An external representation may be written in a program to obtain the corresponding object (see section Quoting).

External representations can also be used for input and output. The procedure read parses external representations, and the procedure write generates them. Together, they provide an elegant and powerful input/output facility.

Note that the sequence of characters `(+ 2 6)' is not an external representation of the integer 8, even though it is an expression that evaluates to the integer 8; rather, it is an external representation of a three-element list, the elements of which are the symbol + and the integers 2 and 6. Scheme's syntax has the property that any sequence of characters that is an expression is also the external representation of some object. This can lead to confusion, since it may not be obvious out of context whether a given sequence of characters is intended to denote data or program, but it is also a source of power, since it facilitates writing programs such as interpreters and compilers that treat programs as data or data as programs.

Disjointness of Types

Every object satisfies at most one of the following predicates (but see section True and False, for an exception):

bit-string?     environment?    port?           symbol?
boolean?        null?           procedure?      vector?
cell?           number?         promise?        weak-pair?
char?           pair?           string?
condition?

Storage Model

This section describes a model that can be used to understand Scheme's use of storage.

Variables and objects such as pairs, vectors, and strings implicitly denote locations or sequences of locations. A string, for example, denotes as many locations as there are characters in the string. (These locations need not correspond to a full machine word.) A new value may be stored into one of these locations using the string-set! procedure, but the string continues to denote the same locations as before. An object fetched from a location, by a variable reference or by a procedure such as car, vector-ref, or string-ref, is equivalent in the sense of eqv? to the object last stored in the location before the fetch.

Every location is marked to show whether it is in use. No variable or object ever refers to a location that is not in use. Whenever this document speaks of storage being allocated for a variable or object, what is meant is that an appropriate number of locations are chosen from the set of locations that are not in use, and the chosen locations are marked to indicate that they are now in use before the variable or object is made to denote them.

In many systems it is desirable for constants (i.e. the values of literal expressions) to reside in read-only memory. To express this, it is convenient to imagine that every object that denotes locations is associated with a flag telling whether that object is mutable or immutable. The constants and the strings returned by symbol->string are then the immutable objects, while all objects created by other procedures are mutable. It is an error to attempt to store a new value into a location that is denoted by an immutable object. Note that the MIT Scheme compiler takes advantage of this property to share constants, but that these constants are not immutable. Instead, two constants that are equal? may be eq? in compiled code.

Lexical Conventions

This section describes Scheme's lexical conventions.

Whitespace

Whitespace characters are spaces, newlines, tabs, and page breaks. Whitespace is used to improve the readability of your programs and to separate tokens from each other, when necessary. (A token is an indivisible lexical unit such as an identifier or number.) Whitespace is otherwise insignificant. Whitespace may occur between any two tokens, but not within a token. Whitespace may also occur inside a string, where it is significant.

Delimiters

All whitespace characters are delimiters. In addition, the following characters act as delimiters:

(  )  ;  "  '  `  |

Finally, these next characters act as delimiters, despite the fact that Scheme does not define any special meaning for them:

[  ]  {  }

For example, if the value of the variable name is "max":

(list"Hi"name(+ 1 2))                   =>  ("Hi" "max" 3)

Identifiers

An identifier is a sequence of one or more non-delimiter characters. Identifiers are used in several ways in Scheme programs:

Certain identifiers are reserved for use as syntactic keywords; they should not be used as variables (for a list of the initial syntactic keywords, see section Special Form Syntax).
Any identifier that is not a syntactic keyword can be used as a variable.
When an identifier appears as a literal or within a literal, it denotes a symbol.

Scheme accepts most of the identifiers that other programming languages allow. MIT Scheme allows all of the identifiers that standard Scheme does, plus many more.

MIT Scheme defines a potential identifier to be a sequence of non-delimiter characters that does not begin with either of the characters `#' or `,'. Any such sequence of characters that is not a syntactically valid number (see section Numbers) is considered to be a valid identifier. Note that, although it is legal for `#' and `,' to appear in an identifier (other than in the first character position), it is poor programming practice.

Here are some examples of identifiers:

lambda             q
list->vector       soup
+                  V17a
<=?                a34kTMNs
the-word-recursion-has-many-meanings

Uppercase and Lowercase

Scheme doesn't distinguish uppercase and lowercase forms of a letter except within character and string constants; in other words, Scheme is case-insensitive. For example, `Foo' is the same identifier as `FOO', and `#x1AB' is the same number as `#X1ab'. But `#\a' and `#\A' are different characters.

Naming Conventions

A predicate is a procedure that always returns a boolean value (#t or #f). By convention, predicates usually have names that end in `?'.

A mutation procedure is a procedure that alters a data structure. By convention, mutation procedures usually have names that end in `!'.

Comments

The beginning of a comment is indicated with a semicolon (;). Scheme ignores everything on a line in which a semicolon appears, from the semicolon until the end of the line. The entire comment, including the newline character that terminates it, is treated as whitespace.

An alternative form of comment (sometimes called an extended comment) begins with the characters `#|' and ends with the characters `|#'. This alternative form is an MIT Scheme extension. As with ordinary comments, all of the characters in an extended comment, including the leading `#|' and trailing `|#', are treated as whitespace. Comments of this form may extend over multiple lines, and additionally may be nested (unlike the comments of the programming language C, which have a similar syntax).

;;; This is a comment about the FACT procedure.  Scheme
;;; ignores all of this comment.  The FACT procedure computes
;;; the factorial of a non-negative integer.

#|
This is an extended comment.
Such comments are useful for commenting out code fragments.
|#

(define fact
  (lambda (n)
    (if (= n 0)                      ;This is another comment:
        1                            ;Base case: return 1
        (* n (fact (- n 1))))))

Additional Notations

The following list describes additional notations used in Scheme. See section Numbers, for a description of the notations used for numbers.

+ - .: The plus sign, minus sign, and period are used in numbers, and may also occur in an identifier. A delimited period (not occurring within a number or identifier) is used in the notation for pairs and to indicate a "rest" parameter in a formal parameter list (see section Lambda Expressions).
( ): Parentheses are used for grouping and to notate lists (see section Lists).
": The double quote delimits strings (see section Strings).
\: The backslash is used in the syntax for character constants (see section Characters) and as an escape character within string constants (see section Strings).
;: The semicolon starts a comment.
': The single quote indicates literal data; it suppresses evaluation (see section Quoting).
`: The backquote indicates almost-constant data (see section Quoting).
,: The comma is used in conjunction with the backquote (see section Quoting).
,@: A comma followed by an at-sign is used in conjunction with the backquote (see section Quoting).
#: The sharp (or pound) sign has different uses, depending on the character that immediately follows it:
#t #f: These character sequences denote the boolean constants (see section Booleans).
#\: This character sequence introduces a character constant (see section Characters).
#(: This character sequence introduces a vector constant (see section Vectors). A close parenthesis, `)', terminates a vector constant.
#e #i #b #o #d #l #s #x: These character sequences are used in the notation for numbers (see section Numbers).
#|: This character sequence introduces an extended comment. The comment is terminated by the sequence `|#'. This notation is an MIT Scheme extension.
#!: This character sequence is used to denote a small set of named constants. Currently there are only two of these, #!optional and #!rest, both of which are used in the lambda special form to mark certain parameters as being "optional" or "rest" parameters. This notation is an MIT Scheme extension.
#*: This character sequence introduces a bit string (see section Bit Strings). This notation is an MIT Scheme extension.
#[: This character sequence is used to denote objects that do not have a readable external representation (see section Custom Output). A close bracket, `]', terminates the object's notation. This notation is an MIT Scheme extension.
#@: This character sequence is a convenient shorthand used to refer to objects by their hash number (see section Custom Output). This notation is an MIT Scheme extension.
#=
##: These character sequences introduce a notation used to show circular structures in printed output, or to denote them in input. The notation works much like that in Common Lisp, and is an MIT Scheme extension.

Expressions

A Scheme expression is a construct that returns a value. An expression may be a literal, a variable reference, a special form, or a procedure call.

Literal Expressions

Literal constants may be written by using an external representation of the data. In general, the external representation must be quoted (see section Quoting); but some external representations can be used without quotation.

"abc"                                   =>  "abc"
145932                                  =>  145932
#t                                      =>  #t
#\a                                     =>  #\a

The external representation of numeric constants, string constants, character constants, and boolean constants evaluate to the constants themselves. Symbols, pairs, lists, and vectors require quoting.

Variable References

An expression consisting of an identifier (see section Identifiers) is a variable reference; the identifier is the name of the variable being referenced. The value of the variable reference is the value stored in the location to which the variable is bound. An error is signalled if the referenced variable is unbound or unassigned.

(define x 28)
x                                       =>  28

Special Form Syntax

(keyword component ...)

A parenthesized expression that starts with a syntactic keyword is a special form. Each special form has its own syntax, which is described later in the manual. The following list contains all of the syntactic keywords that are defined when MIT Scheme is initialized:

access                  define-syntax           macro
and                     delay                   make-environment
begin                   do                      named-lambda
bkpt                    fluid-let               or
case                    if                      quasiquote
cond                    in-package              quote
cons-stream             lambda                  scode-quote
declare                 let                     sequence
default-object?         let*                    set!
define                  let-syntax              the-environment
define-integrable       letrec                  unassigned?
define-macro            local-declare           using-syntax
define-structure

Procedure Call Syntax

(operator operand ...)

A procedure call is written by simply enclosing in parentheses expressions for the procedure to be called (the operator) and the arguments to be passed to it (the operands). The operator and operand expressions are evaluated and the resulting procedure is passed the resulting arguments. See section Lambda Expressions, for a more complete description of this.

Another name for the procedure call expression is combination. This word is more specific in that it always refers to the expression; "procedure call" sometimes refers to the process of calling a procedure.

Unlike some other dialects of Lisp, Scheme always evaluates the operator expression and the operand expressions with the same evaluation rules, and the order of evaluation is unspecified.

(+ 3 4)                                 =>  7
((if #f = *) 3 4)                       =>  12

A number of procedures are available as the values of variables in the initial environment; for example, the addition and multiplication procedures in the above examples are the values of the variables + and *. New procedures are created by evaluating lambda expressions.

If the operator is a syntactic keyword, then the expression is not treated as a procedure call: it is a special form. Thus you should not use syntactic keywords as procedure names. If you were to bind one of these keywords to a procedure, you would have to use apply to call the procedure. MIT Scheme signals an error when such a binding is attempted.

Go to the first, previous, next, last section, table of contents.