[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Numbers Committee Report





Sorry it took so long, but I was afraid to do a half-assed job.  The following
report raises 10^10 issues which should be thought about by all of us.  I 
believe (after a bit of study) that other language people have been 
inadequately careful about the status of numbers in their languages.  Numbers
are probably the most complicated data structures we ever deal with, so we must
try to do a good job on them.

The report is entirely the responsibility of GJS (so throw stones at him), 
though Bill Rozas, Jon Rees, Guy Steele, Chris Hanson, Hal Abelson, Yekta
Gursel, and Chris Lindblad have made significant contributions -- most of
their stones have been thrown and absorbed (some hurt!).


                            Scheme Numbers


Intent

    Numerical computation has traditionally been neglected by the Lisp
community.  Until Common Lisp there has been no carefully thought out
strategy for organizing numerical computation; and with the exception
of the MacLisp system there has been little effort to efficiently
execute numerical code.  We applaud the excellent work of the Common
Lisp committee and we accept many of their recommendations.  In some
ways we simplify and generalize their proposals in a manner consistent
with the purposes of Scheme.

Scheme numerical operations treat numbers as abstract data, as
independent of the machine representation as is possible.  Thus, the
casual user should be able to write simple programs without having to
know that the implementation may use fixed-point, floating-point, and
perhaps other representations for his data.  Unfortunately, this
illusion of uniformity can only be sustained approximately -- the
implementation of numbers will leak out of our abstraction whenever
the user must be in control of precision, or accuracy, or when he must
construct especially efficient computations.  Thus we also must
provide escape mechanisms so that a sophisticated programmer can
exercise more control of the execution of his code and the
representation of his data when it is essential to do so.

We separate out several apparently related issues of representation --
the abstract numbers, their machine representation, and the
input/output formats of the symbolic expressions which denote
numerical constants.  We will use mathematical words such as NUMBER,
COMPLEX, REAL, RATIONAL, and INTEGER for properties of the abstract
numbers, data-structure names such as FIXNUM, BIGNUM, RATNUM, and
FLONUM for machine representations, and names like INT, FIX, FLO, SCI,
RAT, POLAR, and RECT for input/output formats.


Numbers

    A Scheme system provides data of type NUMBER, which is the most
general numerical type supported by that system.  NUMBER is likely to
be a complicated union type implemented in terms of FIXNUMs, BIGNUMS,
FLONUMS, etc. but this should not be apparent to a naive user.  What
the user sees is that the obvious operations on numbers produce the
mathematically expected results, within the limits of the
implementation.  Thus if the user divides 3 by 2, he should get
something like 1.5 (or, the exact fraction 3/2).  If he adds that
result to itself, and if the implementation can swing it, he should
get an exact 3.

Mathematically, numbers may be arranged into a tower of subtypes with
natural projections and injections relating adjacent members of the
tower:

                                NUMBER
                                COMPLEX
                                REAL
                                RATIONAL
                                INTEGER

We impose a uniform rule of downward coercion -- a number of one type
is also of a lower type if the injection (up) of the projection (down)
of a number leaves the number unchanged.  Since this tower is a real
mathematical entity, Scheme provides predicates and procedures that
access the tower.

Not all Scheme implementations must provide the whole tower, but they
are required to implement a coherent subset, consistent with the
purposes of Scheme.


Exactness

    Numbers are either EXACT or INEXACT.  A number is exact if it was
derived from EXACT numbers using only EXACT operations.  A number is
INEXACT if it models a quantity known only approximately, if it was
derived using INEXACT ingredients, or if it was derived using INEXACT
operations.  Thus INEXACTness is a contagious property of a number.
Some operations, such as the square root (of non-square numbers) must
be INEXACT because of the finite precision of our representations.
Other operations are inexact because of implementation requirements.
We emphasize that exactness is independent of the position of the
number on the tower.  It is perfectly possible to have an INEXACT
INTEGER or an EXACT REAL; 355/113 may be an EXACT RATIONAL or it may
be an INEXACT RATIONAL approximation to pi, depending on the
application.

Operationally, it is the system's responsibility to combine EXACT
numbers using exact methods, such as infinite precision integer and
rational arithmetic, where possible.  An implementation may not be
able to do this (if it does not use infinite precision integers and
rationals), but if a number becomes inexact for implementation reasons
there is probably an important error condition, such as integer
overflow, to be reported.  Arithmetic on INEXACT numbers is not so
constrained.  The system may use floating point and other flaky
representation strategies for INEXACT numbers.  This is not to say
that implementors need not use the best known algorithms for INEXACT
computations -- only that high-quality approximate methods are
allowed.  In a system which cannot explicitly distinguish exact from inexact numbers the system must do its best to
maintain precision.  Scheme systems must not burden users with
numerical operations described in terms of hardware and
operating-system dependent representations such as FIXNUM and FLONUM.
These representation issues should not be germane to the user's
problems.

We highly recommend that the IEEE 32-bit and 64-bit floating-point
standards be adopted for implementations that use floating-point
internal representations.  To minimize loss of precision we adopt the
following rules: If an implementation uses several different size
floating-point formats, the results of any operation with a
floating-point result must be expressed in the largest format used to
express any of the floating-point arguments to that operation.  It is
desirable (but not required) for potentially irrational operations
such as SQRT, when applied to EXACT arguments, to produce EXACT
answers when that is possible (eg.  sqrt[4]=2, exactly).  If an EXACT
number (or an INEXACT number represented as a FIXNUM, a BIGNUM, or a
RATNUM) is operated upon so as to produce an INEXACT result (as by
SQRT), then if the result is represented as a FLONUM, then the largest
available format FLONUM must be used; but if the result is expressed
as a RATNUM then the rational approximation must have at least as much
precision as the largest available format FLONUM.  (Note that this
last rule is much stronger than that used in Common Lisp.  Consider
the result of the square root of a rational, for example.)


Numerical Operations

    Scheme provides the usual set of operations for manipulation of
numbers.  In the specification below we illustrate each built in
numerical operation with an expression using the default operator
symbol which is bound to that operation in the global environment.  In
general, numerical operations require numerical arguments.  We use the
following symbols to represent the various types of object in our
descriptions.  It is an error for an operation to be presented with an
argument that it is not specified to handle.

obj				any arbitrary object
z, z1, ... zi, ...		{complex, real, rational, integer} x,
x1, ... xi, ...		{real, rational, integer}
q, q1, ... qi, ...		{rational, integer} n, n1, ... ni, ...
{integer}

In our descriptions some arguments are required and others are
optionally allowed.  For example, numerical comparisons are allowed to
have any number of arguments, but they must have at least two.  We
indicate this by putting in explicit argument symbols for each
required argument and a "..." indicating the rest.

A.  Numerical type predicates can be applied to any kind of argument.
They return true if the object is of the named type.  In general, if a
type predicate is true of a number then all higher type predicates are
also true of that number.  Not every system supports all of these
types.  It is entirely possible to have a Scheme system which only has
INTEGERs, however, it must have all of these predicates.

(NUMBER? obj) (COMPLEX? obj) (REAL? obj) (RATIONAL? obj) (INTEGER?
obj)


B.  Numerical predicates test a number for a particular property.
They return a boolean value.

(ZERO? z) (POSITIVE? x) (NEGATIVE? x) (ODD? n) (EVEN? n)

(EXACT? z) (INEXACT? z)


C.  Numerical comparisons require all of their arguments to be
numbers.  They optionally take many arguments, as in Common Lisp, to
facilitate encoding of range checks.  Not all implementations support
this extension.  These predicates have redundant names (with and
without the terminal "?") to make all populations happy.  Warning: On
INEXACT numbers equality tests will give unreliable results; other
numerical comparisons are only heuristically useful (ask a numerical
analyst about this!)

(= z1 z2 ... zi)          all arguments are numerically equal
(=? z1 z2 ... zi)            

(< x1 x2 ... xi)          arguments are monotonically increasing
(<? x1 x2 ... xi)

(> x1 x2 ... xi)          arguments are monotonically decreasing
(>? x1 x2 ... xi)

(<= x1 x2 ... xi)         arguments are monotonically nondecreasing
(<=? x1 x2 ... xi)

(>= x1 x2 ... xi)         arguments are monotonically nonincreasing
(>=? x1 x2 ... xi)


(max x1 ... xi)
(min x1 ... xi)

D. Arithmetic operations require all arguments to be numbers.
Generalization to arbitrary numbers of arguments is implementation
dependent.

(+ z1 ... zi)            ==> z1 + ... + zi
   (+ z)                    ==> z
   (+)                      ==> 0     (the identity)

(* z1 ... zi)            ==> z1 * ... * zi
   (* z)                    ==> z
   (*)                      ==> 1     (the identity)

(- z1 ... zi)            ==> z1 - z2 - ... - zi
   (- z)                    ==> -n

(/ z1 ... zi)            ==> z1 / z2 / ... / zi
   (/ z)                    ==> 1/z

(-1+ z)                  ==> -1 + z
(1+ z)                   ==> 1 + z

(abs z)


E.  Integer division operations are only defined on integers.

(quotient n1 n2)           ==> n3
(remainder n1 n2)          ==> n4

(modulo n1 n2)             ==> n4

In general, these are intended to implement number-theoretic division,
where for positive arguments, n1 and n2

                   n1 = n3*n2 + n4 and 0 <= n4 < d.

REMAINDER and MODULO differ on negative arguments as in the Common
Lisp REM and MOD procedures -- the remainder always has the sign of
the dividend, the modulo always has the sign of the divisor:

(modulo 13 4)   ==>  1         (remainder 13 4)   ==>  1
(modulo -13 4)  ==>  3         (remainder -13 4)  ==> -1
(modulo 13 -4)  ==> -3         (remainder 13 -4)  ==>  1
(modulo -13 -4) ==> -1         (remainder -13 -4) ==> -1


F.  The following procedures are also only defined on integers.  The
result is always non-negative.

(gcd n1 ... ni)              Greatest common divisor
			       (gcd) ==> 0     (the identity)

(lcm n1 ... ni)              Least common multiple
			       (lcm) ==> 1     (the identity)

G.  Integers and rationals can be made using the following procedures.
Note, this does not make the answer EXACT -- in fact, the resulting
number is clearly INEXACT, it can be made EXACT with an exactness
coercion.

(floor x)     The largest integer not larger than x
(ceiling x)   The smallest integer not smaller than x
(truncate x)  The integer of maximal absolute value not larger than x
(round x)     The closest integer to n.  Rounds to even when halfway.

(rationalize x y)     Produces the best rational approximation to x 
                       within the tolerance specified by y.
   (rationalize x)    Produces the best rational approximation to x,
                       preserving all of the precision in its 
                       representation.

H.  In implementations which support real numbers the following are
defined to conform with the Common Lisp standard as to meaning (be
careful of the branch cuts if complex numbers are allowed).

(exp z) (log z)   (expt z1 z2) (sqrt z)

(sin z) (cos z) (tan z)   (asin z) (acos z) (atan z1 z2)           


I.  In implementations which support complex numbers we also provide

(make-rectangular x1 x2)      ==> x1 + x2i
(make-polar x3 x4)            ==> x3*e^x4i

(real-part z) (imag-part z)   (magnitude z) (angle z)


J.  Exactness coercions.

(exact->inexact z)         Makes an inexact representation of z.
                              Pretty harmless.
(inexact->exact z)         Makes an exact representation of z.
                              I hope you know what you are doing here!


K.  Type-restricted operations are useful for error checking, or as
advice to a compiler.  Scheme optionally supplies such type-restricted
operations: If op is defined on objects of numerical type, t1, and if
t2 is any type lower on the tower than t1, then (t2 op) is the
restricted operator for objects of type t2.  For example, the
following are expressions using restricted operators:

((real sqrt) 5)		; Gets the right thing.
((real sqrt) -1)	; Will signal an error.
((integer sqrt) 10)	; Will signal an error -- You probably 
                           wanted to compute (truncate ((real sqrt) 10))

For completeness, the system also supplies the most general operations

((number +) 3 4) 

The default operations, initially bound to the customary operators in
the global environment, are these most general operations.  The
type-restriction procedures REAL, INTEGER, ... are simple maps which
map the default operations to their restricted cousins.

Some systems may want to enhance this facility, to allow integer
subranges or other mathematically meaningful restrictions such as
modular arithmetic, for example:

(((integer-subrange 3 5) +) 2 3 4)   ;Will signal an error.


Numerical Input and Output

    Scheme allows all of the traditional ways of expressing numerical
constants, though only some may be supported by any particular
implementation.  These expressions are intended to be purely
notational; any kind of number may be expressed in any form that the
user deems convenient.  Of course, expressing 1/7 as a
limited-precision decimal fraction will not exactly express the
number, but this approximate expression may be just what the user
wants to see.

The expressions of numerical constants can be specified using formats.
The system provides a procedure, NUMBER->STRING, which takes a number
and a format and which produces a string which is the printed
expression of the given number in the given format.

(number->string <number> <format>)   ==>  <string>

This procedure will mostly be used by sophisticated users and in
system programs.  In general, a naive user will need to know nothing
about the formats because the system printer will have reasonable
default formats for all types of NUMBERs.  The system reader will
construct reasonable default numerical types for numbers expressed in
each of the formats it recognizes.  If a user needs control of the
coercion from strings to numbers he will use STRING->NUMBER, which
takes a string, an exactness, and a radix and which produces a number
of the maximally precise applicable type expressed by the given
string.

(string->number <string> E|I B|O|D|X)  ==>  <number>

Formats may have parameters.  For example, the "(SCI 5 2)" format
specifies that a number is to be expressed in FORTRAN scientific
format with 5 significant places and two places after the radix point.

The following are all numerical constant expressions.  The comment
shows the format that was used to produce the expression:

123  +123  -123                    ; (INT) format
12345678901234567890123456789      ; A big one
355/113  -355/113  +355/113	   ; (RAT) format
+123.45  -123.45                   ; (FIX 2) format
3.14159265358979		   ; (FIX 14) format
3.14159265358979                   ; (FLO 15) format
123.450                            ; (FLO 6) format
-123.45E-1                         ; (SCI 5 2) format
123E3  123E-3  -123E-3             ; (SCI 3 0) format
-1+2i                              ; (RECT (INT) (INT)) format
1.2@1.570796                       ; (POLAR (FIX 1) (FLO 7)) format

A numerical constant may be specified with an explicit radix by a
prefix.  The prefixes are: #B (binary), #O (octal), #D (decimal), #X
(hex).  A format may specify that a number be expressed in a
particular radix.  The radix prefix may be suppressed.  For example,
one may express a complex number in polar form with the magnitude in
octal and the angle in decimal as follows:

#o1.2@#d1.570796327  ;(POLAR (FLO 2 (RADIX O)) (FLO (RADIX D))) format
#o1.2@1.570796327    ;(POLAR (FLO 2 (RADIX O)) (FLO (RADIX D S)) format

A numerical constant may be specified to be either EXACT or INEXACT by
a prefix.  The prefixes are: #I (inexact), #E (exact).  An exactness
prefix may appear before or after any radix prefix that is used.  A
format may specify that a number be expressed with an explicit
exactness prefix, or it may force the exactness to be suppressed.  For
example, the following are ways to output an inexact value for pi:

#I355/113           ; (RAT (EXACTNESS))
355/113             ; (RAT (EXACTNESS S))
#I3.1416            ; (FIX 4 (EXACTNESS))

An attempt to produce more digits than are available in the internal
machine representation of a number will be marked with a "#" filling
the extra digits.  This is not a statement that the implementation
knows or keeps track of the significance of a number, just that the
machine will flag attempts to produce 20 digits of a number which has
only 15 digits of machine representation:

3.14158265358979#####       ; (FLO 20 (EXACTNESS S))

In systems with both single and double precision FLONUMs one may want
to specify which size we want to use to internally represent a
constant.  For example, we may want a constant whidh is pi rounded to

the single precision length, or we might want a long number which has
the value 6/10.  In either case, we are specifying an explicit way to
represent an INEXACT number.  For this purpose, we allow one to
express a number with a prefix which indicates short or long FLONUM
representation:

#S3.14159265358979          ; Round to short -- 3.141593
#L.6                        ; Extend to long -- .600000000000000



Details of formats

    The format of a number is notated as a list beginning with a
format descriptor, such as SCI.  Following the descriptor are
parameters used by that descriptor, such as the number of significant
digits to be used.  Parameters which are omitted are defaulted.  Next,
one may specify modifiers, such as RADIX or EXACTNESS, which
themselves may be parameterized.  The format descriptors are:

(INT)
   Express as an integer.  The radix point is implicit.  If there are
   not enough significant places, as in trying to express 6.0238E23
   (internally represented as a 7 digit FLONUM) as an integer we would
   get "6023800################".

(RAT n)
   Express as a rational fraction.  n specifies the largest
   denominator to be used in constructing a rational approximation to
   the number being expressed.  If n is omitted it defaults to
   infinity. 

(FIX n)
   Express with a fixed radix point.  n specifies the number of
   places to right of the radix point.  n defaults to the size of a
   single-precision FLONUM.  If there are not enough significant
   places, as in trying to express 6.0238E23 (internally represented
   as a 7 digit FLONUM) as a (FIX 2) we would get
   "6023800################.##".

(FLO n) 
   Express with a floating radix point.  n specifies the total number
   of places to be displayed.  n defaults to the size of a single-
   precision FLONUM.  If the number is out of range, it is converted
   to (SCI).  (FLO H) allows the system to heuristically express a FLO
   for human consumption (as in MacLisp printer).

(SCI n m)
   Express in exponential notation.  n specifies the total number of
   places to be displayed.  n defaults to the size of a single-
   precision FLONUM.  m specifies the number of places to the right of
   the radix point.  m defaults to n-1.  (SCI H) does heuristic
   expression. 

(RECT r i)
   Express as a rectangular form complex number.  r and i are formats
   for the real and imaginary parts respectively.  They default to
   (HEUR).

(POLAR m a)
   Express as a polar form complex number.  m and a are formats for
   the magnitude and angle respectively.  m and a default to (HEUR).

(HEUR)
   Express heuristically, as in the MacLisp printer (see Steele),
   using the minimum number of digits required to get an expression
   which when coerced back to a number produces the original machine
   representation.  EXACT numbers are expressed as (INT) or (RAT).
   INEXACT numbers are expressed as (FLO H) or (SCI H) depending on
   their range.  Complex numbers are expressed in (RECT).  This is the
   normal default of the system printer.


The following modifiers may be added to a numerical format
specification:

(EXACTNESS s)
   This controls the expression of the exactness label of a number.  s
   indicates whether the exactness is to be E (expressed) or S
   (suppressed).  s defaults to E.  If no exactness modifier is
   specified for a format the exactness is by default not expressed.

(RADIX r s)
   This forces a number to be expressed in the radix r.  r may be B
   (binary), O (octal), D (decimal), or X (hex).  s indicates whether
   the radix label is to be E (expressed) or S (suppressed).  s
   defaults to E.  If no radix modifier is specified the default is 
   decimal and the label is suppressed.
	
-------