[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Topes - a proposal

The recent discussion in comp.lang.scheme about the special-ness of set!
and the like made me decide to propose an old idea of mine.  True to the
"not piling feature on top of feature" principle, I introduce the orthogonal
notion of "topes".

In short, a tope is a reified mutable location.

Adding topes to Scheme, together with call-by-reference, has the following

- existing code remains valid (this is very important to me!):
  Someone who doesn't use the new constructs need not know they are there.
- simpler semantics
  - substitution semantics hold
  - fewer special forms (set! becomes a procedure)
  - fewer primitive procedures (set-car!, set-cdr!, vector-set!
    can be defined in terms of set!)
- more self-documenting code (if something can change, the code tells so)
- more efficient, since many things will be known to be constant:
  - more compile time evaluation is possible
  - the garbage collector can cdr-code and perform other tricks
- more errors caught: it is impossible to change something
  that was not designed to change
- richer system
  - constants
  - read-only objects
  - full setf functionality for free: "(set! (car a) 4)" works as expected
  - closures can share a box without sharing context (and e.g. trade boxes)
  - indirections (nice for implementing e.g. Prolog)
- less need for
  - user macros (e.g. pop, push can be procedures)
  - returning cells (as in member, assoc, last-pair, ...)

Since I am not very good in explaining, I will add a small interpreter that
works with topes, but let me try to be a little bit more precise in words as

A tope is an object that can contain a pointer, and set! is a procedure
that takes a tope and an object, and mutates the tope to contain a pointer
to the object.

(tope arg)    - returns a new tope, pointing to "arg"
(untope tope) - returns the object pointed to by the tope "tope"
(tope? arg)   - tests whether "arg" is a tope
(define (untope1 arg) (if (tope? arg) (untope arg) arg))
The latter procedure is useful internally.

Nothing else is mutable, so variables (both top level and lambda bound)
become constants.  In order to preserve backward compatibility, variants of
lambda and define are introduced: "func" takes the role of lambda, and "def"
that of define.  Then lambda and define are defined as macros as follows:

(lambda (arg1 ...) ..body..) -->
   (func args (apply (func (arg1 ...) ..body..)
                     (map (lambda (arg) (tope (untope1 arg))) args)))

(define var val) --> (def var (tope val))

let and letrec will be defined similarly.
[[In my interpreter, these constructs simply don't exist, for lack of

Also, "pair" is the pure version of cons, and
(define (cons a d) (pair (tope a) (tope d))).
An intermediate version (define (pons a d) (pair (tope a) d)) allows for
cdr-codable mutable lists.  Rest arguments could be turned into such lists,
or into immutable lists.  Of course, having lists that are guaranteed to be
cdr-coded allows for faster algorithms at some places.

Output is in general untope1-ed at all levels before printing, as are
arguments to primitive functions like + and substring.

WELL - these notions seem so obvious, that I cannot believe I am the first
person to think of this.  That again means, that there must be a serious
flaw somewhere, otherwise they would already have been adopted.  So please,
shoot it down, so that I will know what is wrong with it.

Oh, the error handling in my interpreter is incorrect.  I know this. 
The technique works correctly with more machine-oriented versions, though,
and I just didn't feel like catching a real continuation to jump to. 
Actually, I don't know how to catch a top level continuation from a loaded
file: As long as the file is loading, I am not at the top level, and there
are too many implementations that are not properly tail recursive when it
comes to loading files (otherwise I could have caught the continuation
with the last statement in the file..)

As an aside:

- An interesting notion for topes is the sigma-construct.  It is like a
lambda, but where a lambda adds a binding, a sigma changes a tope - outside
the sigma, the old value holds, but within the scope of the sigma the new
value holds.  This takes away much of the need for dynamic variables,
in a clean and "functionaloid" way.  Of course one can call/cc in and out
of applied sigmas like one can with lambdas.

(def report-to 
  (sigma (current-output-port)
    (print report)))

(report-to screen)

I have not included this in my enclosed interpreter, by the way.

                                           J. A. "Biep" Durieux

(define (horror . data)    ; incorrect error handler
  (*apply () `("aborted with ERROR:" . ,data)))

(define (? var env)        ; Find a variable in the environment
  (cdr (or (assoc var env) (horror "Variable unknown:" var))))

(define (! vars vals env)  ; Doesn't check for repeated var names, etc.

  (define (!! a d vals env)
    (if (pair? vals) (! a (car vals) (! d (cdr vals) env))
        (horror "Too few arguments:" vars vals)))

  (cond ((null? vars)   (if (null? vals) env
                            (horror "Too many arguments:" vals)))
        ((symbol? vars) (cons (cons vars vals) env))
        ((pair? vars)   (!! (car vars) (cdr vars) (detope vals) env))
        (else           (horror "Illegal identifier:" vars))))

(define (detope arg)       ; Make sure an object is no tope.
  (if (vector? arg) (detope (vector-ref arg 0)) arg))

(define (*eval expr env cc)
  (newline) (for-each (lambda (dummy) (display "|  ")) cc) (display expr)
  (cond ((symbol? expr) (*apply cc (? expr env)))        ; variable
        ((pair? expr)
         (*eval-pair (car expr) (cdr expr) env cc))      ; non-trivial expr
        (else (*apply cc expr))))                        ; self-evaluating

(define (*eval-pair token args env cc)       ; special forms and applications
  (define ($ n) (list-ref args (- n 1)))     ; argument reference
  (define (o F) (if (vector? F) (horror)) (cons F cc))   ; funct. comb. w. cc
  (case token
    ((quote) (*apply cc ($ 1)))                          ; constant
    ((func)  (*apply cc                                  ; closure
                     `(proc ,($ 1) ,($ 2) ,env)))        ; (proc vars b env)
    ((if)    (*eval ($ 1) env                            ; conditional
                    (o `(if ,($ 2) ,($ 3) ,env))))       ; (if then else env)
    (else    (*eval token env                            ; application
                    (o `(appl ,args ,env))))))           ; (appl args env)

(define (*apply cc value)          ; application of the cc to the current value
  (if (null? cc) (begin (newline) value)                 ; computation finished
      (*exec (caar cc) (cdar cc) value (cdr cc))))       ; process next step

(define (*exec opcode args values cc)
  (define ($ n) (list-ref args (- n 1)))
  (define (o F) (if (vector? F) (horror)) (cons F cc))
  (case opcode
    ((if)   (*eval ($ (if values 1 2)) ($ 3) cc))        ; eval relevant part
    ((appl) (evlist () ($ 1) ($ 2) (o (detope values)))) ; prepare eval of args
    ((args) (evlist `(,values . ,($ 1)) ($ 2) ($ 3) cc)) ; eval of args
    ((proc) (*eval ($ 2) (! ($ 1) values ($ 3)) cc))     ; bind+eval b in env
    ((prim) (*apply-primitive (car args) (cdr args) values cc)) ; prim. proc.
    ((cont) (*apply args (car values)))                  ; call/cc
    (else   (horror "Unknown opcode:" opcode args values))))

(define (evlist done to-do env cc) ; The good old one, but now tail-efficient.
  (define (o F) (cons F cc))
  (if (null? to-do) (*apply cc (reverse done))
      (*eval (car to-do) env (o `(args ,done ,(cdr to-do) ,env)))))

(define (*apply-primitive proc args arglist cc) ; execute a primitive procedure
  (define (o F) (cons F cc))
  (cond ((procedure? proc)   (*apply cc (apply proc arglist))) ; cheap solution
        ((eq? proc 'call/cc) (*apply (o (car arglist)) `((cont . ,cc))))
        (else                (horror "Unknown primitive:" proc args arglist))))

; A lot of space efficiency can be gained by making (o F) more intelligent.
; That way, many non-tail-recursive procedures can still be tail efficient.

(define (e x) (*eval x environment ()))
(define (def var val)
  (set! environment (cons (cons var (e val)) environment))

(define (rep)                           ; YEP, the READ-EVAL-PRINT-loop
   (let ((expr (read)))
      (pp (detope (*eval expr environment ()))) ; we want a deep-detope

(define environment `(
  (call/cc prim  call/cc                                          )
  (def     prim  ,(lambda (var val)
                    (set! environment 
                          (cons (cons var val) environment))
                    var)                                          ) ; for the REP
  (pair    prim ,cons                                             )
  (eq?     prim ,(lambda (a1 a2) (eq? (detope a1) (detope a2)))   )
  (null?   prim ,(lambda (arg) (null? (detope arg)))              )
  (zero?   prim ,(lambda (arg) (zero? (detope arg)))              )
  (+       prim ,(lambda (a1 a2) (+ (detope a1) (detope a2)))     )
  (-       prim ,(lambda (a1 a2) (- (detope a1) (detope a2)))     )
  (display prim ,(lambda (arg) (display (detope arg)))            )
  (tope    prim ,(lambda (arg) (make-vector 1 arg))               )
  (untope  prim ,(lambda (arg) (vector-ref arg 0))                )
  (tope?   prim ,vector?                                          )
  (set!    prim ,(lambda (tope val)
                   (if (vector? tope) (vector-set! tope 0 val)
                       (horror "Illegal argument to set:" tope))) )
)                    )

(def 'untope1 '(func (arg) (if (tope? arg) (untope arg) arg)))
(def 'detope '(func (arg)
                (if (tope? arg) (detope (untope arg)) arg)))
(def 'car '(func ((a . d)) a))
(def 'cdr '(func ((a . d)) d))
(def 'pons '(func (a d) (pair (tope (untope1 a)) d)))    ; cdr-codeable
(def 'cons '(func (a d) (pons a (tope (untope1 d)))))

(def 'append
  '((func (append)
         (set! append (func (l1 l2)
                        (if (null? l1) l2
                            (cons (car l1) (append (cdr l1) l2))))))
    (tope 'dummy)))

(def 'length*
  '((func (length)
         (set! length (func (list num)
                        (if (null? list) num
                            (length (cdr list) (+ num 1))))))
    (tope 'dummy)))

(def 'length '(func (list) (length* list 0)))

(def 'loop-y '(func (f) ((func (g) (f (g g))) (func (g) (f (g g))))))

(def 'y '(func (f) ((func (h) (h h))
                         (func (g) (f (func (x) ((g g) x)))))))

(def 'f '(func (rec) (func (x) (if (zero? x) 0 (+ x (rec (- x 1)))))))
(display "Type \"(rep)\" for a REP-loop.
Don't forget to quote the first argument to \"def\" when in the REP-loop!")