[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
DYNAMIC-WIND vs. multi-processing
Date: Wed, 12 May 93 15:09:19 HKT
From: shivers@csd.hku.hk
Except for top-level variables, CALL-WITH-CURRENT-CONTINUATION
captured the whole state of a computation.
I do not think so. Consider the file system. The characters you typed in from
your keyboard. The time of day. The program store, including non-"top level"
variables and arrays. This "captures all the state" notion is definitely not a
reason to defend CALL/CC. In real programs, there's a lot it doesn't capture.
You are correct about the file system. My sentence should be modified
to be "Except for ports and top-level variables, ..." In practice,
people don't expect copies of the file system to be spawned for each
process. So I don't think it affects the validity of my claim.
The program store is stored as the values of top-level variables. I
don't see how you can consider the time of day to be a stored
parameter. It is computed fresh on every call.
I also do not believe in the idea of "a" computational top-level. There never
is one, barring maybe the boot prom. Scheme programs do not have to have the
model of running inside a read-eval-print loop. So saying that you grab the
entire future of the computation back to the top level begs the question.
Which top level? You need operators to define "top level" with respect to a
particular computational point of view. I do not think deeply about these
things, but some people do: Felleisen, Danvy, and Filinski have written a lot
of papers dealing with these issues. So, again, I think CALL/CC comes up a
little short of being the right thing.
Say what? R4RS Section 5.2.1 is Top level definitions. Those are the
ones I am referring to. In a discussion on this group a while ago,
several people (including me) suggested changes which would have
eliminated differences in behaviour between top-level and internal
definitions. But there would still be a top level.
Generally, I think CALL/CC is not the way to do threads. For one, it allows
you to invoke a continuation multiple times, which never happens with a thread
continuation.
Because CWCC is more powerful than threads is not an argument that
CWCC cannout be used to implement threads.
For two, it is a serial model of concurrency, which does not
cut the mustard.
Because CWCC threads are "serial" does not preclude the existence of a
"non-serial" thread implementation.
...
this model doesn't handle true parallel execution. An example: suppose I
wanted my threads to each have their own Unix current working directory, so I
had DYNAMIC-WIND do chdir's going in and cwd's coming out. Then on every
thread switch, I could do an (expensive, slow) mutation of the shared process
state to put each thread in his own configuration of that state. But how do I
model running threads on a parallel processor, where more than one thread can
execute at once? CALL/CC and DYNAMIC-WIND don't handle this. I think a thread
model should have semantics that are independent of whether you are using a
uniprocessor or multiprocessor implementation.
Another contrived example: a thread corresponds to a vertex in some tree data
structure your program has built. Whenever you switch into a thread, you
re-root the tree so the resumed thread is the new root. OK, fine, DYNAMIC-WIND
will take care of you. BUT: what happens when you want to run your algorithm
on a parallel machine? It's just not a good model.
If you want your program to run on multiple-proccessor machines you
will have to protect shared resources with some sort of semaphome,
regardless of whether you use CWCC or not. A program which DOES
protect its shared resources will run on either true
multiple-proccessors or on an CWCC threads implementation.
I am NOT claiming that any program which uses CWCC can run on multiple
processors, I am claiming that R4RS scheme can implement threads and
that DYNAMIC-WIND can break this non-locally.
I think that it might be nice to recognise the unlimited extent of
CALL/CC-grabbed continuations, and really provide for a true cleanup-on-exit
feature with a procedure
(when-finished thunk cleanup-thunk)
This call returns the value of (THUNK). However, sometime after
WHEN-FINISHED's (and THUNK's) continuation is transferred to for the last
time, CLEANUP-THUNK will be run.
- On a normal return, with no retained inner continuations, this
is just like UNWIND-HANDLER
- If you retain a continuation so that you can return through
THUNK multiple times, you may have to delay running the cleanup
action until THUNK's continuation is gc'd.
So, in other words, you can safely arrange for some resource to be available
to the computation THUNK, and also know that it will be released when the
computation is finished. Garbage collection does not handle all resource
allocation problems -- you can't rely on it to free up temp files, for
example. Another example: WHEN-FINISHED provides a more general solution to
the CALL-WITH-INPUT-FILE problem to which you referred.
I believe that what you are suggesting can be implemented using
"finalizations" which were proposed a while ago.
If you do that, I am really running out of reasons to have DYNAMIC-WIND at
all.
I don't think you have addressed a way to implement dynamic binding.
Alternately, one might say that users should just avoid the use of
DYNAMIC-WIND if they want to multi-process.
If users want to multi-process, they should use a concurrent language,
which straight, simple R4RS Scheme is not. CALL/CC with or without
DYNAMIC-WIND is not the way to do concurrency.
All R4RS needs to implement concurrent programs is the addition
semaphores. There are even algorithms which implement this without
additional primitives. CWCC can be used to implement a threads
package. Therefore, R4RS IS a concurrent language!