The Cilk-M Project
From SuperTech Wiki
In a dynamically multithreaded (dthreaded for short) language such as Cilk, since multiple children of a function may exist simultaneously, the runtime system employs a cactus stack to support multiple stack views for all the active children simultaneously. In all known software implementations of cactus stacks, however, transitioning from serial code (using a linear stack) to parallel code (using a cactus stack) is problematic, because the type of stack impacts the calling conventions used to allocate activation frames and pass arguments. One could recompile the serial code to use a cactus stack, but this strategy is not feasible if the codebase includes legacy or third-party binaries for which the source is not available. We call the property of allowing arbitrary calling between parallel and serial code --- including especially legacy (and third-party) serial binaries --- serial-parallel reciprocity, or SP-reciprocity for short.
It turns out that there seems to be an inherent trade-off between supporting SP-reciprocity and maintaining good time and space bounds, and existing work-stealing concurrency platforms fail to satisfy at least one of these three criteria. We refer to the problem of simultaneously achieving all three criteria as the cactus-stack problem. Arch Robison, the main architect of Thread Building Blocks (Intel) and the current maintainer of Cilk Plus runtime system (Intel), calls the cactus-stack problem a "Grand Challenge," because the incompatibility of cactus stacks and linear stacks impedes the acceptance of dthreaded languages for mainstream computing.
Cilk-M solves the cactus-stack problem by providing a memory abstraction that allows each of the multiple extant child functions to see its own view of the linear stack, even though they share the same parent. Thus, a function can operate on the cactus stack like it is a linear stack.
The Cilk-M System
There are three major components to the Cilk-M system:
- TLMM-Linux: we modified the Linux operating system kernel to provide support for a new memory mechanism, called thread-local memory mapping (TLMM), which designates a region of the process's virtual-address space as "local" to each thread. This special TLMM region occupies the same virtual-address range for each thread, but each thread may map different physical pages to the region.
- Cilk-M runtime system: the Cilk-M runtime system implements a work-stealing scheduler, and maintains a cactus-stack memory abstraction using TLMM. With the cactus-stack memory abstraction, the Cilk-M runtime system allows transitioning between serial code and parallel code and provides full compatibility with existing serial calling conventions.
- Cilk-M compiler: we don't currently have our own compiler. We have a version of the runtime system that works with Intel's Cilk Plus compiler, however, which will allow users to compiler both Cilk and Cilk++ code.
The Cilk-M project is supported in part by the National Science Foundation under Grant CNS-1017058.