David Chaiken. Mechanisms and Interfaces for Software-Extended Coherent Shared Memory. Ph.D. thesis, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September 1994. Also available as MIT/LCS Technical Report 644.
Abstract:
Software-extended systems use a combination of hardware and software to implement shared memory on large-scale multiprocessors. Hardware mechanisms accelerate common-case accesses, while software handles exceptional events. In order to provide fast memory access, this design strategy requires appropriate hardware mechanisms including caches, location-independent addressing, limited directories, processor access to the network, and a memory-system interrupt. Software-extended systems benefit from the flexibility of software, but they require a well-designed interface between their hardware and software components to do so.
This dissertation proposes, designs, tests, measures, and models the novel software-extended memory system of Alewife, a large-scale multiprocessor architecture. A working Alewife machine validates the design, and detailed simulations of the architecture (with up to 256 processors) show the cost versus performance trade-offs involved in building distributed shared memory. The architecture with a five-pointer LimitLESS directory achieves between 71% and 100% of full-map directory performance at a constant cost per processing element.
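As a rough illustration of the limited-directory idea described above, the following Python sketch tracks up to a fixed number of sharers per memory block in "hardware" and hands any further sharers to a "software" overflow set. It is a simplified toy under stated assumptions, not Alewife's actual LimitLESS protocol: the class names, the `traps` counter, and the policy of leaving existing hardware pointers in place on overflow are all hypothetical choices made for clarity.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryEntry:
    """Sharer state for one memory block (illustrative only)."""
    hw_pointers: list = field(default_factory=list)  # sharers tracked in hardware
    sw_pointers: set = field(default_factory=set)    # overflow sharers tracked in software


class SoftwareExtendedDirectory:
    """Toy limited directory with a software overflow path (hypothetical)."""

    def __init__(self, num_hw_pointers=5):
        self.num_hw_pointers = num_hw_pointers
        self.entries = {}
        self.traps = 0  # how many requests needed the software handler

    def read(self, block, node):
        """Record node as a sharer of block; report which path handled it."""
        entry = self.entries.setdefault(block, DirectoryEntry())
        if node in entry.hw_pointers or node in entry.sw_pointers:
            return "hardware"  # already a sharer, nothing new to record
        if len(entry.hw_pointers) < self.num_hw_pointers:
            entry.hw_pointers.append(node)  # common case: a free hardware pointer
            return "hardware"
        # Exceptional case: hardware pointers are exhausted, so trap to software.
        self.traps += 1
        entry.sw_pointers.add(node)
        return "software"

    def write(self, block, node):
        """Give node exclusive access; return the sorted nodes to invalidate."""
        entry = self.entries.setdefault(block, DirectoryEntry())
        if entry.sw_pointers:
            self.traps += 1  # software must walk its overflow set
        sharers = set(entry.hw_pointers) | entry.sw_pointers
        invalidated = sorted(sharers - {node})
        entry.hw_pointers = [node]
        entry.sw_pointers = set()
        return invalidated


if __name__ == "__main__":
    directory = SoftwareExtendedDirectory(num_hw_pointers=5)
    for reader in range(7):
        print(reader, directory.read("block_A", reader))
    print("invalidate:", directory.write("block_A", 0))
    print("traps:", directory.traps)
```

In this sketch, blocks with five or fewer sharers never leave the hardware path, which is the common-case behavior the abstract attributes to the hardware mechanisms. Only widely shared blocks pay the software cost.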
A worker-set model uses a description of application behavior and architectural mechanisms to predict the performance of software-extended systems. The model shows that software-extended systems exhibit little sensitivity to trap latency and memory-system code efficiency, as long as they implement a minimum of one directory pointer in hardware. Low-cost, software-only directories with no hardware pointers are very sensitive to trap latency and code efficiency, even in systems that implement special optimizations for intranode accesses.
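The sensitivity claim can be made concrete with a back-of-the-envelope calculation. The Python sketch below is not the worker-set model from the thesis; it is a toy estimate that assumes each block is read once by every node in its worker set, that the first `hw_pointers` readers are handled in hardware, and that every later reader pays a fixed trap latency. The function names, the sample worker-set sizes, and the latencies are all hypothetical.

```python
def software_fraction(worker_set_sizes, hw_pointers):
    """Fraction of read requests that overflow the hardware pointers.

    worker_set_sizes holds, for each block, the number of nodes reading it.
    Toy assumption: each node in a worker set issues exactly one read.
    """
    total = sum(worker_set_sizes)
    if total == 0:
        return 0.0
    overflow = sum(max(0, size - hw_pointers) for size in worker_set_sizes)
    return overflow / total


def relative_slowdown(worker_set_sizes, hw_pointers, hw_latency, trap_latency):
    """Average request latency divided by the pure-hardware latency."""
    f = software_fraction(worker_set_sizes, hw_pointers)
    average = (1 - f) * hw_latency + f * (hw_latency + trap_latency)
    return average / hw_latency


if __name__ == "__main__":
    # Hypothetical workload: mostly small worker sets plus one widely shared block.
    sizes = [1, 1, 2, 8]
    for pointers in (0, 1, 5):
        for trap in (100, 400):
            slowdown = relative_slowdown(sizes, pointers, hw_latency=10, trap_latency=trap)
            print(f"pointers={pointers} trap={trap} slowdown={slowdown:.2f}")
```

Under these assumptions, a directory with zero hardware pointers sends every request through software, so its slowdown scales directly with trap latency. Adding pointers shrinks the software fraction and with it the dependence on trap latency. How closely this matches a real workload depends on the actual distribution of worker-set sizes.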
Alewife's flexible coherence interface facilitates the development of memory-system software and enables a smart memory system: one that uses information about applications' dynamic use of shared memory to optimize performance, with and without help from programmers. An automatic optimization technique transmits information about memory usage from the runtime system to the compiler. The compiler uses this information to optimize accesses to widely shared, read-only data and improves one benchmark's performance by 22%. Other smart memory features include human-readable profiles of shared-memory accesses and protocols that adapt dynamically to memory reference patterns.