Smart Data Structures
As multicore processors become prevalent, the complexity of programming is skyrocketing. One major difficulty is efficiently orchestrating collaboration among threads through shared data structures. Unfortunately, choosing and hand-tuning data structure algorithms to get good performance across a variety of machines and inputs is a herculean task to add to the fundamental difficulty of getting a parallel program correct. To help mitigate these complexities, we have developed a new class of parallel data structures called Smart Data Structures that leverage online machine learning to adapt automatically. We have prototyped and evaluated an open source library of Smart Data Structures for common parallel programming needs and demonstrated significant improvements over the best existing algorithms under a variety of conditions. Our results indicate that learning is a promising technique for balancing and adapting to complex, time-varying tradeoffs and achieving the best performance available.
As illustrated above, standard parallel data structures consist of data storage, an interface, and algorithms. The storage organizes the data, the interface specifies the operations threads may apply to the data to manipulate it, and the algorithms implement the interfaces while preserving the correct concurrent semantics. Storage and algorithms are often controlled by knobs: the thresholds or other parameters that program implementation behaviors and heuristics. Knobs are typically configured via one-size-fits-all static defaults provided by the library programmer. When the defaults perform sub-optimally, programmers must hand-tune the knobs; typically, they do so through trial and error and special cases in the code which increase complexity and reduce readability. Though often necessary, dynamic tuning of the knobs is typically beyond the reach of the programmer.
Smart Data Structures avoid the need for cumbersome hand-tuning and provide dynamic tuning by augmenting standard data structures with an online learning engine that automatically optimizes the knobs. Through learning, Smart Data Structures balance complex tradeoffs to find ideal knob settings and react to changes in the system or inputs that affect these tradeoffs.
- Best Student Paper, 7th IEEE/ACM International Conference on Autonomic Computing (ICAC’10), June 2010. (paper pdf)
- Smart Data Structures: An Online Machine Learning Approach to Multicore Data Structures, by Jonathan Eastep, David Wingate, and Anant Agarwal.
8th IEEE/ACM International Conference on Autonomic Computing (ICAC’11), 2011. (pdf)
- Smartlocks: Lock Acquisition Scheduling for Self-Aware Synchronization, by Jonathan Eastep, David Wingate, Marco D. Santambrogio and Anant Agarwal.
7th IEEE/ACM International Conference on Autonomic Computing (ICAC’10), 2010. (pdf)
- Smartlocks: Self-Aware Synchronization through Lock Acquisition Scheduling, by Jonathan Eastep, David Wingate, Marco D. Santambrogio and Anant Agarwal.
MIT CSAIL Technical Report, MIT-CSAIL-TR-2009-055, November 2009. (pdf)
Open Source Projects:
- Smart Data Structures Project. github.com/mit-carbon/Smart-Data-Structures
- Jonathan Eastep
- David Wingate
- Marco D. Santambrogio
- Anant Agarwal