MIT Stata Center

Invited Workshop on Compiler Techniques for Sparse Tensor Algebra

Cambridge, MA, January 26th

Organizers: Fredrik Kjolstad and Saman Amarasinghe

There has been significant recent interest and innovation in compiler technology for high-performance sparse tensor computation. This workshop will bring together researchers and practitioners from academia and industry to discuss individual approaches, how they relate, and whether their ideas can be combined.


There will be eight sessions of three or four speakers each. Each session consists of ten-minute talks followed by a twenty-minute discussion among the speakers and the room.
Time Session
8:00 - 8:20 Registration
8:20 - 8:30 Welcome
8:30 - 9:20 Sparse Algorithms and Data Structures I
Chaired by Saman Amarasinghe (MIT)
Jiajia Li (Pacific Northwest National Laboratory)
Shaden Smith (Intel Labs)
Edgar Solomonik (University of Illinois at Urbana-Champaign)
9:20 - 9:30 Break
9:30 - 10:20 Sparse Algorithms and Data Structures II
Chaired by Maryam Dehnavi (University of Toronto)
Muthu Baskaran (Reservoir Labs)
Jee Choi (University of Oregon)
Duane Merrill and Michael Garland (NVIDIA Research)
10:20 - 10:30 Break
10:30 - 11:20 Sparse Polyhedral Optimization
Chaired by Shoaib Kamil (Adobe Research)
Mary Hall (University of Utah)
Cathie Olschanowsky (Boise State University)
Michelle Strout (University of Arizona)
11:20 - 11:30 Break
11:30 - 12:30 Tensor Compilers I
Chaired by Stephen Chou (MIT)
Dougal Maclaurin (Google)
P. (Saday) Sadayappan (Ohio State University)
Hongbo Rong (Intel Labs)
Keno Fischer (Julia Computing)
12:30 - 1:30 Lunch
1:30 - 2:20 Tensor Compilers II
Chaired by Chris Fletcher (University of Illinois at Urbana-Champaign)
Maryam Dehnavi (University of Toronto)
Fredrik Kjolstad (MIT)
Stephen Chou (MIT)
2:20 - 2:35 Break
2:35 - 3:25 Tensors and Graphs
Chaired by Fredrik Kjolstad (MIT)
Justin Solomon (MIT)
Jeremy Kepner (Lincoln Laboratory)
Julian Shun (MIT)
3:25 - 3:40 Break
3:40 - 4:30 Hardware for Sparse Tensor Algebra
Chaired by Joel Emer (NVIDIA Research and MIT)
Michael Pellauer (NVIDIA Research)
Hanrui Wang (MIT)
Chris Fletcher (University of Illinois at Urbana-Champaign)
4:30 - 4:45 Break
4:45 - 5:35 Applications in Science and Engineering
Chaired by Peter Ahrens (MIT)
Srinivas Eswar (Georgia Tech)
Richard Mills (Argonne National Laboratory)
Miles Stoudenmire (Flatiron Institute)
5:35 - 6:00 Discussion
6:00 - 8:00 Dinner and Poster Session


The following is a list of papers that were suggested by workshop participants.
Run-time parallelization in a polyhedral framework
Automating wavefront parallelization for sparse matrix computations Anand Venkat, Mahdi Soltan Mohammadi, Jongsoo Park, Hongbo Rong, Rajkishore Barik, Michelle Mills Strout, and Mary Hall SC 2016
Uninterpreted functions in a polyhedral framework
Non-affine Extensions to Polyhedral Code Generation Anand Venkat, Manu Shantharam, Mary Hall, and Michelle Mills Strout CGO 2014
Data Transformations in a Polyhedral Framework
Loop and data transformations for sparse matrix code Anand Venkat, Mary Hall, and Michelle Strout PLDI 2015
Fast alternating LS algorithms for high order CANDECOMP/PARAFAC tensor factorizations. Anh-Huy Phan, Petr Tichavsky, and Andrzej Cichocki. Signal Processing 2013
Nesterov-based parallel algorithm for large-scale nonnegative tensor factorization. Athanasios P Liavas, Georgios Kostoulas, Georgios Lourakis, Kejun Huang, and Nicholas D Sidiropoulos. ICASSP 2017
Storage formats for tensor operations: GPUs
A Unified Optimization Approach for Sparse Tensor Operations on GPUs B. Liu, C. Wen, A. Sarwate, and M. Mehri Dehnavi Cluster
Sparse tensor optimizations for tensor decompositions used for sparse count data
Memory-efficient parallel tensor decompositions Baskaran, M., Henretty, T., Pradelle, B., Langston, M. H., Bruns-Smith, D., Ezick, J., Lethin, R HPEC 2017
Mode-specific and mode-generic tensor formats in ENSIGN
Efficient and scalable computations with sparse tensors Baskaran, M., Meister, B., Vasilache, N., & Lethin, R HPEC 2012
The Formidable Repository of Open Sparse Tensors and Tools (FROSTT) Shaden Smith et al.
Merge-based parallelism for load balancing
Merge-based Parallel Sparse Matrix-Vector Multiplication Duane Merill and Michael Garland SC 2016
Cyclops tensor framework (distributed-memory tensor algebra)
A massively parallel tensor contraction framework for coupled-cluster computations Edgar Solomonik, Devin Matthews, Jeff R. Hammond, John F. Stanton, and James Demmel JPDC 2014
Sparse tensor contractions (matrix multiplication) and graph algorithms with Cyclops
Scaling betweenness centrality using communication-efficient sparse matrix multiplication Edgar Solomonik, Maciej Besta, Flavio Vella, and Torsten Hoefler SC 2017
TACO system
The Tensor Algebra Compiler Fredrik Kjolstad, Shoaib Kamil, Stephen Chou, David Lugato, and Saman Amarasinghe OOPSLA 2017
Communication Lower bounds for MTTKRP
Communication lower bounds for matricized tensor times khatri-rao product. Grey Ballard, Nicholas Knight, and Kathryn Rouse. IPDPS 2018
Specify sparse computations for FPGAs and CGRAs
Expressing Sparse Matrix Computations for Productive Performance on Spatial Architectures Hongbo Rong
HBL lower bounds
Parallelepipeds obtaining HBL lower bounds. James Demmel and Alex Rusciano CoRR 2016
Cache blocking for Sparse Tensors
Blocking Optimization Strategies for Sparse Tensor Computation. Jee Choi, Xing Liu, Shaden Smith, and Tyler Simon IPDPS 2018
HiCOO format
HiCOO: Hierarchical Storage of Sparse Tensors Jiajia Li, Jimeng Sun, Richard Vuduc SC 2018
PASTA benchmark suite
PASTA: A Parallel Sparse Tensor Algorithm Benchmark Suite Jiajia Li, Yuchen Ma, Xiaolong Wu, Ang Li, Kevin Barker Technical Report 2019
graph processing frameworks
Ligra: A Lightweight Graph Processing Framework for Shared Memory Julian Shun and Guy Blelloch PPoPP 2013
Code generation for sparse matrix methods
ParSy: Inspection and Transformation of Sparse Matrix Computations for Parallelism K. Cheshmi, S. Kamil, M. Strout, and M. Mehri Dehnavi SC 2018
Code generation for sparse matrix methods
Sympiler: Transforming Sparse Matrix Codes by Decoupling Symbolic Analysis K. Cheshmi, S. Kamil, M. Strout, and M. Mehri Dehnavi SC 2017
Data Transformations in Real-world Code
Optimizing LOBPCG: Sparse Matrix Loop and Data Transformations in Action Khalid Ahmad, Anand Venkat, Mary W. Hall LCPC 2016
Shared-memory parallelization of MTTKRP for dense tensors. Koby Hayashi, Grey Ballard, Yujie Jiang, and Michael J. Tobia. PPoPP 2018
graph processing frameworks
Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable Laxman Dhulipala, Guy Blelloch, and Julian Shun SPAA 2018
graph processing frameworks
Julienne: A Framework for Parallel Graph Algorithms using Work-efficient Bucketing Laxman Dhulipala, Guy Blelloch, and Julian Shun SPAA 2017
Generalized CP streaming framework in ENSIGN
Computationally Efficient CP Tensor Decomposition Update Framework for Emerging Component Discovery in Streaming Data Letourneau, P.D., Baskaran, M., Henretty, T., Ezick, J. and Lethin, R HPEC 2018
alternating least squares for tensor decomposition
Accelerating alternating least squares for tensor decomposition by pairwise perturbation Linjian Ma and Edgar Solomonik arxiv 2018
The Sparse Polyhedral Framework: Composing Compiler-Generated Inspector-Executor Code M. M. Strout, M. Hall and C. Olschanowsky Proc. of the IEEE 2018 10.1109/JPROC.2018.2857721
Loop upper bounds
Communication-Optimal Loops Nick Knight PhD thesis 2015
Finding Good Block Sizes In Constant Time
A Fill Estimation Algorithm for Sparse Matrices and Tensors in Blocked Formats Peter Ahrens, Helen Xu, Nicholas Schiefer IPDPS 2018
CSF formalization
Tensor-Matrix Products with a Compressed Sparse Tensor. Shaden Smith, George Karypis IA^3 2015
TTMc optimization
Accelerating the Tucker Decomposition with Compressed Sparse Tensors Shaden Smith, George Karypis Euro-Par 2017
SPLATT - CSF data structure and MTTKRP
SPLATT: Efficient and parallel sparse tensor-matrix multiplication. Shaden Smith, Niranjay Ravindran, Nicholas D Sidiropoulos, and George Karypis IPDPS 2015
Generating code for computing with disparate tensor formats
Format Abstraction for Sparse Tensor Algebra Compilers Stephen Chou, Fredrik Kjolstad, and Saman Amarasinghe OOPSLA 2018
High-performance Sparse Graph Computations
GraphIt: A High-Performance Graph DSL Yunming Zhang, Mengjiao Yang, Riyadh Baghdadi, Shoaib Kamil, Julian Shun, and Saman Amarasinghe OOPSLA 2018
Storage formats for tensor operations: Cloud
CSTF: Large-Scale Sparse Tensor Factorizations on Distributed Platforms Z. Blanco, B. Liu, and M. Mehri Dehnavi ICPP


Room 32-155
MIT Ray and Maria Stata Center
32 Vassar St
Cambridge, MA 02139