The Pochoir Project

Revision as of 23:26, 14 February 2011 by Yuantang (section edited: Performance Peek)

Pochoir - Parallel Stencil Computation Compiler

Pochoir (pronounced "PO-shwar") is a compiler and run-time system for implementing stencil computations on multicore processors. A stencil defines the value of a grid point in a d-dimensional spatial grid at time t as a function of neighboring grid points at recent times before t. A stencil computation computes the stencil for each grid point over many time steps.
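To make the definition concrete, here is a minimal sketch of a stencil computation written as plain serial C++ loops. It is not Pochoir code and does not use the Pochoir language; the function names, the 3-point heat-equation update, and the coefficient alpha are assumptions chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One stencil application: the new value of a grid point depends on the
// point itself and its two neighbors at the previous time step
// (a 3-point heat-equation update; alpha is an illustrative coefficient).
inline double heat_update(double left, double center, double right, double alpha) {
    return center + alpha * (left - 2.0 * center + right);
}

// Run `steps` time steps on a 1D grid whose two end points stay fixed
// (a non-periodic boundary). Returns the grid after the last step.
std::vector<double> run_heat_1d(std::vector<double> grid, int steps, double alpha) {
    assert(steps >= 0);
    std::vector<double> next(grid);  // the copy carries the fixed end points
    for (int t = 0; t < steps; ++t) {
        // Apply the stencil to every interior grid point.
        for (std::size_t x = 1; x + 1 < grid.size(); ++x) {
            next[x] = heat_update(grid[x - 1], grid[x], grid[x + 1], alpha);
        }
        grid.swap(next);  // the values just computed become the current time step
    }
    return grid;
}
```

Calling run_heat_1d({0, 0, 4, 0, 0}, 1, 0.25), for example, spreads the central value to its neighbors and leaves the two end points unchanged.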

In Pochoir, the user typically needs to specify only the stencil kernel and the boundary conditions, using a domain-specific language embedded in C++. Depending on whether the goal is to check functional correctness or to measure performance, the user can compile and run the code with either a native C++ compiler or the Pochoir compiler. When the Pochoir compiler is used, the basic parallelization and optimization strategy is divide-and-conquer (a cache-oblivious algorithm). On higher-dimensional space-time grids, Pochoir employs a novel cutting strategy: simultaneous space cuts.
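The divide-and-conquer idea can be sketched in one spatial dimension. The code below is a serial illustration in the style of the trapezoidal decomposition from Frigo and Strumpen's cache-oblivious stencil algorithm; it is not Pochoir's implementation, it is not parallel, and it does not show simultaneous space cuts. The type Grid1D and all function names are hypothetical, and the stencil is assumed to reach one neighbor per time step.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper type (not a Pochoir type): two buffers indexed by
// time parity, plus an illustrative diffusion coefficient.
struct Grid1D {
    std::vector<double> buf[2];
    double alpha;
};

// Apply the 3-point heat stencil at point x: read time t, write time t + 1.
inline void kernel(Grid1D& g, int t, int x) {
    const std::vector<double>& in = g.buf[t % 2];
    g.buf[(t + 1) % 2][x] =
        in[x] + g.alpha * (in[x - 1] - 2.0 * in[x] + in[x + 1]);
}

// Process the space-time trapezoid covering time steps [t0, t1). Its left
// edge starts at x0 and moves by dx0 per step; its right edge starts at x1
// and moves by dx1 per step. Cuts have slope 1 because the stencil reaches
// one point per step.
void walk(Grid1D& g, int t0, int t1, int x0, int dx0, int x1, int dx1) {
    const int dt = t1 - t0;
    if (dt == 1) {
        for (int x = x0; x < x1; ++x) kernel(g, t0, x);
    } else if (dt > 1) {
        if (2 * (x1 - x0) + (dx1 - dx0) * dt >= 4 * dt) {
            // Wide enough: space cut along a line of slope -1. The left
            // piece needs nothing from the right piece, so it runs first.
            const int xm = (2 * (x0 + x1) + (2 + dx0 + dx1) * dt) / 4;
            walk(g, t0, t1, x0, dx0, xm, -1);
            walk(g, t0, t1, xm, -1, x1, dx1);
        } else {
            // Too narrow: time cut, earlier half first.
            const int s = dt / 2;
            walk(g, t0, t0 + s, x0, dx0, x1, dx1);
            walk(g, t0 + s, t1, x0 + dx0 * s, dx0, x1 + dx1 * s, dx1);
        }
    }
}

// Both buffers start as copies of the input so that the fixed end points
// (a non-periodic boundary) are present at every time step.
Grid1D make_grid(const std::vector<double>& init, double alpha) {
    assert(init.size() >= 2);
    Grid1D g;
    g.buf[0] = init;
    g.buf[1] = init;
    g.alpha = alpha;
    return g;
}

// Divide-and-conquer traversal of the interior points over `steps` steps.
std::vector<double> run_divide_and_conquer(const std::vector<double>& init,
                                           int steps, double alpha) {
    assert(steps >= 0);
    Grid1D g = make_grid(init, alpha);
    walk(g, 0, steps, 1, 0, static_cast<int>(init.size()) - 1, 0);
    return g.buf[steps % 2];
}

// Reference traversal with plain nested loops, for comparison.
std::vector<double> run_loops(const std::vector<double>& init,
                              int steps, double alpha) {
    assert(steps >= 0);
    Grid1D g = make_grid(init, alpha);
    const int n = static_cast<int>(init.size());
    for (int t = 0; t < steps; ++t)
        for (int x = 1; x < n - 1; ++x) kernel(g, t, x);
    return g.buf[steps % 2];
}
```

Both traversals apply the same kernel to the same space-time points; only the order differs. The recursive order keeps working on a small region of the grid across several time steps before moving on, which is what makes the approach cache-friendly without tuning for a particular cache size.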

Pochoir is an open-source software project hosted by the SuperTech group at MIT CSAIL. Contributions are welcome in many forms: documentation, translation, code, bug fixes, ports to other platforms, and so on.

The Pochoir package contains three main components: an embedded domain-specific language (EDSL) in native C++ for specifying stencils, a C++ template library for baseline runs, and a domain-specific compiler written in Haskell for optimized runs.

Currently, the Pochoir package has been tested only on Linux systems.

Performance Peek

(a) A 3D wave equation with a non-periodic boundary condition executing for 1000 time steps.
(b) A 2D heat equation on a torus executing for 3200 time steps.
(c) 1D pairwise sequence alignment (no time step).
(d) A Lattice Boltzmann method with a non-periodic boundary condition for 3000 time steps.

The four plots compare the number of grid points processed per second (semilogarithmic scale) for Pochoir-generated code on 12 cores against serial-loop and parallel-loop implementations. In every plot, the top curve is the Pochoir-generated code, the middle curve is the parallel loops, and the bottom curve is the serial loops.

Pochoir Team

The project is jointly developed by the SuperTech group at MIT and Intel Corp. The core team includes:

Current Release

Pochoir 1.0 has been released (February 2011).

Pochoir is covered by the GNU General Public License, version 3.0.

Acknowledgment

The Pochoir project is supported in part by a grant from Intel Corporation and in part by NSF grants CCF-0937860 and CNS-1017058.