QUICK START GUIDE

The StreamIt compiler is invoked via the strc script:

       strc foo.str

This reads foo.str, produces foo.java as an intermediate file, compiles it down to a number of C++ files, and then compiles and links those to produce a binary, a.out.

The StreamIt Cookbook provides a step-by-step tutorial for getting started with the language and compiler. For reference, the command-line options to the compiler are also described below.


strc Command Line Options

--help
Displays a summary of common options.

--more-help
Displays a summary of advanced options (which are not described below).

--cluster <n>
Compile for a cluster or multicore with <n> nodes.

--library
Produce a Java file compatible with the StreamIt Java library, and compile and run it.

--simpleC
Generate a simple C file that inlines the entire application into a single function. This is sometimes more readable than the default uniprocessor output, but the backend is not fully featured.

--raw <n>, -r <n>
Compile for an <n>-by-<n> Raw processor.

--rstream, -R
Generate a C-like file to be compiled by the RStream compiler from Reservoir Labs.

--output <filename>, -o <filename>
Places the resulting binary in <filename>.

--verbose
Show intermediate commands as they are executed.

Options available for all backends

-O0
Do not optimize (default).

-O1
Perform basic optimizations that should improve performance in most cases. Adds --unroll 16 --destroyfieldarray --partition --wbs.

-O2
Perform extended optimizations that should improve performance in most cases, but may also cause the compiler to become unstable. Adds --unroll 256 --destroyfieldarray --partition --wbs --macros.
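
As a rough illustration of the -O1 description above, the sketch below prints the shorthand form next to the expanded flag list it is documented to add. The file name foo.str is a placeholder, and treating the two command lines as equivalent is an assumption drawn only from the flag list above, not from the compiler source.

```shell
# Hypothetical input file; substitute your own StreamIt program.
src=foo.str

# Shorthand optimization level.
short="strc -O1 $src"

# The flags that -O1 is documented to add, spelled out explicitly.
long="strc --unroll 16 --destroyfieldarray --partition --wbs $src"

# Print both forms so they can be compared or pasted into a terminal.
echo "$short"
echo "$long"
```

Spelling the flags out can be handy when you want the -O1 set but a different unroll limit.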

--iterations <n>, -i<n>
Run the program for <n> steady-state iterations. Defaults to infinity. For the uniprocessor, cluster, and simpleC backends, the number of iterations can also be passed at the command line of the final executable (a.out -i 100).
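
The two ways of bounding a run described above can be sketched as follows. The file name foo.str and the count of 100 are placeholders, the default (uniprocessor) backend is assumed, and the ./ prefix on a.out is an assumption about the working directory.

```shell
# Hypothetical input file and iteration count.
src=foo.str
iters=100

# Option 1: fix the iteration count when compiling.
fixed_cmd="strc --iterations $iters $src"

# Option 2: compile without a limit, then pass the count to the binary.
plain_cmd="strc $src"
run_cmd="./a.out -i $iters"

echo "$fixed_cmd"
echo "$plain_cmd && $run_cmd"

# Execute option 2 only when strc is installed and the input file exists.
if command -v strc >/dev/null 2>&1 && [ -f "$src" ]; then
    $plain_cmd && $run_cmd
fi
```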

--linearreplacement
Domain-specific optimization: combine adjacent "linear" filters in the program into a single matrix multiplication operation wherever possible. Corresponds to the "linear" option in the PLDI'03 paper.

--statespace
In combination with --linearreplacement, performs combination and optimization of linear statespace filters as described in the CASES'05 paper.

--unroll <n>, -u<n>
Specify loop unrolling limit. The default value is 0.

Options specific to Uniprocessor and Cluster backends

--cacheopt
Performs cache optimizations as described in the LCTES'05 paper.

--l1d <n>
Sets the L1 data cache size (in KB) for cache optimizations. The default is 8 KB.

--l1i <n>
Sets the L1 instruction cache size (in KB) for cache optimizations. The default is 8 KB.

--l2 <n>
Sets the L2 cache size (in KB) for cache optimizations (we assume a unified L2 cache). The default is 256 KB.
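
A minimal sketch of combining the cache options above into one command line. The cache sizes and the file name are made-up values for illustration; use the sizes of your own target machine.

```shell
# Hypothetical input file and cache sizes, all in KB.
src=foo.str
l1d=32     # L1 data cache
l1i=32     # L1 instruction cache
l2=1024    # unified L2 cache

# Enable cache optimizations and override the default cache sizes.
cmd="strc --cacheopt --l1d $l1d --l1i $l1i --l2 $l2 $src"
echo "$cmd"

# Execute only when strc is installed and the input file exists.
if command -v strc >/dev/null 2>&1 && [ -f "$src" ]; then
    $cmd
fi
```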

--linearpartition, -L
Domain-specific optimization: perform linear replacement and frequency replacement selectively, based on an estimate of where each is most beneficial. Corresponds to the "autosel" option in the PLDI'03 paper. (Requires an FFTW installation.)

Options specific to Raw backend

--asciifileio
Specifies that FileReaders and FileWriters should use ASCII format rather than binary. Also works under the --simpleC backend.

--numbers <n>, -N<n>
Instrument code to gather performance statistics on simulated code over <n> steady-state cycles. The results are placed in results.out in the current directory.

--ssoutputs <n>
For applications containing a dynamic I/O rate, this option indicates how many outputs should count as one steady state when gathering numbers (with --numbers).

--rawcol <m>, -c<m>
Specify number of columns in Raw processor; --raw specifies number of rows.
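
The sketch below shows one way the Raw options above might be combined to target a non-square layout and collect statistics. The 4-by-8 layout, the cycle count of 10, and the file name are placeholders chosen for illustration.

```shell
# Hypothetical input file and Raw layout.
src=foo.str
rows=4      # --raw gives the number of rows when --rawcol is present
cols=8      # --rawcol gives the number of columns
cycles=10   # steady-state cycles to measure with --numbers

cmd="strc --raw $rows --rawcol $cols --numbers $cycles $src"
echo "$cmd"

# Execute only when strc is installed and the input file exists.
if command -v strc >/dev/null 2>&1 && [ -f "$src" ]; then
    $cmd
fi
```

According to the --numbers description, the statistics should then appear in results.out in the current directory.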

--wbs
When laying out communication instructions, use the work-based simulator to estimate exactly when items will be produced and consumed. This improves the scheduling of routing instructions.