|
Please follow the guidelines below when reporting results.
Benchmarks may be compiled with the
best available compiler.
Benchmarks may be rewritten in any language (e.g., C, Java,
StreamIt, Brook,
Verilog) provided the new code adheres to the original
algorithms. For example, a recoding of bmm must
use the blocked matrix multiply algorithm. The benchmarks
may even be hand-coded in assembly to suit a particular
architecture.
The
Versatility may be computed using wall clock times
(preferred) or number of cycles, as long as the method used
is clearly specified. If using a cycle counter, you will find
two timing markers in the code, /*** VERSABENCH START ***/ and
a corresponding end marker, that respectively indicate where
cycle-counting should begin and end.
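As an illustration of the wall-clock option, the following is a minimal sketch in C, assuming a C11 toolchain that provides timespec_get. The names wall_seconds, kernel, and time_kernel are hypothetical and not part of the suite; kernel merely stands in for the region between the two timing markers.

```c
#include <time.h>

/* Current wall-clock time in seconds, using C11 timespec_get. */
double wall_seconds(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Hypothetical stand-in for the code between the timing markers:
 * sums the integers 0 .. n-1. */
long kernel(long n)
{
    long sum = 0;
    for (long i = 0; i < n; i++)
        sum += i;
    return sum;
}

/* Time one run of the kernel. The kernel's result is stored through
 * *result so the work cannot be discarded by the compiler. */
double time_kernel(long n, long *result)
{
    double start = wall_seconds();  /* where the start marker sits */
    *result = kernel(n);
    double end = wall_seconds();    /* where the end marker sits */
    return end - start;
}
```

A cycle-based measurement would have the same shape, with the two wall_seconds calls replaced by reads of the platform's cycle counter.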
In the case of the SERVER benchmarks, report the total time to run
twenty-four (24) instances of the benchmark.
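The total-time rule for the SERVER benchmarks can be sketched as follows. This is an assumption-laden illustration: it runs the instances back to back on one thread, whereas the guideline does not say whether instances run sequentially or concurrently. The names SERVER_INSTANCES, instance_fn, time_server_run, and count_instance are hypothetical.

```c
#include <time.h>

#define SERVER_INSTANCES 24  /* instance count from the guideline */

/* Hypothetical signature for one benchmark instance. */
typedef void (*instance_fn)(int instance_id);

/* Run all instances and return the total elapsed wall-clock seconds.
 * Assumption: instances run sequentially on the calling thread. */
double time_server_run(instance_fn run_instance)
{
    struct timespec start, end;
    timespec_get(&start, TIME_UTC);
    for (int i = 0; i < SERVER_INSTANCES; i++)
        run_instance(i);
    timespec_get(&end, TIME_UTC);
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Placeholder instance that only counts how often it was invoked. */
int instances_run = 0;

void count_instance(int instance_id)
{
    (void)instance_id;
    instances_run++;
}
```

The reported figure is the single total returned by time_server_run, not a per-instance average.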
Real architectures are preferred, but simulators may be
used.
Although the
modeling of real I/O is encouraged, we recognize the
difficulty in doing so in a prototype environment. We
suggest initializing a region of external DRAM with I/O
data, and flushing caches so they are not primed prior to
the measurement process. Simulation environments often
ignore the cost of system calls, treating them as magical
instructions that atomically update memory without
polluting the caches. Alternatively, a deionizer
may be used to idealize I/O. We went to great lengths to
minimize the effects of I/O in the VersaBench suite.
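The suggestion to stage I/O data in memory and start with unprimed caches might look like the sketch below. It is a portable approximation only: standard C has no cache-flush primitive, so evict_caches streams through a scratch buffer, which evicts earlier data only if the buffer is larger than the last-level cache. A simulator or platform-specific flush would be more reliable. The names stage_input and evict_caches are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Copy pre-recorded input into a heap region that stands in for
 * external DRAM. Returns NULL if allocation fails. */
unsigned char *stage_input(const unsigned char *data, size_t len)
{
    unsigned char *region = malloc(len);
    if (region != NULL)
        memcpy(region, data, len);
    return region;
}

/* Best-effort eviction: write, then read, a scratch buffer.
 * Assumption: scratch_bytes exceeds the last-level cache size.
 * Returns the sum of the bytes read (0 on allocation failure) so the
 * loops cannot be optimized away. */
unsigned long evict_caches(size_t scratch_bytes)
{
    volatile unsigned char *scratch = malloc(scratch_bytes);
    unsigned long sum = 0;
    if (scratch == NULL)
        return 0;
    for (size_t i = 0; i < scratch_bytes; i++)
        scratch[i] = (unsigned char)(i & 0xFF);
    for (size_t i = 0; i < scratch_bytes; i++)
        sum += scratch[i];
    free((void *)scratch);
    return sum;
}
```

In this sketch, stage_input would be called first, then evict_caches, and only then would the timed region begin.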
For some benchmarks, we provide
multiple inputs; please use the one designated as the
reference input (ref) for timing
measurements.
The evaluation process affords a lot of flexibility in how the
benchmarks may be coded and executed. However, when reporting
results, the details of the methodology that is adopted
must be clearly described. Some common parameters
(following the guidelines above) include:
whether a simulator is used,
the language and compiler used in the implementation (and whether any hand-coding is done),
whether wall clock times are used or cycles are measured,
the clock speeds assumed for the architecture,
whether I/O is accurately simulated or the I/O costs are ignored,
and the speeds assumed for caches and external memory, and whether the external memory is faithfully modeled.
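One way to keep these parameters together with a result is a small record that is printed alongside the timings. The sketch below is only a suggestion; the struct, its field names, and format_methodology are hypothetical and not a format the suite prescribes.

```c
#include <stdbool.h>
#include <stdio.h>

enum timing_method { TIMING_WALL_CLOCK, TIMING_CYCLES };

/* Hypothetical record of the parameters listed above. */
struct methodology {
    bool simulator_used;
    const char *language;
    const char *compiler;
    bool hand_coded;
    enum timing_method timing;
    double clock_mhz;              /* assumed architecture clock speed */
    bool io_modeled;               /* false if I/O costs are ignored */
    const char *memory_speeds;     /* free-form cache/DRAM speed notes */
    bool external_memory_modeled;
};

static const char *yes_no(bool b)
{
    return b ? "yes" : "no";
}

/* Write a plain-text summary into buf. Returns the snprintf result:
 * the number of characters the full summary needs, excluding the
 * terminating null byte. */
int format_methodology(const struct methodology *m, char *buf, size_t size)
{
    return snprintf(buf, size,
                    "simulator used: %s\n"
                    "language: %s\n"
                    "compiler: %s\n"
                    "hand-coded: %s\n"
                    "timing: %s\n"
                    "clock (MHz): %.1f\n"
                    "I/O modeled: %s\n"
                    "memory speeds: %s\n"
                    "external memory modeled: %s\n",
                    yes_no(m->simulator_used), m->language, m->compiler,
                    yes_no(m->hand_coded),
                    m->timing == TIMING_CYCLES ? "cycles" : "wall clock",
                    m->clock_mhz, yes_no(m->io_modeled), m->memory_speeds,
                    yes_no(m->external_memory_modeled));
}
```

Filling in every field before publishing a number makes it harder to omit one of the details that readers need to compare results.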
|