Benchmark | Description | Lines of code | # of filters in the program | # of pipelines in the program | # of splitjoins in the program | # of feedbackloops in the program | Number of filters in the expanded graph
---|---|---|---|---|---|---|---
FIR | 64 tap FIR | 125 | 5 | 1 | 0 | 0 | 132

Benchmark | 250 MHz RAW, StreamIt on 16 tiles: utilization | 250 MHz RAW, StreamIt on 16 tiles: # of tiles used | 250 MHz RAW, StreamIt on 16 tiles: MFLOPS | 250 MHz RAW, StreamIt on 16 tiles: throughput (per 10^5 cycles) | 250 MHz RAW, C on a single tile: throughput (per 10^5 cycles) | C on a 2.2 GHz Intel Pentium IV: throughput (per 10^5 cycles)
---|---|---|---|---|---|---
FIR | 86% | 14 | 815 | 1188.1 | 293.5 | 445.6

Utilization numbers reported were 6606 useful cycles / 7712 total cycles = 0.85658714.
Flops reported by RAW's cycle-accurate simulator are 1728 for the first iteration and 1565 for the second (average = 1646.5), which is (1646.5 flops / 505 cycles) * 250 million cycles/second = 815.0990 MFLOPS.
The first iteration done at 0x18f2b (102187 cycles)
The second iteration done at 0x1910d (102669 cycles) (delta=482)
The third iteration done at 0x1931d (103197 cycles) (delta=528)
Based on these cycle counts, the iterations take 482 cycles (first delta) and 528 cycles (second delta), an average of 505 cycles per iteration.
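The delta arithmetic above can be checked with a short sketch. It only re-derives the numbers from the three hexadecimal cycle-counter values quoted in the notes; the variable names are illustrative and not taken from the original tooling.

```python
# Cycle-counter values at the end of each iteration, copied from the notes.
timestamps_hex = ["0x18f2b", "0x1910d", "0x1931d"]

# Convert the hex counter values to decimal cycle counts.
timestamps = [int(t, 16) for t in timestamps_hex]

# Cycles per iteration are the differences between consecutive timestamps.
deltas = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]

# Steady-state estimate: the mean of the observed deltas.
average = sum(deltas) / len(deltas)

print(timestamps)  # [102187, 102669, 103197]
print(deltas)      # [482, 528]
print(average)     # 505.0
```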
6 outputs every 505 cycles, normalized to 10^5 cycles results in a throughput of 6*(100000/505) = 1188.1188 outputs every 10^5 cycles.
6 outputs every 2726 cycles normalized to 10^5 cycles, 6*(100000/2726) = 220.103 outputs / 10^5 cycles.
flops reported are 516 flops, which is (516 flops/2726 cycles) * 250 million cycles/second = 47.322 MFLOPS.
Utilization numbers reported were 43359 useful cycles / 43616 total cycles = 0.99410767.
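The RAW figures in the notes above all use the same three formulas: utilization as useful over total cycles, MFLOPS as flops per cycle scaled by the 250 MHz clock, and throughput as outputs normalized to 10^5 cycles. The following is a minimal sketch of those formulas applied to both sets of raw counts. The function names and the `first_run` / `second_run` labels are hypothetical; the notes do not name the runs, although the first set matches the StreamIt-on-16-tiles column of the table.

```python
CLOCK_HZ = 250e6  # RAW clock rate stated in the table header


def utilization(useful_cycles, total_cycles):
    """Fraction of cycles that did useful work."""
    return useful_cycles / total_cycles


def mflops(flops, cycles, clock_hz=CLOCK_HZ):
    """Flops per cycle, scaled to millions of flops per second."""
    return flops / cycles * clock_hz / 1e6


def throughput_per_1e5(outputs, cycles):
    """Outputs produced per 10^5 cycles."""
    return outputs * 1e5 / cycles


# Raw counts copied from the notes (run labels are assumptions).
first_run = {"useful": 6606, "total": 7712, "flops": 1646.5,
             "cycles": 505, "outputs": 6}
second_run = {"useful": 43359, "total": 43616, "flops": 516,
              "cycles": 2726, "outputs": 6}

results = {}
for name, run in (("first_run", first_run), ("second_run", second_run)):
    results[name] = {
        "utilization": utilization(run["useful"], run["total"]),
        "mflops": mflops(run["flops"], run["cycles"]),
        "throughput": throughput_per_1e5(run["outputs"], run["cycles"]),
    }
    print(name, results[name])
```

Keeping the formulas in small functions makes it easy to plug in counts from a different simulator run without redoing the unit conversions by hand.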
Throughput on the Pentium IV: (10^8 iterations / 10.20 seconds) * (1 output / 1 iteration) * (1 second / 2.2*10^9 cycles) * 10^5 cycles = 445.6328 outputs / 10^5 cycles.
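The Pentium IV figure is a wall-clock measurement converted into the same per-10^5-cycles unit used for RAW. A small sketch of that unit conversion follows, using only the values quoted in the line above; the variable names are illustrative.

```python
iterations = 1e8           # iterations executed in the timed run
elapsed_seconds = 10.20    # measured wall-clock time
outputs_per_iteration = 1  # one output per iteration
clock_hz = 2.2e9           # 2.2 GHz Pentium IV

# Total cycles elapsed divided by iterations gives cycles per iteration.
cycles_per_iteration = elapsed_seconds * clock_hz / iterations

# Normalize to outputs per 10^5 cycles, the unit used in the table.
throughput = outputs_per_iteration * 1e5 / cycles_per_iteration

print(f"{cycles_per_iteration:.1f} cycles per iteration")
print(f"{throughput:.4f} outputs per 10^5 cycles")
```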