Current out-of-order logic is optimized for performance and does not take advantage of energy-saving techniques. Key out-of-order logic structures, including the register renaming logic and superscalar issue logic, will be examined for correlation between state and asserted control signals. Logical structures will also be modified to increase the correlation between these signals. By exploiting correlation, predictive or memoization techniques could be implemented in the issue logic to eliminate redundant work and increase energy efficiency. Time permitting, the benefit of these techniques will be analyzed.
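As a rough illustration of the memoization idea, the sketch below caches the dependence-check signals computed for a group of instructions being renamed together, so a repeated group reuses the stored result instead of recomputing it. The instruction encoding, the names `dependency_signals` and `MemoizedDependencyCheck`, and the use of an unbounded table are all assumptions made for this example; the abstract does not specify which structures or signals would be memoized.

```python
from typing import Dict, Tuple

# Hypothetical encoding: (dest, src1, src2) architectural register numbers.
Instr = Tuple[int, int, int]
Signals = Tuple[Tuple[int, int], ...]


def dependency_signals(group: Tuple[Instr, ...]) -> Signals:
    """For each source operand, return the index of the most recent earlier
    instruction in the group that writes it, or -1 if there is none."""
    signals = []
    for i, (_, src1, src2) in enumerate(group):
        row = []
        for src in (src1, src2):
            producer = -1
            for j in range(i):
                if group[j][0] == src:
                    producer = j  # a later writer overrides an earlier one
            row.append(producer)
        signals.append(tuple(row))
    return tuple(signals)


class MemoizedDependencyCheck:
    """Caches signals per instruction group (assumed unbounded table)."""

    def __init__(self) -> None:
        self.table: Dict[Tuple[Instr, ...], Signals] = {}
        self.hits = 0
        self.misses = 0

    def lookup(self, group) -> Signals:
        key = tuple(group)
        if key in self.table:
            self.hits += 1  # redundant comparison work avoided
            return self.table[key]
        self.misses += 1
        result = dependency_signals(key)
        self.table[key] = result
        return result
```

In this model the hit count stands in for comparator activity that could be skipped. A hardware table would be small and finite, and its own lookup energy would have to be weighed against the savings.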
As PDA technology becomes more prevalent, certain performance attributes become increasingly important and limiting. Especially important is a PDA's ability to perform a core set of basic tasks well. The other major consideration, due mainly to battery life limitations, is power consumption. Our aim is to author a generic suite of benchmarks that we can apply across multiple current PDA architectures. These benchmarks will specifically target the two areas of concern outlined above. By analyzing the ability of various PDA architectures to perform the same set of tasks, along with the power each consumes in doing so, we can begin to draw conclusions about weaknesses and limitations in current PDA technology.
Due to the constraints of VLSI scaling, future processor and system-on-chip designs will by necessity incorporate on-chip communication networks. In this project, we plan to investigate protocols and signalling technologies in the context of future on-chip multiprocessors in the 50nm regime. In this regime, interconnect delay becomes a major challenge and needs to be taken into account at all levels. This will require predicting scaling trends for devices and interconnect, and then designing protocols and circuits based on these models. Our main design goal is minimum energy-delay product.
The embedded processor market is increasing at a rapid rate. Some of the key goals for embedded processors are low cost, small area, and low power. As embedded systems become more complex, code size is increasing, causing instruction memory to occupy more area. Numerous research studies have been conducted in the field of code compression, with the primary goal of reducing the area required by memory. We intend to examine code compression from a low-power perspective. In our project, we will study existing compression algorithms, modify our compiler to generate compression-friendly code, and develop an efficient decoding scheme.
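To make the compression and decoding steps concrete, here is a minimal sketch of dictionary-based code compression, in which frequently occurring instruction words are replaced by short dictionary indices. Dictionary compression is used here only as a familiar example; the abstract does not say which algorithms the project will study. The function names, the one-bit tag per entry, and the 32-bit instruction width are assumptions.

```python
from collections import Counter
from typing import List, Tuple

Entry = Tuple[str, int]  # ("idx", dictionary index) or ("raw", instruction word)


def build_dictionary(instructions: List[int], size: int) -> List[int]:
    """Pick the `size` most frequent instruction words."""
    return [word for word, _ in Counter(instructions).most_common(size)]


def compress(instructions: List[int], dictionary: List[int]) -> List[Entry]:
    """Replace dictionary words with indices; leave other words uncompressed."""
    index = {word: i for i, word in enumerate(dictionary)}
    return [("idx", index[w]) if w in index else ("raw", w) for w in instructions]


def decompress(stream: List[Entry], dictionary: List[int]) -> List[int]:
    """Decoder: one table lookup per compressed entry."""
    return [dictionary[value] if tag == "idx" else value for tag, value in stream]


def compressed_bits(stream: List[Entry], index_bits: int, word_bits: int = 32) -> int:
    """Size estimate, assuming a 1-bit tag in front of every entry."""
    return sum(1 + (index_bits if tag == "idx" else word_bits) for tag, _ in stream)
```

A compiler that emits compression-friendly code would, in this model, try to increase the number of repeated instruction words so that more entries fall into the dictionary. From a low-power perspective, the relevant cost is not just the stored bits but also the energy of each dictionary lookup, which this sketch does not model.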
This work presents decoupled control flow, the next step in decoupled architectures. In a decoupled control/access/execute (CAE) machine, a control processor runs ahead and feeds directives to the memory access processor and the main execution processor; the directives take the form of commands to execute basic blocks. The execution engine is then responsible only for processing streams of valid instructions and data values, obtained without the overhead of speculation. This is a fundamental departure from the model in which an execution engine must actively fetch instructions and data values, or speculate to hide latency. Freeing the execution engine from fetching and speculation is what enables future processors to reach new levels of performance.
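The division of labor in a CAE machine can be sketched as three routines connected by queues. This is a toy software model under stated assumptions, not the architecture proposed in the work: the directive names (`load_block`, `accumulate_block`, `halt_block`), the queue layout, and the array-summing workload are all hypothetical, and the three processors run one after another here rather than concurrently.

```python
from collections import deque
from typing import Deque, List, Tuple


def control_processor(n: int, access_q: Deque[Tuple[str, int]],
                      execute_q: Deque[str]) -> None:
    """Resolves the loop's control flow ahead of time and issues
    basic-block directives to the other two processors."""
    for i in range(n):
        access_q.append(("load_block", i))    # which address to fetch
        execute_q.append("accumulate_block")  # which block to execute
    execute_q.append("halt_block")


def access_processor(memory: List[int], access_q: Deque[Tuple[str, int]],
                     data_q: Deque[int]) -> None:
    """Fetches operands named by the directives and streams them onward."""
    while access_q:
        _, addr = access_q.popleft()
        data_q.append(memory[addr])


def execute_processor(execute_q: Deque[str], data_q: Deque[int]) -> int:
    """Consumes only commanded blocks and ready data: no fetching of its
    own and no speculation."""
    total = 0
    while execute_q:
        block = execute_q.popleft()
        if block == "halt_block":
            break
        total += data_q.popleft()
    return total


def run(memory: List[int]) -> int:
    access_q: Deque[Tuple[str, int]] = deque()
    execute_q: Deque[str] = deque()
    data_q: Deque[int] = deque()
    control_processor(len(memory), access_q, execute_q)
    access_processor(memory, access_q, data_q)
    return execute_processor(execute_q, data_q)
```

For example, `run([3, 1, 4, 1, 5])` sums the five values through the queues. The point of the model is that the execute routine never computes an address or tests a branch condition; every block it runs was already determined to be on the valid path by the control routine.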