## Reconfigurable Multiprocessor with Self-optimizing Self-assembling and Self-restoring Micro-architecture ##

P.W. Chun, V. Kirischian, S. Zhelnakov and L. Kirischian

Embedded Reconfigurable Systems Lab (ERSL), Ryerson University, Toronto, Canada

<a href="mailto:pchun@ee.ryerson.ca">pchun@ee.ryerson.ca</a> <a href="mailto:lkirisch@ee.ryerson.ca">lkirisch@ee.ryerson.ca</a>

We present the major developments and implementations of a new class of high-performance parallel data-stream processing platforms with dynamically reconfigurable micro-architecture: the Dynamically Reconfigurable Parallel Stream Processor (DRPSP) with task-optimized Application Specific Virtual Processors (IP-cores). This multi-level computing architecture incorporates an automated architectural synthesis system, an embedded real-time hardware operating system and a run-time reconfigurable computing platform based on partially reconfigurable FPGAs. Together, these three components allow: i) self-optimization of data-stream processing cores for the task algorithm and data structure, ii) dynamic scheduling and binding of system resources and iii) run-time self-assembly of task-optimized stream processing cores inside partially reconfigurable FPGA(s). The system also allows self-restoration by self-replicating faulty processing cores into a safe region when hardware faults are caused by radiation effects (SEE: Single Event Effects) or wafer corruption. In the implementation, a static routing structure called the Virtual Bus is integrated into the High-Level Synthesis stage to create the pseudo I/O interfaces for Virtual Hardware Components (VHCs). The VHCs are basic "LEGO"-type sub-cores that can be assembled, using homogeneous FPGA resources, into task-optimized processing cores called Application Specific Virtual Processors (ASVPs). We present a systematic way of implementing the ASVP micro-architecture in the DRPSP platform.
The construction of service hardware, including a Hardware Operating System (HOS), enables run-time scheduling and binding of FPGA resources for each ASVP to be activated, without interrupting the data-stream processing running on other ASVPs in the rest of the FPGA. With run-time reconfiguration, we show that the configuration overhead is reduced by a factor of thousands (e.g. the XCV2000E takes 24.3 ms to configure the whole device, while one frame unit can take $2.4\,\mu s$). However, the actual benefit can only be determined once the hardware configuration speed and the size of the cores to be reconfigured are known. The layered ASVP idea is expanded to the network level. The construction of the DRPSP network is proposed based on Xilinx Virtex2Pro family devices, which are equipped with multi-gigabit transceivers (Rocket I/O). Internally assembled Network Processor Units (NPUs, also IP-cores) based on the multi-gigabit transceivers provide much higher bandwidth and much greater flexibility than an external NPU (ASIC) interfaced at the board level. An ASVP optimized for network data-stream processing and incorporated on-chip with the NPU creates an Adaptive Protocol Network Processor (APNP) inside the DRPSP. Another aspect of the DRPSP is its ability to perform run-time restoration by self-replication of a faulty VHC in the ASVP-core. Procedures for self-restoration with and without performance degradation were also implemented. To demonstrate the feasibility of the DRPSP and to test the above concepts, a prototype of the DRPSP was designed and implemented on a Xilinx Virtex2Pro.
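
The dependence of the configuration-overhead benefit on core size can be illustrated with a short calculation. The sketch below uses only the two figures quoted above (24.3 ms for the full XCV2000E and 2.4 µs for one frame unit) and assumes, purely for illustration, that partial reconfiguration time grows linearly with the number of frames a core occupies. The function names are hypothetical and not part of the DRPSP tooling.

```python
# Figures quoted in the text for the XCV2000E.
FULL_CONFIG_S = 24.3e-3   # time to configure the whole device, in seconds
FRAME_CONFIG_S = 2.4e-6   # time to configure one frame unit, in seconds


def partial_config_time(frames: int) -> float:
    """Estimated time to reconfigure a core spanning `frames` frame units.

    Assumption (not from the source): time scales linearly with frame count.
    """
    if frames <= 0:
        raise ValueError("frames must be positive")
    return frames * FRAME_CONFIG_S


def overhead_reduction(frames: int) -> float:
    """Ratio of full-device configuration time to partial reconfiguration time."""
    return FULL_CONFIG_S / partial_config_time(frames)


def break_even_frames() -> int:
    """Frame count at which partial reconfiguration costs as much as a full one."""
    return round(FULL_CONFIG_S / FRAME_CONFIG_S)


if __name__ == "__main__":
    for frames in (1, 10, 100, 1000):
        print(f"{frames:5d} frames: reduction factor {overhead_reduction(frames):10.2f}")
    print("break-even at", break_even_frames(), "frames")
```

Under this linear assumption, a single-frame core sees a reduction of roughly four orders of magnitude, and the advantage shrinks in proportion to the number of frames the core occupies, which is why the core size must be known before the benefit can be stated.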