Built-in Self-Test/Repair of a Xentium Based Many-Core Architecture

Type: Master's Assignment
Contacts: Xiao Zhang, Hans Kerkhoff
Project: CRISP
Location: CAES, University of Twente

Introduction

Without the yield problems of the PS3's Cell processor in its early days, or the “three red lights” (ring of death) problem of the Xbox 360, the new generation of gaming consoles could have arrived sooner and served us better, whether you are a fan of Microsoft or Sony.

Both the Cell processor of the PS3 and the Xenon processor of the Xbox 360 were based on the then-latest multi-core architectures, and the problems they suffered prompt the following question: is it possible to keep a multi-core processor running when one or more cores have become faulty, or even physically defective, for whatever reason?

This is one of the questions to be answered in the CRISP (Cutting-Edge Reconfigurable ICs for Stream Processing) project. The CRISP project aims to explore the optimal utilization, efficient programming and dependability of a reconfigurable many-core architecture. The basic “core” in this architecture is the “Xentium” tile processor from Recore Systems. One potential application will use an SoC with 54 Xentium tile processors together as the powerhouse for stream signal processing. Because of the special tasks this SoC needs to perform, ensuring its correctness is of key importance.

In the dependability theme of the CRISP project, we propose to include a dedicated infrastructural IP in the system to monitor the “correctness” of each individual tile processor at run-time (Built-in Self-Test). When a “core” is tested and found faulty, it is marked as an unusable resource and excluded from the system by the run-time mapping software (Built-in Self-Repair). The Quality of Service (QoS) of the system will inevitably go down due to the loss of computing resources, but the system can still be considered fault-free as long as its QoS does not drop below the threshold.
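The self-repair idea described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the class name `TilePool`, the definition of QoS as the fraction of usable tiles, and the threshold value of 0.75 are all hypothetical and are not taken from the CRISP project; only the tile count of 54 comes from the text.

```python
from dataclasses import dataclass, field


@dataclass
class TilePool:
    """Hypothetical model of the tile resources seen by the mapping software."""
    total: int = 54                 # number of Xentium tiles (from the text)
    qos_threshold: float = 0.75     # assumed threshold, for illustration only
    faulty: set = field(default_factory=set)

    def mark_faulty(self, tile_id: int) -> None:
        """Record a tile that failed its self-test (Built-in Self-Repair step)."""
        if not 0 <= tile_id < self.total:
            raise ValueError(f"tile id {tile_id} out of range")
        self.faulty.add(tile_id)

    def usable_tiles(self) -> list:
        """Tiles the run-time mapper may still assign work to."""
        return [t for t in range(self.total) if t not in self.faulty]

    def qos(self) -> float:
        """Assumed QoS metric: fraction of tiles that are still usable."""
        return len(self.usable_tiles()) / self.total

    def is_operational(self) -> bool:
        """The system counts as fault-free while QoS stays at or above the threshold."""
        return self.qos() >= self.qos_threshold


pool = TilePool()
pool.mark_faulty(3)
pool.mark_faulty(7)
print(len(pool.usable_tiles()), pool.is_operational())
```

In a real system the QoS metric would depend on the application mapped onto the tiles; the simple ratio here only serves to show how excluding faulty tiles and checking a threshold fit together.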

The infrastructural IP will function as the Dependability Manager in the system. It will coordinate the activity of sub-components such as the test pattern generator, the test response evaluator, the network-on-chip interfaces and the IEEE Std. 1500 compatible wrapper cells to carry out the run-time test task.
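To make the coordinating role of the Dependability Manager more concrete, the following behavioural sketch steps through one test session as a small state machine. The state names, the function `run_test_session`, and the callables `tile_model` and `golden_model` are hypothetical illustrations, not the project's actual design; the real controller would be written in VHDL and would drive the available sub-components over their own interfaces.

```python
from enum import Enum, auto


class State(Enum):
    """Assumed states of one test session (illustrative only)."""
    IDLE = auto()
    CONFIGURE_WRAPPER = auto()
    APPLY_PATTERNS = auto()
    EVALUATE = auto()
    REPORT = auto()


def run_test_session(tile_id, patterns, tile_model, golden_model):
    """Walk through one test session and return (tile_id, passed, trace).

    tile_model   -- callable standing in for the tile under test
    golden_model -- callable producing the expected fault-free response
    """
    state = State.IDLE
    trace = []
    responses = []
    passed = None
    while True:
        trace.append(state)
        if state is State.IDLE:
            state = State.CONFIGURE_WRAPPER
        elif state is State.CONFIGURE_WRAPPER:
            # In hardware, the wrapper would be switched into test mode here.
            state = State.APPLY_PATTERNS
        elif state is State.APPLY_PATTERNS:
            # Stand-in for the test pattern generator feeding the tile.
            responses = [tile_model(p) for p in patterns]
            state = State.EVALUATE
        elif state is State.EVALUATE:
            # Stand-in for the test response evaluator.
            passed = responses == [golden_model(p) for p in patterns]
            state = State.REPORT
        else:  # State.REPORT
            return tile_id, passed, trace


def golden(p):
    """Toy fault-free tile: inverts an 8-bit pattern."""
    return p ^ 0xFF


def stuck_at_one(p):
    """Toy faulty tile: bit 0 of the response is stuck at 1."""
    return golden(p) | 0x01


patterns = [0x00, 0x55, 0xAA, 0xFF]
print(run_test_session(5, patterns, golden, golden)[1])
print(run_test_session(5, patterns, stuck_at_one, golden)[1])
```

A software model of this kind can be useful for agreeing on the state sequence and the hand-offs between blocks before committing to an RTL implementation.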

Assignment

You will need to design the central part of the dependability manager, a complex finite state machine, to coordinate the activities of the sub-components (which are already available). You will also need to develop the necessary protocols to ensure smooth communication between these functional blocks.

The following research questions will be raised for this assignment:

  • How to make an efficient FSM with a complete set of instructions to perform the necessary tasks?
  • How to use parallelism to maximize the test pattern input rate by using the parallel input port of the IEEE Std. 1500 wrapper?
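The second question can be framed with a rough cycle count. IEEE Std. 1500 wrappers have a mandatory one-bit serial port and an optional parallel port; the sketch below assumes, purely for illustration, that the scan bits of a pattern can be split into balanced chains, one per parallel input bit, and that one bit per chain is shifted each clock cycle. The function names and the example numbers are hypothetical.

```python
def shift_cycles(scan_bits: int, port_width: int = 1) -> int:
    """Clock cycles needed to shift one pattern in, assuming balanced chains.

    port_width = 1 models the serial port; larger values model a parallel port.
    """
    if scan_bits < 0 or port_width < 1:
        raise ValueError("scan_bits must be >= 0 and port_width >= 1")
    # Ceiling division: the longest chain determines the shift time.
    return (scan_bits + port_width - 1) // port_width


def speedup(scan_bits: int, port_width: int) -> float:
    """Ratio of serial shift time to parallel shift time for one pattern."""
    return shift_cycles(scan_bits, 1) / shift_cycles(scan_bits, port_width)


# Hypothetical example: 1000 scan bits, 8-bit parallel port.
print(shift_cycles(1000, 1), shift_cycles(1000, 8), speedup(1000, 8))
```

Under these assumptions the pattern input rate scales with the port width only when the chains are well balanced, which is one reason the research question is not trivial.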

You should be familiar with VHDL in order to make the design, and you will have the opportunity to synthesize and implement it on a sophisticated FPGA platform to bring your design to life.
