HARDWARE AND SOFTWARE TECHNIQUES FOR MULTICORE PROCESSOR ARCHITECTURES RELIABILITY ENHANCEMENT

Abstract: 

Multicore processor architectures constitute the computing paradigm that is expected to keep Moore's law valid by doubling the number of cores every 18-24 months. If this trend continues, hundred-core chips will provide unprecedented computing power and diverse functionality before the decade ends. Alas, massively parallel multicore processors built on scaled CMOS, as well as on beyond-CMOS nanoscale technologies, will suffer from severe reliability problems because of:

  • hard/permanent errors that escape manufacturing testing or appear in the field due to device aging, and
  • soft/transient errors due to radiation or to radioactive decay of package materials.

To salvage the huge investment behind a multicore processor chip, effective methodologies for hard and soft error detection, diagnosis, and repair/reconfiguration are among the top priorities in system design.

The project will lead to a holistic methodology for designing reliable multicore processor architectures, spanning all levels of abstraction: the circuit level, the microarchitecture and architecture level, and the system software and operating system level. Hardware-based, software-based, and hybrid error detection, diagnosis, and repair/reconfiguration techniques will be proposed for the different components of the multicore architecture: processors, memories, and interconnect.

The main drivers of the optimization decisions in our methodology will be the cost, power, and reliability requirements of the target application, so that a diverse set of solutions is provided for applications with different criticality and vulnerability.

The expected final project outcome is a fully autonomous fault-tolerant multicore architecture that includes self-testing, self-diagnosis, and self-configuration mechanisms. The long-term vision is to build self-immune and self-adaptive artificial systems by transferring some of the self-organization properties of living systems to artificial ones, with the aim of enhancing their dependability.

Project info

Acronym: HOLISTIC
Scientific Coordinator: Gizopoulos Dimitris
Research Team 2 Leader: Nikolopoulos Dimitrios
Research Team 3 Leader: Nikolos Dimitris
Research Team 4 Leader: Pekmestzi Kiamal
Research Team 5 Leader: Psarakis Mihalis

Stats

I.D.: 1103
MIS: 375270
Duration (months): 45
Budget: 600 000.00
Diavgeia: ΑΔΑ: Β41Ρ9-Ξ2Ξ
