TALK: Multi-Layer Dependability: From Micro-Architecture to Application Level

  • Speaker:

    Prof. Jörg Henkel
    Karlsruhe Institute of Technology (KIT), Germany

  • Time:

    June 1st - 5th, 2014

  • Location:

    DAC 2014 Special Session "Embedded Resiliency: Approaches for the Next Decade", San Francisco, USA

The dependability of a hardware-software system depends largely on the interplay of the abstraction levels involved. If the software layers knew about the dependability-enhancing techniques deployed at the hardware layers and, conversely, the hardware layers knew about the techniques deployed at the software layers, both sides could adapt efficiently by specifically targeting dependability gaps without duplicating effort. This talk starts with techniques at the micro-architectural level and discusses how errors manifest according to spatial and temporal properties in the micro-architecture and its environment. It then explains how errors propagate to the (hardware-dependent) software layer, where they manifest as bit flips. Reliability-enhancing software transformations are presented, and it is shown how they can decrease the negative impact an error would otherwise have at the application software level. As one of the results, the talk presents vulnerability metrics that jointly cover hardware and software layers.
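As a rough illustration of the idea that hardware errors surface in software as bit flips, and that a software transformation can mask them, the following sketch injects a single-bit flip into a stored value and compares an unprotected computation with a triplicated one that uses majority voting. This is a generic textbook-style example, not the method or the metrics presented in the talk; all function names (`flip_bit`, `majority`, `run_unprotected`, `run_triplicated`, `corrupted_fraction`) and the 32-bit word width are assumptions made for illustration.

```python
WIDTH = 32  # assumed word width, chosen only for this illustration


def flip_bit(value: int, bit: int, width: int = WIDTH) -> int:
    """Return value with one bit inverted, modelling a single-bit error."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def majority(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies."""
    return (a & b) | (a & c) | (b & c)


def run_unprotected(x: int, fault_bit: int) -> int:
    """Double x after a bit flip has corrupted its only stored copy."""
    stored = flip_bit(x, fault_bit)
    return stored * 2


def run_triplicated(x: int, fault_bit: int) -> int:
    """Double x when it is kept in three copies and one copy is corrupted."""
    copies = [x, x, x]
    copies[0] = flip_bit(copies[0], fault_bit)  # fault hits one copy only
    return majority(*copies) * 2


def corrupted_fraction(run, x: int, width: int = WIDTH) -> float:
    """Fraction of single-bit fault positions that change the result.

    A deliberately simple stand-in for a vulnerability measure; it is
    hypothetical and not one of the metrics from the talk.
    """
    golden = x * 2
    bad = sum(1 for bit in range(width) if run(x, bit) != golden)
    return bad / width
```

Calling `corrupted_fraction(run_unprotected, 21)` and `corrupted_fraction(run_triplicated, 21)` compares how often a single flipped bit reaches the output in each variant. The sketch assumes that at most one copy is hit by a fault; the voting scheme does not help if two copies are corrupted in the same bit position.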

Special Session at DAC 2014 "Embedded Resiliency: Approaches for the Next Decade"
Organizer: Jörg Henkel

Dependability has become a major design constraint within the last decade. Whereas early approaches were highly specialized and mostly focused on one abstraction layer of the design stack, it soon became apparent that cross-layer approaches are in many respects more advantageous, since they involve more than a single layer and its specific means to increase dependability. A study of state-of-the-art cross-layer dependability approaches shows that "cross-layer" in the vast majority of them denotes just two adjacent abstraction layers. This special session goes a step further to "multi-layer" dependability approaches, where in the ideal case major parts of the hardware layer AND the software layer are jointly considered. The aim is a comprehensive coverage of architectures and design methodologies that maximizes a system's dependability and resiliency at reasonable effort, thus enabling scalability. This special session is initiated by two major international research consortia, namely the NSF (USA) “Variability” Expedition and the DFG SPP1500 (Germany) “Dependable Embedded Systems”, in conjunction with industrial support.

The session starts with an industrial perspective on, and constraints for, the reliability of embedded processors by Vikas Chandra, with insights into the wear-out problem and in-situ monitors. It is followed by the talk of Jörg Henkel, which covers multiple abstraction layers from software down to the micro-architectural level and closes the gap between hardware and software resiliency techniques that have traditionally been treated separately. Nikil Dutt presents the importance of the reliability of the memory sub-system at various abstraction levels, including OS-level exploitation of DRAM power variation to save energy and the incorporation of heterogeneous memory organizations to increase reliability through exploitation of application semantics across multiple abstraction levels (applications, compilers, and run-time systems). Finally, the talk by Ulf Schlichtmann bridges the gap to the gate and circuit level and shows how to implement technology abstraction in order to ensure the necessary degree of technology independence.

In summary, the special session covers all abstraction layers of an embedded system. Each of the four talks emphasizes the interaction of multiple layers and how to adapt and interface them to provide a high degree of efficiency and scalability for future resilient embedded systems.

Topics in this session are: “Monitoring Reliability in Embedded Processors – A Multi-layer View”, “Multi-Layer Dependability: From Microarchitecture to Application Level”, “Multi-Layer Memory Resiliency”, and “Workload- and Instruction-Aware Timing Analysis – The missing Link between Technology and System-level Resilience”.