On providing scalable self-healing adaptive fault-tolerance to RTR SoCs
2014 (English) In: Proceedings of ReConFigurable Computing and FPGAs (ReConFig), 2014 International Conference on, 2014, 1-6 p. Conference paper (Refereed)
The dependability of heterogeneous many-core FPGA-based systems is threatened by higher failure rates caused by disruptive scales of integration, increased design complexity, and radiation sensitivity. Triple-modular redundancy (TMR) and run-time reconfiguration (RTR) are traditional fault-tolerant (FT) techniques used to increase dependability. However, hardware redundancy is expensive, and most approaches have poor scalability, flexibility, and programmability. Therefore, innovative solutions are needed to reduce the cost of redundancy while still preserving acceptable levels of dependability. In this context, this paper presents the implementation of a self-healing adaptive fault-tolerant SoC that reuses RTR IP-cores to self-assemble different TMR schemes at run-time. The presented system demonstrates the feasibility of the Upset-Fault-Observer concept, which provides a run-time self-test and recovery strategy that delivers fault tolerance for functions accelerated in RTR cores while reducing the redundancy scalability cost by running periodic reconfigurable TMR scan-cycles. In addition, this paper experimentally evaluates the trade-offs of the implemented reconfigurable TMR schemes by characterizing important fault-tolerance metrics, i.e., recovery time (self-repair and self-replicate), detection latency, self-assembly latency, throughput reduction, and increase in physical resources.
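The TMR voting and periodic scan-cycle idea summarized in the abstract can be illustrated with a minimal software sketch. It is not the paper's implementation: the names `tmr_vote` and `scan_cycle` are hypothetical, each RTR core is modelled as a plain Python callable, and the sketch assumes that two spare cores are temporarily loaded with a fault-free copy (`golden`) of the function under test so that a three-way vote can be formed for one core at a time.

```python
from collections import Counter
from typing import Callable, List, Optional, Sequence, Tuple

Core = Callable[[int], int]  # assumption: a core maps one input word to one output word


def tmr_vote(outputs: Sequence[int]) -> Tuple[Optional[int], List[int]]:
    """Majority-vote three replica outputs.

    Returns the voted value (None if all three disagree) and the
    indices of the replicas that disagree with the majority.
    """
    if len(outputs) != 3:
        raise ValueError("TMR needs exactly three replica outputs")
    value, count = Counter(outputs).most_common(1)[0]
    if count < 2:
        return None, [0, 1, 2]
    return value, [i for i, out in enumerate(outputs) if out != value]


def scan_cycle(cores: Sequence[Core], golden: Core, test_input: int) -> List[int]:
    """Visit every core once and report the indices of mismatching cores.

    Hypothetical model: for each core, two spare replicas running the
    fault-free function `golden` are paired with it to form a temporary
    TMR group; replica 0 is always the core under test.
    """
    faulty = []
    for index, core in enumerate(cores):
        outputs = [core(test_input), golden(test_input), golden(test_input)]
        _, disagreeing = tmr_vote(outputs)
        if 0 in disagreeing:
            faulty.append(index)
    return faulty
```

In this model only two spare replicas are shared among all cores, rather than triplicating every core permanently, which is the kind of redundancy saving the abstract attributes to scan-cycles. A core flagged by `scan_cycle` would then be a candidate for recovery by reconfiguration.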
Place, publisher, year, edition, pages
2014. 1-6 p.
Fault tolerant systems, Hardware, Redundancy, Software, System-on-chip, Self-healing, Adaptive-computer-systems, FPGA, Partial and run-time reconfiguration, Space applications, Dependability
Other Electrical Engineering, Electronic Engineering, Information Engineering; Embedded Systems; Computer Systems
Identifiers: URN: urn:nbn:se:kth:diva-160878; DOI: 10.1109/ReConFig.2014.7032541; ScopusID: 2-s2.0-84946690245; ISBN: 978-147995944-0; OAI: oai:DiVA.org:kth-160878; DiVA: diva2:791726
ReConFigurable Computing and FPGAs (ReConFig), 2014 International Conference on, Cancun, Mexico, 8-10 December 2014
QC 20150410; 2015-03-02; 2015-03-02; 2015-12-01; Bibliographically approved