Details
Original language | English |
---|---|
Title of host publication | 2019 IEEE 24th Pacific Rim International Symposium on Dependable Computing (PRDC) |
Publisher | IEEE Computer Society |
Pages | 138-147 |
Number of pages | 10 |
ISBN (electronic) | 978-1-7281-4961-5 |
ISBN (print) | 978-1-7281-4962-2 |
Publication status | Published - 2019 |
Event | 24th IEEE Pacific Rim International Symposium on Dependable Computing, PRDC 2019 - Kyoto, Japan; Duration: 1 Dec 2019 → 3 Dec 2019 |
Publication series
Name | Proceedings IEEE Pacific Rim International Symposium on Dependable Computing |
---|---|
ISSN (Print) | 1555-094X |
ISSN (electronic) | 2473-3105 |
Abstract
Due to shrinking structure sizes and operating voltages, hardware becomes more susceptible to transient faults. Fault-injection campaigns are a common approach to systematically assess the resilience of a system and the effectiveness of software-based countermeasures. However, experimentally injecting all possible faults to achieve full fault-space coverage is infeasible in practice. While precise pruning techniques, such as def/use pruning, already reduce the campaign size significantly, the number of injections remains challenging even for medium-sized systems. We propose fault-space regions (FSRs) as a method to approximately cover the complete fault space with a significantly lower number of required injections. Instead of probabilistic subsampling of the fault space, our approximation exploits the actual program structure and execution trace (e.g., flow of basic blocks) to identify injection points that are representative of a larger set of faults. We identify such data-flow regions and inject only data values that flow across region boundaries. Thereby, we can further reduce the number of injections by up to 76 percent, while the results deviate by less than 2.7 percent from those of a complete and precise fault-injection campaign. Furthermore, we keep the locality of the results regarding silent data corruptions to a deviation of less than 6.9 percent.
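To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of def/use pruning in its commonly described form. It is not the FSR method of the paper and not the authors' implementation. The trace format `(time, location, kind)`, the function name `def_use_prune`, and the simplification of treating each location as a single unit (instead of individual bits) are assumptions made only for illustration.

```python
from collections import defaultdict


def def_use_prune(trace, end_time):
    """Collapse a time x location fault space into weighted experiments.

    trace:    list of (time, location, kind) tuples sorted by time, where
              kind is "R" (read/use) or "W" (write/def). Hypothetical format.
    end_time: last time step of the run; a fault at time t is assumed to
              hit just before the instruction at time t executes.

    Returns (experiments, known_benign):
      experiments  - list of (inject_time, location, weight); one injection
                     stands in for `weight` equivalent fault-space points
      known_benign - number of points that need no injection at all
    """
    last_access = defaultdict(int)  # location -> time of previous access
    experiments = []
    known_benign = 0

    for time, loc, kind in trace:
        weight = time - last_access[loc]
        if kind == "R":
            # Every fault since the previous access is first observed by
            # this read, so one injection represents the whole interval.
            experiments.append((time, loc, weight))
        else:
            # A write overwrites the value: earlier faults cannot propagate.
            known_benign += weight
        last_access[loc] = time

    # Faults after the final access of a location are never read.
    for loc, time in last_access.items():
        known_benign += end_time - time

    return experiments, known_benign


if __name__ == "__main__":
    example = [(2, "a", "W"), (3, "b", "W"), (5, "a", "R"),
               (7, "a", "R"), (9, "b", "R")]
    exps, benign = def_use_prune(example, end_time=10)
    print(exps, benign)
```

In this toy trace, 20 fault-space points (2 locations over 10 time steps) reduce to 3 weighted injections. The abstract's point is that even after such a precise reduction the remaining campaign can be too large, which motivates the approximate region-based approach.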
Keywords
- Bit flip, Fault injection, Fault space approximation, Functional correctness, Reliability, Single event upset
ASJC Scopus subject areas
- Computer Science(all)
  - Computational Theory and Mathematics
  - Computer Science Applications
  - Hardware and Architecture
  - Software
Cite this
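Program-Structure–Guided Approximation of Large Fault Spaces. / Pusz, Oskar; Kiechle, Daniel; Dietrich, Christian; Lohmann, Daniel. 2019 IEEE 24th Pacific Rim International Symposium on Dependable Computing (PRDC). IEEE Computer Society, 2019. p. 138-147 (Proceedings IEEE Pacific Rim International Symposium on Dependable Computing).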
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Program-Structure–Guided Approximation of Large Fault Spaces
AU - Pusz, Oskar
AU - Kiechle, Daniel
AU - Dietrich, Christian
AU - Lohmann, Daniel
PY - 2019
Y1 - 2019
N2 - Due to shrinking structure sizes and operating voltages, hardware becomes more susceptible to transient faults. Fault-injection campaigns are a common approach to systematically assess the resilience of a system and the effectiveness of software-based countermeasures. However, experimentally injecting all possible faults to achieve full fault-space coverage is infeasible in practice. While precise pruning techniques, such as def/use pruning, already reduce the campaign size significantly, the number of injections remains challenging even for medium-sized systems. We propose fault-space regions (FSRs) as a method to approximately cover the complete fault space with a significantly lower number of required injections. Instead of probabilistic subsampling of the fault space, our approximation exploits the actual program structure and execution trace (e.g., flow of basic blocks) to identify injection points that are representative of a larger set of faults. We identify such data-flow regions and inject only data values that flow across region boundaries. Thereby, we can further reduce the number of injections by up to 76 percent, while the results deviate by less than 2.7 percent from those of a complete and precise fault-injection campaign. Furthermore, we keep the locality of the results regarding silent data corruptions to a deviation of less than 6.9 percent.
KW - Bit flip
KW - Fault injection
KW - Fault space approximation
KW - Functional correctness
KW - Reliability
KW - Single event upset
UR - http://www.scopus.com/inward/record.url?scp=85078458483&partnerID=8YFLogxK
U2 - 10.1109/PRDC47002.2019.00044
DO - 10.1109/PRDC47002.2019.00044
M3 - Conference contribution
AN - SCOPUS:85078458483
SN - 978-1-7281-4962-2
T3 - Proceedings IEEE Pacific Rim International Symposium on Dependable Computing
SP - 138
EP - 147
BT - 2019 IEEE 24th Pacific Rim International Symposium on Dependable Computing (PRDC)
PB - IEEE Computer Society
T2 - 24th IEEE Pacific Rim International Symposium on Dependable Computing, PRDC 2019
Y2 - 1 December 2019 through 3 December 2019
ER -