Checkpoint Placement for Systematic Fault-Injection Campaigns

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Authors

  • Christian Dietrich
  • Tim Marek Thomas
  • Matthias Mnich

External Research Organisations

  • Hamburg University of Technology (TUHH)

Details

Original language: English
Title of host publication: 2023 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (electronic): 979-8-3503-2225-5
ISBN (print): 979-8-3503-2226-2
Publication status: Published - 2023
Event: 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - San Francisco, United States
Duration: 28 Oct 2023 to 2 Nov 2023

Publication series

Name: IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD
ISSN (Print): 1092-3152

Abstract

Shrinking hardware structures and decreasing operating voltages lead to an increasing number of transient hardware faults, which thus become a core problem to consider for safety-critical systems. Here, systematic fault injection (FI), where one program-under-test is systematically stressed with faults, provides an in-depth resilience analysis in the presence of faults. However, FI campaigns require many independent injection experiments and, combined, long run times, especially if we aim for a high coverage of the fault space. One cost factor is the forwarding phase, which is the time required to bring the system-under-test into the fault-free state at injection time. One common technique to speed up forwarding is to take checkpoints of the fault-free system state at fixed points in time. In this paper, we show that the placement of checkpoints has a significant influence on the required forwarding cycles, especially if we place faults non-uniformly on the time axis. For this, we discuss the checkpoint-selection problem in general, formalize it as a maximum-weight reward path problem in graphs, propose an ILP formulation and a dynamic programming algorithm that find the optimal solution, and provide a heuristic checkpoint-selection method based on a genetic algorithm. Applied to the MiBench benchmark suite, our approach consistently reduces the forwarding-phase cycles by at least 88 percent and up to 99.934 percent when placing 16 checkpoints.
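
To make the checkpoint-selection idea concrete, the following is a minimal sketch of a dynamic program for a simplified version of the problem. It is not the authors' formulation (the paper states the problem as a maximum-weight reward path in a graph); it instead minimises total forwarding cycles directly. The assumptions are ours: an injection at cycle t is forwarded from the latest checkpoint at or before t (or from reset at cycle 0), restoring a checkpoint costs nothing, and checkpoints are only considered at cycles where injections happen. The names place_checkpoints, injections, and k are hypothetical.

```python
from itertools import accumulate


def place_checkpoints(injections, k):
    """Choose at most k checkpoint cycles that minimise total forwarding cycles.

    injections: dict mapping an injection cycle to the number of experiments
                injected at that cycle (hypothetical input format).
    Returns (total_forwarding_cycles, sorted list of checkpoint cycles).
    """
    times = sorted(injections)
    n = len(times)
    weights = [injections[t] for t in times]
    # Prefix sums: W[j] = sum of weights[:j], WT[j] = sum of weights[m] * times[m] for m < j.
    W = [0] + list(accumulate(weights))
    WT = [0] + list(accumulate(w * t for w, t in zip(weights, times)))

    def seg_cost(i, j):
        # Forwarding cycles for injections i..j-1 when each one restores
        # a checkpoint taken at times[i].
        return (WT[j] - WT[i]) - times[i] * (W[j] - W[i])

    # best[j]: minimal cost for the first j injections with the checkpoints
    # allowed so far. With zero checkpoints everything forwards from cycle 0.
    best = WT[:]
    choice = []  # choice[c][j]: index of the last checkpoint, or None if unused
    for _ in range(k):
        new = best[:]
        pick = [None] * (n + 1)
        for j in range(1, n + 1):
            for i in range(j):
                cand = best[i] + seg_cost(i, j)
                if cand < new[j]:
                    new[j] = cand
                    pick[j] = i
        best = new
        choice.append(pick)

    # Backtrack through the layers to recover the chosen checkpoint cycles.
    checkpoints = []
    j = n
    for c in range(k - 1, -1, -1):
        i = choice[c][j]
        if i is not None:
            checkpoints.append(times[i])
            j = i
    return best[n], sorted(checkpoints)


if __name__ == "__main__":
    # One early injection and a dense cluster late in the run (made-up data).
    campaign = {10: 1, 1000: 5, 1001: 5, 1002: 5}
    print(place_checkpoints(campaign, 0))
    print(place_checkpoints(campaign, 1))
```

In this made-up campaign the single checkpoint lands at the start of the late cluster rather than at a fixed interval, which is the effect of non-uniform fault placement that the abstract describes. The sketch runs in O(k n^2) time for n distinct injection cycles and is meant only to illustrate the structure of the problem.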

Keywords

    Checkpoint Placement, Fault Injection

Cite this

Checkpoint Placement for Systematic Fault-Injection Campaigns. / Dietrich, Christian; Thomas, Tim Marek; Mnich, Matthias.
2023 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2023. (IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD).

Dietrich, C, Thomas, TM & Mnich, M 2023, Checkpoint Placement for Systematic Fault-Injection Campaigns. in 2023 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - Proceedings. IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD, Institute of Electrical and Electronics Engineers Inc., 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023, San Francisco, United States, 28 Oct 2023. https://doi.org/10.48550/arXiv.2308.05521, https://doi.org/10.1109/ICCAD57390.2023.10323809
Dietrich, C., Thomas, T. M., & Mnich, M. (2023). Checkpoint Placement for Systematic Fault-Injection Campaigns. In 2023 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - Proceedings (IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.48550/arXiv.2308.05521, https://doi.org/10.1109/ICCAD57390.2023.10323809
Dietrich C, Thomas TM, Mnich M. Checkpoint Placement for Systematic Fault-Injection Campaigns. In 2023 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - Proceedings. Institute of Electrical and Electronics Engineers Inc. 2023. (IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD). doi: 10.48550/arXiv.2308.05521, 10.1109/ICCAD57390.2023.10323809
Dietrich, Christian ; Thomas, Tim Marek ; Mnich, Matthias. / Checkpoint Placement for Systematic Fault-Injection Campaigns. 2023 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - Proceedings. Institute of Electrical and Electronics Engineers Inc., 2023. (IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD).

BibTeX

@inproceedings{de46c016e83347749f604e31792bec6c,
title = "Checkpoint Placement for Systematic Fault-Injection Campaigns",
abstract = "Shrinking hardware structures and decreasing operating voltages lead to an increasing number of transient hardware faults, which thus become a core problem to consider for safety-critical systems. Here, systematic fault injection (FI), where one program-under-test is systematically stressed with faults, provides an in-depth resilience analysis in the presence of faults. However, FI campaigns require many independent injection experiments and, combined, long run times, especially if we aim for a high coverage of the fault space. One cost factor is the forwarding phase, which is the time required to bring the system-under test into the fault-free state at injection time. One common technique to speed up the forwarding are checkpoints of the fault-free system state at fixed points in time. In this paper, we show that the placement of checkpoints has a significant influence on the required forwarding cycles, especially if we place faults non-uniformly on the time axis. For this, we discuss the checkpoint-selection problem in general, formalize it as a maximum-weight reward path problem in graphs, propose an ILP formulation and a dynamic programming algorithm that find the optimal solution, and provide a heuristic checkpoint-selection method based on a genetic algorithm. Applied to the MiBench benchmark suite, our approach consistently reduces the forward-phase cycles by at least 88 percent and up to 99.934 percent when placing 16 checkpoints.",
keywords = "Checkpoint Placement, Fault Injection",
author = "Christian Dietrich and Thomas, {Tim Marek} and Matthias Mnich",
note = "Publisher Copyright: {\textcopyright} 2023 IEEE.; 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 ; Conference date: 28-10-2023 Through 02-11-2023",
year = "2023",
doi = "10.48550/arXiv.2308.05521",
language = "English",
isbn = "979-8-3503-2226-2",
series = "IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
booktitle = "2023 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - Proceedings",
address = "United States",

}

RIS

TY - GEN

T1 - Checkpoint Placement for Systematic Fault-Injection Campaigns

AU - Dietrich, Christian

AU - Thomas, Tim Marek

AU - Mnich, Matthias

N1 - Publisher Copyright: © 2023 IEEE.

PY - 2023

Y1 - 2023

N2 - Shrinking hardware structures and decreasing operating voltages lead to an increasing number of transient hardware faults, which thus become a core problem to consider for safety-critical systems. Here, systematic fault injection (FI), where one program-under-test is systematically stressed with faults, provides an in-depth resilience analysis in the presence of faults. However, FI campaigns require many independent injection experiments and, combined, long run times, especially if we aim for a high coverage of the fault space. One cost factor is the forwarding phase, which is the time required to bring the system-under test into the fault-free state at injection time. One common technique to speed up the forwarding are checkpoints of the fault-free system state at fixed points in time. In this paper, we show that the placement of checkpoints has a significant influence on the required forwarding cycles, especially if we place faults non-uniformly on the time axis. For this, we discuss the checkpoint-selection problem in general, formalize it as a maximum-weight reward path problem in graphs, propose an ILP formulation and a dynamic programming algorithm that find the optimal solution, and provide a heuristic checkpoint-selection method based on a genetic algorithm. Applied to the MiBench benchmark suite, our approach consistently reduces the forward-phase cycles by at least 88 percent and up to 99.934 percent when placing 16 checkpoints.

KW - Checkpoint Placement

KW - Fault Injection

UR - http://www.scopus.com/inward/record.url?scp=85181397790&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2308.05521

DO - 10.48550/arXiv.2308.05521

M3 - Conference contribution

AN - SCOPUS:85181397790

SN - 979-8-3503-2226-2

T3 - IEEE/ACM International Conference on Computer-Aided Design, Digest of Technical Papers, ICCAD

BT - 2023 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 42nd IEEE/ACM International Conference on Computer-Aided Design, ICCAD 2023

Y2 - 28 October 2023 through 2 November 2023

ER -