Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors

  • Martin Hoffmann
  • Christoph Borchert
  • Christian Dietrich
  • Horst Schirmeier
  • Rudiger Kapitza
  • Olaf Spinczyk
  • Daniel Lohmann

External organisations

  • Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU Erlangen-Nürnberg)
  • Technische Universität Dortmund
  • Technische Universität Braunschweig

Details

Original language: English
Title of host publication: IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 230-237
Number of pages: 8
ISBN (electronic): 9781479944309
Publication status: Published - 18 Sept. 2014
Published externally: Yes
Event: 17th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014 - Reno, United States
Duration: 10 June 2014 to 12 June 2014

Abstract

Developers of embedded (real-time) systems can choose from a variety of operating systems. While some embedded operating systems provide very flexible APIs, e.g., a POSIX-compliant interface for run-time management, others have a completely static structure, which is generated at compile time by utilizing detailed application knowledge. A prominent example of the latter class from the domain of automotive operating systems is OSEK/OS and its successor AUTOSAR/OS. As we have shown in previous work, the design of the operating system has a strong impact on its vulnerability to system failures caused by hardware faults. This observation is gaining importance because there is an ongoing trend towards low-power and low-cost, yet less reliable, hardware. This work quantifies the difference in vulnerability to soft errors in main memory between a flexible (dynamic) operating system (eCos) and a static system (CiAO), which has an OSEK-compliant structure. We also analyze the additional degree of robustness that is achieved by hardening an operating system with software-based and hardware-based fault-tolerance measures, along with the corresponding costs. Covering this design space gives developers a better chance of making good design decisions with respect to the trade-off between fault tolerance, resource consumption, and interface convenience. Our results indicate that with a combination of hardware- and software-based fault-tolerance measures, silent data corruptions in both operating systems can be reduced to below one percent (compared to eCos). However, the analyzed fault-tolerance mechanisms are expensive for the dynamic system, whereas the statically designed operating system can be hardened at a much lower price.

Cite this

Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs. / Hoffmann, Martin; Borchert, Christoph; Dietrich, Christian et al.
IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 230-237, 6899154.

Hoffmann, M, Borchert, C, Dietrich, C, Schirmeier, H, Kapitza, R, Spinczyk, O & Lohmann, D 2014, Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs. in IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014., 6899154, Institute of Electrical and Electronics Engineers Inc., pp. 230-237, 17th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014, Reno, United States, 10 June 2014. https://doi.org/10.1109/ISORC.2014.26
Hoffmann, M., Borchert, C., Dietrich, C., Schirmeier, H., Kapitza, R., Spinczyk, O., & Lohmann, D. (2014). Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs. In IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014 (pp. 230-237). Article 6899154. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ISORC.2014.26
Hoffmann M, Borchert C, Dietrich C, Schirmeier H, Kapitza R, Spinczyk O et al. Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs. in IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014. Institute of Electrical and Electronics Engineers Inc. 2014. pp. 230-237. 6899154 doi: 10.1109/ISORC.2014.26
Hoffmann, Martin ; Borchert, Christoph ; Dietrich, Christian et al. / Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs. IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 230-237
@inproceedings{c7aa9927a1f94ca4af790171cf730216,
title = "Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs",
abstract = "Developers of embedded (real-time) systems can choose from a variety of operating systems. While some embedded operating systems provide very flexible APIs, e.g., a POSIX-compliant interface for run-time management, others have a completely static structure, which is generated at compile time by utilizing detailed application knowledge. A prominent example of the latter class from the domain of automotive operating systems is OSEK/OS and its successor AUTOSAR/OS. As we have shown in previous work, the design of the operating system has a strong impact on its vulnerability to system failures caused by hardware faults. This observation is gaining importance because there is an ongoing trend towards low-power and low-cost, yet less reliable, hardware. This work quantifies the difference in vulnerability to soft errors in main memory between a flexible (dynamic) operating system (eCos) and a static system (CiAO), which has an OSEK-compliant structure. We also analyze the additional degree of robustness that is achieved by hardening an operating system with software-based and hardware-based fault-tolerance measures, along with the corresponding costs. Covering this design space gives developers a better chance of making good design decisions with respect to the trade-off between fault tolerance, resource consumption, and interface convenience. Our results indicate that with a combination of hardware- and software-based fault-tolerance measures, silent data corruptions in both operating systems can be reduced to below one percent (compared to eCos). However, the analyzed fault-tolerance mechanisms are expensive for the dynamic system, whereas the statically designed operating system can be hardened at a much lower price.",
keywords = "AUTOSAR, Dependability, eCos, Fault Injection, Operating System, OSEK, Real-time System, Reliability",
author = "Martin Hoffmann and Christoph Borchert and Christian Dietrich and Horst Schirmeier and Rudiger Kapitza and Olaf Spinczyk and Daniel Lohmann",
year = "2014",
month = sep,
day = "18",
doi = "10.1109/ISORC.2014.26",
language = "English",
pages = "230--237",
booktitle = "IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",
note = "17th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014 ; Conference date: 10-06-2014 Through 12-06-2014",

}

TY - GEN

T1 - Effectiveness of Fault Detection Mechanisms in Static and Dynamic Operating System Designs

AU - Hoffmann, Martin

AU - Borchert, Christoph

AU - Dietrich, Christian

AU - Schirmeier, Horst

AU - Kapitza, Rudiger

AU - Spinczyk, Olaf

AU - Lohmann, Daniel

PY - 2014/9/18

Y1 - 2014/9/18

AB - Developers of embedded (real-time) systems can choose from a variety of operating systems. While some embedded operating systems provide very flexible APIs, e.g., a POSIX-compliant interface for run-time management, others have a completely static structure, which is generated at compile time by utilizing detailed application knowledge. A prominent example of the latter class from the domain of automotive operating systems is OSEK/OS and its successor AUTOSAR/OS. As we have shown in previous work, the design of the operating system has a strong impact on its vulnerability to system failures caused by hardware faults. This observation is gaining importance because there is an ongoing trend towards low-power and low-cost, yet less reliable, hardware. This work quantifies the difference in vulnerability to soft errors in main memory between a flexible (dynamic) operating system (eCos) and a static system (CiAO), which has an OSEK-compliant structure. We also analyze the additional degree of robustness that is achieved by hardening an operating system with software-based and hardware-based fault-tolerance measures, along with the corresponding costs. Covering this design space gives developers a better chance of making good design decisions with respect to the trade-off between fault tolerance, resource consumption, and interface convenience. Our results indicate that with a combination of hardware- and software-based fault-tolerance measures, silent data corruptions in both operating systems can be reduced to below one percent (compared to eCos). However, the analyzed fault-tolerance mechanisms are expensive for the dynamic system, whereas the statically designed operating system can be hardened at a much lower price.

KW - AUTOSAR

KW - Dependability

KW - eCos

KW - Fault Injection

KW - Operating System

KW - OSEK

KW - Real-time System

KW - Reliability

UR - http://www.scopus.com/inward/record.url?scp=84941286164&partnerID=8YFLogxK

U2 - 10.1109/ISORC.2014.26

DO - 10.1109/ISORC.2014.26

M3 - Conference contribution

AN - SCOPUS:84941286164

SP - 230

EP - 237

BT - IEEE 17th International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 17th IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, ISORC 2014

Y2 - 10 June 2014 through 12 June 2014

ER -