Crosscheck: Hardening replicated multithreaded services

Publication: Contribution to book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed

Authors

  • Arthur Martens
  • Christoph Borchert
  • Tobias Oliver Geissler
  • Daniel Lohmann
  • Olaf Spinczyk
  • Rudiger Kapitza

External organizations

  • Technische Universität Braunschweig
  • Technische Universität Dortmund
  • Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU Erlangen-Nürnberg)

Details

Original language: English
Title of host publication: The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 648-653
Number of pages: 6
ISBN (electronic): 9781479922338
Publication status: Published - 22 Sept 2014
Published externally: Yes
Event: 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014 - Atlanta, United States
Duration: 23 June 2014 to 26 June 2014

Abstract

State-machine replication has received widespread attention for the provisioning of highly available services in data centers. However, current production systems focus on tolerating crash faults only and prominent service outages caused by state corruptions have indicated that this is a risky strategy. In the future, state corruptions due to transient faults (such as bit flips) become even more likely, caused by ongoing hardware trends regarding the shrinking of structure sizes and reduction of operating voltages. In this paper we present Crosscheck, an approach to tolerate arbitrary state corruption (ASC) in the context of fault-tolerant replication of multithreaded services. Crosscheck is able to detect silent data corruptions ahead of execution, and by crosschecking state changes with co-executing replicas, even ASCs can be detected. Finally, fault tolerance is achieved by a fine-grained recovery using fault-free replicas. Our implementation is transparent to the application by utilizing fine-grained software-hardening mechanisms using aspect-oriented programming. To validate Crosscheck we present a replicated multithreaded key-value store that is resilient to state corruptions.
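
The abstract names three steps: detect silent data corruption before a value is used, crosscheck state with co-executing replicas, and repair only the affected state from a fault-free replica. The following is a minimal C++17 sketch of that idea under loud assumptions, not the authors' implementation. All names (`Protected`, `store`, `intact`, `majority_digest`, `crosscheck_and_recover`) are hypothetical, FNV-1a is only a stand-in checksum, and an in-process array stands in for networked replicas. Multithreading, deterministic scheduling, and the aspect-oriented weaving the paper relies on are left out.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Hypothetical protected object: payload plus a checksum refreshed on every write.
struct Protected {
    std::string value;
    std::uint32_t checksum = 0;
};

// FNV-1a hash, used here only as a stand-in checksum (an assumption).
inline std::uint32_t fnv1a(const std::string& s) {
    std::uint32_t h = 2166136261u;
    for (unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Every legitimate write updates payload and checksum together.
inline void store(Protected& p, const std::string& v) {
    p.value = v;
    p.checksum = fnv1a(v);
}

// Check before use: a mismatch reveals a silently corrupted object.
inline bool intact(const Protected& p) {
    return fnv1a(p.value) == p.checksum;
}

// Crosscheck: the digest reported by a strict majority of replicas, if any.
template <std::size_t N>
std::optional<std::uint32_t> majority_digest(const std::array<std::uint32_t, N>& digests) {
    std::map<std::uint32_t, std::size_t> votes;
    for (std::uint32_t d : digests) {
        if (++votes[d] * 2 > N) return d;
    }
    return std::nullopt;
}

// Fine-grained recovery: overwrite only the objects that deviate from the
// majority or fail their own checksum. Returns the number of repaired replicas.
template <std::size_t N>
std::size_t crosscheck_and_recover(std::array<Protected, N>& replicas) {
    std::array<std::uint32_t, N> digests{};
    for (std::size_t i = 0; i < N; ++i) digests[i] = fnv1a(replicas[i].value);

    const auto agreed = majority_digest(digests);
    if (!agreed) return 0;  // no majority: nothing can be repaired safely

    // Pick a donor that matches the majority and passes its own check.
    std::optional<Protected> good;
    for (std::size_t i = 0; i < N && !good; ++i) {
        if (digests[i] == *agreed && intact(replicas[i])) good = replicas[i];
    }
    if (!good) return 0;

    std::size_t repaired = 0;
    for (std::size_t i = 0; i < N; ++i) {
        if (digests[i] != *agreed || !intact(replicas[i])) {
            replicas[i] = *good;
            ++repaired;
        }
    }
    return repaired;
}
```

With three replicas holding the same value, flipping one character in a single replica's payload makes `intact` return false for that replica, and `crosscheck_and_recover` restores it from a majority replica while leaving the other two untouched.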

Cite this

Crosscheck: Hardening replicated multithreaded services. / Martens, Arthur; Borchert, Christoph; Geissler, Tobias Oliver et al.
The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 648-653 6903619.

Martens, A, Borchert, C, Geissler, TO, Lohmann, D, Spinczyk, O & Kapitza, R 2014, Crosscheck: Hardening replicated multithreaded services. in The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014., 6903619, Institute of Electrical and Electronics Engineers Inc., pp. 648-653, 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014, Atlanta, United States, 23 June 2014. https://doi.org/10.1109/dsn.2014.98
Martens, A., Borchert, C., Geissler, T. O., Lohmann, D., Spinczyk, O., & Kapitza, R. (2014). Crosscheck: Hardening replicated multithreaded services. In The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014 (pp. 648-653). Article 6903619 Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/dsn.2014.98
Martens A, Borchert C, Geissler TO, Lohmann D, Spinczyk O, Kapitza R. Crosscheck: Hardening replicated multithreaded services. In The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014. Institute of Electrical and Electronics Engineers Inc. 2014. pp. 648-653. 6903619 doi: 10.1109/dsn.2014.98
Martens, Arthur ; Borchert, Christoph ; Geissler, Tobias Oliver et al. / Crosscheck: Hardening replicated multithreaded services. The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 648-653

@inproceedings{d8b68d615faf438f9802112abbd9a8b1,
title = "Crosscheck: Hardening replicated multithreaded services",
abstract = "State-machine replication has received widespread attention for the provisioning of highly available services in data centers. However, current production systems focus on tolerating crash faults only and prominent service outages caused by state corruptions have indicated that this is a risky strategy. In the future, state corruptions due to transient faults (such as bit flips) become even more likely, caused by ongoing hardware trends regarding the shrinking of structure sizes and reduction of operating voltages. In this paper we present Crosscheck, an approach to tolerate arbitrary state corruption (ASC) in the context of fault-tolerant replication of multithreaded services. Crosscheck is able to detect silent data corruptions ahead of execution, and by crosschecking state changes with co-executing replicas, even ASCs can be detected. Finally, fault tolerance is achieved by a fine-grained recovery using fault-free replicas. Our implementation is transparent to the application by utilizing fine-grained software-hardening mechanisms using aspect-oriented programming. To validate Crosscheck we present a replicated multithreaded key-value store that is resilient to state corruptions.",
keywords = "AspectC++, Determinism, Multithreading, Replication, Software Error Hardening",
author = "Arthur Martens and Christoph Borchert and Geissler, {Tobias Oliver} and Daniel Lohmann and Olaf Spinczyk and Rudiger Kapitza",
year = "2014",
month = sep,
day = "22",
doi = "10.1109/dsn.2014.98",
language = "English",
pages = "648--653",
booktitle = "The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",
note = "44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014 ; Conference date: 23-06-2014 Through 26-06-2014",

}

TY  - GEN
T1  - Crosscheck: Hardening replicated multithreaded services
AU  - Martens, Arthur
AU  - Borchert, Christoph
AU  - Geissler, Tobias Oliver
AU  - Lohmann, Daniel
AU  - Spinczyk, Olaf
AU  - Kapitza, Rudiger
PY  - 2014/9/22
Y1  - 2014/9/22
AB  - State-machine replication has received widespread attention for the provisioning of highly available services in data centers. However, current production systems focus on tolerating crash faults only and prominent service outages caused by state corruptions have indicated that this is a risky strategy. In the future, state corruptions due to transient faults (such as bit flips) become even more likely, caused by ongoing hardware trends regarding the shrinking of structure sizes and reduction of operating voltages. In this paper we present Crosscheck, an approach to tolerate arbitrary state corruption (ASC) in the context of fault-tolerant replication of multithreaded services. Crosscheck is able to detect silent data corruptions ahead of execution, and by crosschecking state changes with co-executing replicas, even ASCs can be detected. Finally, fault tolerance is achieved by a fine-grained recovery using fault-free replicas. Our implementation is transparent to the application by utilizing fine-grained software-hardening mechanisms using aspect-oriented programming. To validate Crosscheck we present a replicated multithreaded key-value store that is resilient to state corruptions.
KW  - AspectC++
KW  - Determinism
KW  - Multithreading
KW  - Replication
KW  - Software Error Hardening
UR  - http://www.scopus.com/inward/record.url?scp=84937147584&partnerID=8YFLogxK
U2  - 10.1109/dsn.2014.98
DO  - 10.1109/dsn.2014.98
M3  - Conference contribution
SP  - 648
EP  - 653
BT  - The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014
PB  - Institute of Electrical and Electronics Engineers Inc.
T2  - 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014
Y2  - 23 June 2014 through 26 June 2014
ER  -