Details
Original language | English |
---|---|
Title of host publication | The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 648-653 |
Number of pages | 6 |
ISBN (electronic) | 9781479922338 |
Publication status | Published - 22 Sep 2014 |
Externally published | Yes |
Event | 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014 - Atlanta, United States. Duration: 23 June 2014 → 26 June 2014 |
Abstract
State-machine replication has received widespread attention for provisioning highly available services in data centers. However, current production systems focus on tolerating crash faults only, and prominent service outages caused by state corruptions have indicated that this is a risky strategy. In the future, state corruptions due to transient faults (such as bit flips) will become even more likely because of ongoing hardware trends toward shrinking structure sizes and reduced operating voltages. In this paper we present Crosscheck, an approach to tolerate arbitrary state corruption (ASC) in the context of fault-tolerant replication of multithreaded services. Crosscheck is able to detect silent data corruptions ahead of execution, and by crosschecking state changes with co-executing replicas, even ASCs can be detected. Finally, fault tolerance is achieved by fine-grained recovery using fault-free replicas. Our implementation is transparent to the application because it applies fine-grained software-hardening mechanisms through aspect-oriented programming. To validate Crosscheck, we present a replicated multithreaded key-value store that is resilient to state corruptions.
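The three steps named in the abstract (a local check before use, a cross-check against co-executing replicas, and fine-grained recovery from a fault-free replica) can be illustrated with a small sketch. This is not the paper's implementation, which relies on AspectC++ and is not reproduced here. The names `Replica`, `crosscheck`, and `recover`, the use of CRC32 as the checksum, and the majority vote over digests are all assumptions made for illustration.

```python
import zlib
from collections import Counter


class Replica:
    """Hypothetical replica of a key-value store; each entry carries a checksum."""

    def __init__(self, name):
        self.name = name
        self.store = {}  # key -> (value bytes, CRC32 of value)

    def put(self, key, value):
        self.store[key] = (value, zlib.crc32(value))

    def is_intact(self, key):
        # Local check before use: does the stored checksum still match the value?
        value, checksum = self.store[key]
        return zlib.crc32(value) == checksum

    def digest(self, key):
        # Digest of the entry that would be exchanged with the other replicas.
        value, _ = self.store[key]
        return zlib.crc32(value)


def crosscheck(replicas, key):
    """Return replicas that fail the local check or disagree with the majority digest."""
    majority, _ = Counter(r.digest(key) for r in replicas).most_common(1)[0]
    return [r for r in replicas if not r.is_intact(key) or r.digest(key) != majority]


def recover(replicas, key):
    """Repair only the affected entry on faulty replicas from a fault-free replica."""
    faulty = crosscheck(replicas, key)
    healthy = next(r for r in replicas if r not in faulty)
    for r in faulty:
        r.store[key] = healthy.store[key]
    return [r.name for r in faulty]


replicas = [Replica(n) for n in ("r0", "r1", "r2")]
for r in replicas:
    r.put("k", b"value")

# Simulate a bit flip in r1's copy: the value changes, the checksum does not.
value, checksum = replicas[1].store["k"]
replicas[1].store["k"] = (bytes([value[0] ^ 0x01]) + value[1:], checksum)

repaired = recover(replicas, "k")
```

In this sketch the bit flip is caught by the local checksum alone. A corruption that also produces a matching checksum, for example a faulty computation that stores a wrong but self-consistent value, passes the local check and is only caught by comparing digests across replicas, which is the role the abstract assigns to crosschecking.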
ASJC Scopus subject areas
- Computer Science (all)
  - Computer Networks and Communications
  - Hardware and Architecture
  - Software
Cite this
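Crosscheck: Hardening replicated multithreaded services. / Martens, Arthur; Borchert, Christoph; Geissler, Tobias Oliver; Lohmann, Daniel; Spinczyk, Olaf; Kapitza, Rudiger. The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014. Institute of Electrical and Electronics Engineers Inc., 2014. pp. 648-653 6903619.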
Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Crosscheck: Hardening replicated multithreaded services
AU - Martens, Arthur
AU - Borchert, Christoph
AU - Geissler, Tobias Oliver
AU - Lohmann, Daniel
AU - Spinczyk, Olaf
AU - Kapitza, Rudiger
PY - 2014/9/22
Y1 - 2014/9/22
AB - State-machine replication has received widespread attention for the provisioning of highly available services in data centers. However, current production systems focus on tolerating crash faults only and prominent service outages caused by state corruptions have indicated that this is a risky strategy. In the future, state corruptions due to transient faults (such as bit flips) become even more likely, caused by ongoing hardware trends regarding the shrinking of structure sizes and reduction of operating voltages. In this paper we present Crosscheck, an approach to tolerate arbitrary state corruption (ASC) in the context of fault-tolerant replication of multithreaded services. Crosscheck is able to detect silent data corruptions ahead of execution, and by crosschecking state changes with co-executing replicas, even ASCs can be detected. Finally, fault tolerance is achieved by a fine-grained recovery using fault-free replicas. Our implementation is transparent to the application by utilizing fine-grained software-hardening mechanisms using aspect-oriented programming. To validate Crosscheck we present a replicated multithreaded key-value store that is resilient to state corruptions.
KW - AspectC++
KW - Determinism
KW - Multithreading
KW - Replication
KW - Software Error Hardening
UR - http://www.scopus.com/inward/record.url?scp=84937147584&partnerID=8YFLogxK
U2 - 10.1109/dsn.2014.98
DO - 10.1109/dsn.2014.98
M3 - Conference contribution
SP - 648
EP - 653
BT - The 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014
Y2 - 23 June 2014 through 26 June 2014
ER -