Details
Original language | English |
---|---|
Title of host publication | Embedded Computer Systems |
Subtitle of host publication | Architectures, Modeling, and Simulation - 23rd International Conference, SAMOS 2023, Proceedings |
Editors | Cristina Silvano, Marc Reichenbach, Christian Pilato |
Publisher | Springer International Publishing AG |
Pages | 19-32 |
Number of pages | 14 |
ISBN (electronic) | 978-3-031-46077-7 |
ISBN (print) | 978-3-031-46076-0 |
Publication status | Published - 2023 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 14385 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (electronic) | 1611-3349 |
Abstract
Field-programmable gate arrays (FPGAs) in space applications come with the drawback of radiation effects, which will inevitably occur in devices with small process sizes. This also applies to the electronics of the Bose-Einstein Condensate and Cold Atom Laboratory (BECCAL) apparatus, which will operate on the International Space Station (ISS) for several years. More than 100 FPGAs distributed throughout the setup will be used for high-precision control of specialized sensors and actuators at nanosecond scale. On the ISS, radiation effects must be taken into account, the functionality of the electronics must be monitored, and errors must be handled properly. Due to the large number of devices in BECCAL, commercial off-the-shelf (COTS) FPGAs are used, which are not radiation hardened. This paper describes the methods and measures used to mitigate the effects of radiation in an application-specific COTS-FPGA-based communication network. Taking the firmware for a central communication network switch in BECCAL as the basis, it describes the steps taken to integrate redundancy into the design while optimizing the firmware to stay within the FPGA's resource constraints. A redundant integrity checker module is developed that can notify preceding network devices of data and configuration bit errors. The firmware is validated and evaluated by injecting faults into data and configuration registers, both in simulation and on real hardware. In the end, the FPGA resource usage of the firmware is reduced by more than half, enabling the use of dual modular redundancy (DMR) for the switching fabric. Combined with the integrity checker, which is protected by triple modular redundancy (TMR), this completely prevents silent data corruption in the design while staying within the resource limits of a COTS FPGA, as shown in simulation and by injecting faults into hardware using the Intel Fault Injection FPGA IP Core.
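To make the two redundancy schemes named in the abstract concrete, the following is a minimal Python sketch of the general idea. It is not the authors' firmware, which targets FPGA hardware; the function names (`tmr_vote`, `dmr_mismatch`, `flip_bit`) and the 8-bit example word are hypothetical and chosen only for illustration. It assumes a single bit upset in one redundant copy.

```python
def flip_bit(word: int, bit: int) -> int:
    """Model a single-event upset by inverting one bit of a register value."""
    return word ^ (1 << bit)


def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three redundant copies (TMR)."""
    return (a & b) | (a & c) | (b & c)


def dmr_mismatch(a: int, b: int) -> bool:
    """Compare two redundant copies (DMR); True means a fault was detected."""
    return a != b


golden = 0b1011_0010             # fault-free register value (illustrative)
upset = flip_bit(golden, 5)      # one copy hit by a bit flip

# TMR: the two intact copies outvote the corrupted one, so the fault is masked.
voted = tmr_vote(golden, upset, golden)

# DMR: the copies disagree, so the fault is detected but cannot be corrected
# locally; the error has to be reported so the data can be handled upstream.
detected = dmr_mismatch(golden, upset)
```

In this sketch, DMR needs only two copies but can merely flag an error, whereas TMR needs three copies and can mask it. This matches the trade-off the abstract describes: the larger switching fabric uses the cheaper DMR, and the integrity checker that reports errors is protected by TMR.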
Keywords
- commercial off-the-shelf, fault detection, field-programmable gate array, space application
ASJC Scopus subject areas
- Mathematics (all)
- Theoretical Computer Science
- Computer Science (all)
- General Computer Science
Cite this
Fault Detection Mechanisms for COTS FPGA Systems Used in Low Earth Orbit. / Oberschulte, Tim; Marten, Jakob; Blume, Holger. Embedded Computer Systems: Architectures, Modeling, and Simulation - 23rd International Conference, SAMOS 2023, Proceedings. ed. / Cristina Silvano; Marc Reichenbach; Christian Pilato. Springer International Publishing AG, 2023. p. 19-32 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 14385 LNCS).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Fault Detection Mechanisms for COTS FPGA Systems Used in Low Earth Orbit
AU - Oberschulte, Tim
AU - Marten, Jakob
AU - Blume, Holger
PY - 2023
Y1 - 2023
KW - commercial off-the-shelf
KW - fault detection
KW - field-programmable gate array
KW - space application
UR - http://www.scopus.com/inward/record.url?scp=85187706278&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-46077-7_2
DO - 10.1007/978-3-031-46077-7_2
M3 - Conference contribution
SN - 978-3-031-46076-0
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 19
EP - 32
BT - Embedded Computer Systems
A2 - Silvano, Cristina
A2 - Reichenbach, Marc
A2 - Pilato, Christian
PB - Springer International Publishing AG
ER -