Details
Original language | English |
---|---|
Title of host publication | Proceedings - 2024 IEEE 35th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2024 |
Pages | 28-29 |
Number of pages | 2 |
ISBN (electronic) | 979-8-3503-4963-4 |
Publication status | Published - 2024 |
Publication series
Name | IEEE International Conference on Application-Specific Systems, Architectures, and Processors |
---|---|
ISSN (print) | 2160-0511 |
ISSN (electronic) | 2160-052X |
Abstract
The growing use of LiDAR systems and constrained computing resources in the automotive sector require efficient LiDAR processing. SalsaNext, a convolutional neural network for semantic segmentation, is a promising candidate for deployment in that area. To extend the research on its quantization and investigate its adaptability to constrained resources, a design space exploration is performed. The design space, defined by model size, topology, and compute precision, is evaluated on a Jetson AGX Orin with respect to classification accuracy, latency, and energy efficiency. The results show a trade-off between classification accuracy and runtime. The smallest model, evaluated in INT8 on the GPU, provides the lowest latency of 14.48 ms with an mIoU score of 43.2%. An mIoU score of 47.7% at a latency of 26.92 ms can be achieved with the medium-sized model and modified topology evaluated in INT8 on the DLA. Evaluated in FP32 on the GPU, the medium-sized model with modified topology provides good classification accuracy, with an mIoU score of 55.2% at a latency of 67.85 ms.
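The accuracy/latency trade-off described in the abstract can be illustrated with a small Pareto-front check over the three reported operating points. This is only a sketch built from the numbers quoted above: the `Config` class, the helper names, and the short configuration labels are assumptions made for illustration and are not taken from the paper. Energy efficiency is left out because the abstract reports no values for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """One evaluated operating point (labels are illustrative, not from the paper)."""
    name: str
    latency_ms: float  # lower is better
    miou_pct: float    # higher is better


# The three operating points quoted in the abstract.
REPORTED = [
    Config("smallest model, INT8, GPU", 14.48, 43.2),
    Config("medium model, modified topology, INT8, DLA", 26.92, 47.7),
    Config("medium model, modified topology, FP32, GPU", 67.85, 55.2),
]


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b on both metrics and strictly better on one."""
    no_worse = a.latency_ms <= b.latency_ms and a.miou_pct >= b.miou_pct
    strictly_better = a.latency_ms < b.latency_ms or a.miou_pct > b.miou_pct
    return no_worse and strictly_better


def pareto_front(configs):
    """Return the non-dominated configurations, sorted by latency."""
    front = [c for c in configs if not any(dominates(o, c) for o in configs)]
    return sorted(front, key=lambda c: c.latency_ms)


if __name__ == "__main__":
    for c in pareto_front(REPORTED):
        print(f"{c.latency_ms:7.2f} ms  {c.miou_pct:5.1f} % mIoU  {c.name}")
```

None of the three reported points dominates another, so all of them lie on the front: each step up in mIoU costs latency. From the quoted figures, moving from the first to the second point costs 12.44 ms for 4.5 percentage points of mIoU, and moving from the second to the third costs 40.93 ms for a further 7.5 points.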
ASJC Scopus subject areas
- General Computer Science
- Hardware and Architecture
- Computer Networks and Communications
Cite this
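Design Space Exploration of Semantic Segmentation CNN SalsaNext for Constrained Architectures. / Renke, Oliver; Riggers, Christoph; Karrenbauer, Jens; Blume, Holger. Proceedings - 2024 IEEE 35th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2024. 2024. pp. 28-29 (IEEE International Conference on Application-Specific Systems, Architectures, and Processors).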
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-reviewed
TY - GEN
T1 - Design Space Exploration of Semantic Segmentation CNN SalsaNext for Constrained Architectures
AU - Renke, Oliver
AU - Riggers, Christoph
AU - Karrenbauer, Jens
AU - Blume, Holger
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - The growing use of LiDAR systems and constrained computing resources in the automotive sector require efficient LiDAR processing. SalsaNext, a convolutional neural network for semantic segmentation, is a promising candidate for deployment in that area. To extend the research on its quantization and investigate its adaptability to constrained resources, a design space exploration is performed. The design space, defined by model size, topology, and compute precision, is evaluated on a Jetson AGX Orin with respect to classification accuracy, latency, and energy efficiency. The results show a trade-off between classification accuracy and runtime. The smallest model, evaluated in INT8 on the GPU, provides the lowest latency of 14.48 ms with an mIoU score of 43.2%. An mIoU score of 47.7% at a latency of 26.92 ms can be achieved with the medium-sized model and modified topology evaluated in INT8 on the DLA. Evaluated in FP32 on the GPU, the medium-sized model with modified topology provides good classification accuracy, with an mIoU score of 55.2% at a latency of 67.85 ms.
KW - CNN Optimization
KW - CNN Quantization
KW - Design Space Exploration
KW - SalsaNext
KW - Semantic Segmentation
UR - http://www.scopus.com/inward/record.url?scp=85203107748&partnerID=8YFLogxK
U2 - 10.1109/asap61560.2024.00016
DO - 10.1109/asap61560.2024.00016
M3 - Conference contribution
SN - 979-8-3503-4964-1
T3 - IEEE International Conference on Application-Specific Systems, Architectures, and Processors
SP - 28
EP - 29
BT - Proceedings - 2024 IEEE 35th International Conference on Application-Specific Systems, Architectures and Processors, ASAP 2024
ER -