Details
Original language | English |
---|---|
Title of host publication | ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering |
Editors | G. Stefanou, V. Papadopoulos, V. Plevris, M. Papadrakakis |
Pages | 1381-1391 |
Number of pages | 11 |
ISBN (electronic) | 9786188284401 |
Publication status | Published - 2016 |
Event | 7th European Congress on Computational Methods in Applied Sciences and Engineering, ECCOMAS Congress 2016 - Crete, Greece; Duration: 5 Jun 2016 → 10 Jun 2016 |
Publication series
Name | ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering |
---|---|
Volume | 1 |
Abstract
The performance of numerous scientific libraries and applications depends heavily on the efficiency of sparse linear algebra operations. In this paper, we survey the performance of several parallel sparse vector and matrix kernels provided in the Trilinos framework on the supercomputer systems Cray XC30/40 and IBM Blue Gene/Q. The linear algebra operations in Trilinos are handled by one of two packages, Epetra or Tpetra. While the former is the most widely used, the latter is the target of future developments and supports larger-scale problems as well as shared-memory parallelism. We compare the results obtained from both packages, for both MPI-only and hybrid solutions. The hybrid parallelism is managed by the package Kokkos, which aims for performance portability across different architectures. We report the efficiency of a single node of the system and demonstrate the scalability behavior of the benchmarks on up to 38,400 cores of the HLRN-III systems. Furthermore, for the Intel processors used in the Cray system, we present measurements of the energy consumption of the kernels and compare the Energy-to-Solution between different compilers and parallel programming paradigms. In addition, we discuss how linking the vendor-provided libraries, rather than the user-compiled versions, affects performance and energy consumption. These extensive comparisons, obtained on some of the highest-performing supercomputer systems, give users and developers a starting point for determining an optimal development strategy.
Keywords
- Energy consumption, HPC, Performance evaluation, Sparse algebra, Trilinos
ASJC Scopus subject areas
- Computer Science(all)
- Artificial Intelligence
- Mathematics(all)
- Applied Mathematics
Cite this
Siahatgar, Mohammad; Von Voigt, Gabriele. Evaluation of sparse linear algebra operations in Trilinos. In: ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering. ed. / G. Stefanou; V. Papadopoulos; V. Plevris; M. Papadrakakis. 2016. p. 1381-1391 (ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering; Vol. 1).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Evaluation of sparse linear algebra operations in Trilinos
AU - Siahatgar, Mohammad
AU - Von Voigt, Gabriele
N1 - Funding Information: This work has been funded by the European Research Council under the FP7 NUMEXAS project under grant agreement 611636. The authors gratefully acknowledge the Gauss Centre for Supercomputing (GCS) for providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS share of the supercomputer JUQUEEN [16] at Jülich Supercomputing Centre (JSC). GCS is the alliance of the three national supercomputing centers HLRS (Universität Stuttgart), JSC (Forschungszentrum Jülich), and LRZ (Bayerische Akademie der Wissenschaften), funded by the German Federal Ministry of Education and Research (BMBF) and the German State Ministries for Research of Baden-Württemberg (MWK), Bayern (StMWFK) and Nordrhein-Westfalen (MIWF). The authors would like to thank the anonymous referee for the comments.
PY - 2016
Y1 - 2016
KW - Energy consumption
KW - HPC
KW - Performance evaluation
KW - Sparse algebra
KW - Trilinos
UR - http://www.scopus.com/inward/record.url?scp=84995468920&partnerID=8YFLogxK
U2 - 10.7712/100016.1893.11500
DO - 10.7712/100016.1893.11500
M3 - Conference contribution
AN - SCOPUS:84995468920
T3 - ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering
SP - 1381
EP - 1391
BT - ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering
A2 - Stefanou, G.
A2 - Papadopoulos, V.
A2 - Plevris, V.
A2 - Papadrakakis, M.
T2 - 7th European Congress on Computational Methods in Applied Sciences and Engineering, ECCOMAS Congress 2016
Y2 - 5 June 2016 through 10 June 2016
ER -