Evaluation of sparse linear algebra operations in Trilinos

Mohammad Siahatgar; Gabriele Von Voigt

doi:10.7712/100016.1893.11500

Details

Original language	English
Title of host publication	ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering
Editors	G. Stefanou, V. Papadopoulos, V. Plevris, M. Papadrakakis
Publisher	National Technical University of Athens
Pages	1381-1391
Number of pages	11
ISBN (electronic)	9786188284401
Publication status	Published - 2016
Event	7th European Congress on Computational Methods in Applied Sciences and Engineering, ECCOMAS Congress 2016 - Crete, Greece (5 Jun 2016 to 10 Jun 2016)

Publication series

Name	ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering
Volume	1

Abstract

The performance of numerous scientific libraries and applications depends heavily on the efficiency of sparse linear algebra operations. In this paper, we survey the performance of several parallel sparse vector and matrix kernels provided in the Trilinos framework on the Cray XC30/40 and IBM Blue Gene/Q supercomputer systems. The linear algebra operations in Trilinos are handled by one of two packages, Epetra or Tpetra. While the former is the most widely used, the latter is the target of future developments and supports larger-scale problems as well as shared-memory parallelism. We compare the results obtained from both packages, using both MPI-only and hybrid solutions. The hybrid parallelism is managed by the Kokkos package, which aims for performance portability across different architectures. We report the efficiency of a single node of the system and demonstrate the scalability of the benchmarks on up to 38,400 cores of the HLRN-III systems. Furthermore, for the Intel processors used in the Cray system, we present measurements of the energy consumption of the kernels and compare the Energy-to-Solution between different compilers and parallel programming paradigms. In addition, we discuss how linking the vendor-provided libraries, rather than the user-compiled versions, affects performance and energy consumption. These extensive comparisons, obtained on top-performing supercomputer systems, give users and developers a starting point for determining an optimal development strategy.
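For readers unfamiliar with the kind of kernel being benchmarked, a sparse matrix-vector product is a typical example. The abstract does not list the kernels measured, so the sketch below is only an illustration in plain Python, assuming the common compressed sparse row (CSR) storage layout. It does not use Epetra, Tpetra, or Kokkos, and the function name `csr_spmv` is hypothetical.

```python
from typing import List, Sequence


def csr_spmv(
    row_ptr: Sequence[int],
    col_idx: Sequence[int],
    values: Sequence[float],
    x: Sequence[float],
) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form.

    row_ptr[i]:row_ptr[i + 1] is the slice of col_idx/values that
    holds the nonzero entries of row i.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Only stored (nonzero) entries of row i are visited.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative 3x3 matrix (made-up data):
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
x = [1.0, 2.0, 3.0]

y = csr_spmv(row_ptr, col_idx, values, x)
print(y)
```

Each row costs work proportional to its number of stored nonzeros, and the indirect access `x[col_idx[k]]` makes such kernels typically limited by memory bandwidth rather than arithmetic, which is one reason their measured performance varies between architectures.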

Keywords

Energy consumption, HPC, Performance evaluation, Sparse algebra, Trilinos

ASJC Scopus subject areas

Computer Science(all)
Artificial Intelligence
Mathematics(all)
Applied Mathematics

Sustainable Development Goals

SDG 7 - Affordable and Clean Energy

Cite this

Evaluation of sparse linear algebra operations in Trilinos. / Siahatgar, Mohammad; Von Voigt, Gabriele.
ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering. ed. / G. Stefanou; V. Papadopoulos; V. Plevris; M. Papadrakakis. National Technical University of Athens, 2016. p. 1381-1391 (ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering; Vol. 1).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Siahatgar, M & Von Voigt, G 2016, Evaluation of sparse linear algebra operations in Trilinos. in G Stefanou, V Papadopoulos, V Plevris & M Papadrakakis (eds), ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering. ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering, vol. 1, National Technical University of Athens, pp. 1381-1391, 7th European Congress on Computational Methods in Applied Sciences and Engineering, ECCOMAS Congress 2016, Crete, Greece, 5 Jun 2016. https://doi.org/10.7712/100016.1893.11500

Siahatgar, M., & Von Voigt, G. (2016). Evaluation of sparse linear algebra operations in Trilinos. In G. Stefanou, V. Papadopoulos, V. Plevris, & M. Papadrakakis (Eds.), ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering (pp. 1381-1391). (ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering; Vol. 1). National Technical University of Athens. https://doi.org/10.7712/100016.1893.11500

Siahatgar M, Von Voigt G. Evaluation of sparse linear algebra operations in Trilinos. In Stefanou G, Papadopoulos V, Plevris V, Papadrakakis M, editors, ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering. National Technical University of Athens. 2016. p. 1381-1391. (ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering). doi: 10.7712/100016.1893.11500

Siahatgar, Mohammad ; Von Voigt, Gabriele. / Evaluation of sparse linear algebra operations in Trilinos. ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering. editor / G. Stefanou ; V. Papadopoulos ; V. Plevris ; M. Papadrakakis. National Technical University of Athens, 2016. pp. 1381-1391 (ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering).

@inproceedings{45aa3e9029f547d9bca4e4952fb90a0f,

title = "Evaluation of sparse linear algebra operations in Trilinos",

abstract = "The performance of numerous scientific libraries and applications depends heavily on the efficiency of sparse linear algebra operations. In this paper, we survey the performance of several parallel sparse vector and matrix kernels provided in the Trilinos framework on the Cray XC30/40 and IBM Blue Gene/Q supercomputer systems. The linear algebra operations in Trilinos are handled by one of two packages, Epetra or Tpetra. While the former is the most widely used, the latter is the target of future developments and supports larger-scale problems as well as shared-memory parallelism. We compare the results obtained from both packages, using both MPI-only and hybrid solutions. The hybrid parallelism is managed by the Kokkos package, which aims for performance portability across different architectures. We report the efficiency of a single node of the system and demonstrate the scalability of the benchmarks on up to 38,400 cores of the HLRN-III systems. Furthermore, for the Intel processors used in the Cray system, we present measurements of the energy consumption of the kernels and compare the Energy-to-Solution between different compilers and parallel programming paradigms. In addition, we discuss how linking the vendor-provided libraries, rather than the user-compiled versions, affects performance and energy consumption. These extensive comparisons, obtained on top-performing supercomputer systems, give users and developers a starting point for determining an optimal development strategy.",

keywords = "Energy consumption, HPC, Performance evaluation, Sparse algebra, Trilinos",

author = "Mohammad Siahatgar and {Von Voigt}, Gabriele",

note = "Funding Information: This work has been funded by the European Research Council under the FP7 NUMEXAS project under grant agreement 611636. The authors gratefully acknowledge the Gauss Centre for Supercomputing (GCS) for providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS share of the supercomputer JUQUEEN [16] at J{\"u}lich Supercomputing Centre (JSC). GCS is the alliance of the three national supercomputing centers HLRS (Universit{\"a}t Stuttgart), JSC (Forschungszentrum J{\"u}lich), and LRZ (Bayerische Akademie der Wissenschaften), funded by the German Federal Ministry of Education and Research (BMBF) and the German State Ministries for Research of Baden-W{\"u}rttemberg (MWK), Bayern (StMWFK) and Nordrhein-Westfalen (MIWF). The authors would like to thank the anonymous referee for the comments.; 7th European Congress on Computational Methods in Applied Sciences and Engineering, ECCOMAS Congress 2016 ; Conference date: 05-06-2016 Through 10-06-2016",

year = "2016",

doi = "10.7712/100016.1893.11500",

language = "English",

series = "ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering",

publisher = "National Technical University of Athens",

pages = "1381--1391",

editor = "G. Stefanou and V. Papadopoulos and V. Plevris and M. Papadrakakis",

booktitle = "ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering",

address = "Greece",

}

TY - GEN

T1 - Evaluation of sparse linear algebra operations in Trilinos

AU - Siahatgar, Mohammad

AU - Von Voigt, Gabriele

N1 - Funding Information: This work has been funded by the European Research Council under the FP7 NUMEXAS project under grant agreement 611636. The authors gratefully acknowledge the Gauss Centre for Supercomputing (GCS) for providing computing time through the John von Neumann Institute for Computing (NIC) on the GCS share of the supercomputer JUQUEEN [16] at Jülich Supercomputing Centre (JSC). GCS is the alliance of the three national supercomputing centers HLRS (Universität Stuttgart), JSC (Forschungszentrum Jülich), and LRZ (Bayerische Akademie der Wissenschaften), funded by the German Federal Ministry of Education and Research (BMBF) and the German State Ministries for Research of Baden-Württemberg (MWK), Bayern (StMWFK) and Nordrhein-Westfalen (MIWF). The authors would like to thank the anonymous referee for the comments.

PY - 2016

Y1 - 2016

N2 - The performance of numerous scientific libraries and applications depends heavily on the efficiency of sparse linear algebra operations. In this paper, we survey the performance of several parallel sparse vector and matrix kernels provided in the Trilinos framework on the Cray XC30/40 and IBM Blue Gene/Q supercomputer systems. The linear algebra operations in Trilinos are handled by one of two packages, Epetra or Tpetra. While the former is the most widely used, the latter is the target of future developments and supports larger-scale problems as well as shared-memory parallelism. We compare the results obtained from both packages, using both MPI-only and hybrid solutions. The hybrid parallelism is managed by the Kokkos package, which aims for performance portability across different architectures. We report the efficiency of a single node of the system and demonstrate the scalability of the benchmarks on up to 38,400 cores of the HLRN-III systems. Furthermore, for the Intel processors used in the Cray system, we present measurements of the energy consumption of the kernels and compare the Energy-to-Solution between different compilers and parallel programming paradigms. In addition, we discuss how linking the vendor-provided libraries, rather than the user-compiled versions, affects performance and energy consumption. These extensive comparisons, obtained on top-performing supercomputer systems, give users and developers a starting point for determining an optimal development strategy.

AB - The performance of numerous scientific libraries and applications depends heavily on the efficiency of sparse linear algebra operations. In this paper, we survey the performance of several parallel sparse vector and matrix kernels provided in the Trilinos framework on the Cray XC30/40 and IBM Blue Gene/Q supercomputer systems. The linear algebra operations in Trilinos are handled by one of two packages, Epetra or Tpetra. While the former is the most widely used, the latter is the target of future developments and supports larger-scale problems as well as shared-memory parallelism. We compare the results obtained from both packages, using both MPI-only and hybrid solutions. The hybrid parallelism is managed by the Kokkos package, which aims for performance portability across different architectures. We report the efficiency of a single node of the system and demonstrate the scalability of the benchmarks on up to 38,400 cores of the HLRN-III systems. Furthermore, for the Intel processors used in the Cray system, we present measurements of the energy consumption of the kernels and compare the Energy-to-Solution between different compilers and parallel programming paradigms. In addition, we discuss how linking the vendor-provided libraries, rather than the user-compiled versions, affects performance and energy consumption. These extensive comparisons, obtained on top-performing supercomputer systems, give users and developers a starting point for determining an optimal development strategy.

KW - Energy consumption

KW - HPC

KW - Performance evaluation

KW - Sparse algebra

KW - Trilinos

UR - http://www.scopus.com/inward/record.url?scp=84995468920&partnerID=8YFLogxK

U2 - 10.7712/100016.1893.11500

DO - 10.7712/100016.1893.11500

M3 - Conference contribution

AN - SCOPUS:84995468920

T3 - ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering

SP - 1381

EP - 1391

BT - ECCOMAS Congress 2016 - Proceedings of the 7th European Congress on Computational Methods in Applied Sciences and Engineering

A2 - Stefanou, G.

A2 - Papadopoulos, V.

A2 - Plevris, V.

A2 - Papadrakakis, M.

PB - National Technical University of Athens

T2 - 7th European Congress on Computational Methods in Applied Sciences and Engineering, ECCOMAS Congress 2016

Y2 - 5 June 2016 through 10 June 2016

ER -

By the same author(s)

Evolutionary image simplification for lung nodule classification with convolutional neural networks

Evolutionary structure minimization of deep neural networks for motion sensor data

Multi-Stage Deep Learning for Context-Free Handwriting Recognition

Comparison of statistical learning approaches for cerebral aneurysm rupture assessment

National Nodes: Getting organised; how far are we?