Self-adaptive Executors for Big Data Processing

Publication: Contribution to book/report/anthology/conference proceedings › Paper in conference proceedings › Research › Peer review

Authors

  • Sobhan Omranian Khorasani
  • Jan Rellermeyer
  • Dick Epema

External organisations

  • Delft University of Technology

Details

Original language: English
Title of host publication: Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference
Publisher: Association for Computing Machinery (ACM)
Pages: 176-188
Number of pages: 13
ISBN (electronic): 9781450370097
ISBN (print): 9781450370097
Publication status: Published - 13 Sept. 2019
Published externally: Yes
Event: ACM/IFIP 20th International Middleware Conference - UC Davis, United States
Duration: 9 Dec. 2019 - 13 Dec. 2019

Publication series

Name: Proceedings of the 20th International Middleware Conference

Abstract

The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time, especially in I/O-intensive applications such as Terasort and PageRank, which see a 34% and a 54% reduction in runtime.
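The mechanism described in the abstract, an executor that continuously monitors system resources and resizes its own thread pool when it detects I/O contention, can be illustrated with a small self-contained sketch. The Java fragment below is not the Spark-based implementation evaluated in the paper: the class name AdaptiveExecutorSketch, the one-second monitoring interval, the 0.40/0.10 iowait thresholds, and the sampleIoWait() stub are all assumptions made purely for illustration.

import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: a worker pool that shrinks when a (simulated) I/O-wait
// signal indicates contention and grows back when the contention subsides.
public class AdaptiveExecutorSketch {
    private static final int MAX_THREADS = Runtime.getRuntime().availableProcessors();
    private static final int MIN_THREADS = 1;
    private static final double HIGH_IOWAIT = 0.40; // assumed contention threshold
    private static final double LOW_IOWAIT = 0.10;  // assumed relaxation threshold

    // Worker pool whose size is adapted at runtime.
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            MAX_THREADS, MAX_THREADS, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());

    // Separate single-threaded monitor that periodically samples resource usage.
    private final ScheduledExecutorService monitor = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        monitor.scheduleAtFixedRate(this::adapt, 1, 1, TimeUnit.SECONDS);
    }

    public void submit(Runnable task) {
        pool.execute(task);
    }

    private void adapt() {
        double ioWait = sampleIoWait();
        int current = pool.getCorePoolSize();
        if (ioWait > HIGH_IOWAIT && current > MIN_THREADS) {
            // Under I/O contention: shrink by one thread (core first, then max,
            // so the invariant core <= max always holds).
            pool.setCorePoolSize(current - 1);
            pool.setMaximumPoolSize(current - 1);
        } else if (ioWait < LOW_IOWAIT && current < MAX_THREADS) {
            // Contention subsided: grow back by one thread (max first, then core).
            pool.setMaximumPoolSize(current + 1);
            pool.setCorePoolSize(current + 1);
        }
    }

    private double sampleIoWait() {
        // Placeholder probe: a real monitor would read the iowait fraction from
        // /proc/stat on Linux or an equivalent OS interface.
        return 0.0;
    }
}

Adjusting by a single thread per monitoring interval lets the pool size converge gradually rather than oscillating; the actual adaptation policy and the resource signals used in the paper may differ.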

ASJC Scopus subject areas

Cite this

Self-adaptive Executors for Big Data Processing. / Omranian Khorasani, Sobhan; Rellermeyer, Jan; Epema, Dick.
Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. pp. 176-188 (Proceedings of the 20th International Middleware Conference).


Omranian Khorasani, S, Rellermeyer, J & Epema, D 2019, Self-adaptive Executors for Big Data Processing. in Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Proceedings of the 20th International Middleware Conference, Association for Computing Machinery (ACM), pp. 176-188, ACM/IFIP 20th International Middleware Conference, United States, 9 Dec. 2019. https://doi.org/10.1145/3361525.3361545
Omranian Khorasani, S., Rellermeyer, J., & Epema, D. (2019). Self-adaptive Executors for Big Data Processing. In Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference (pp. 176-188). (Proceedings of the 20th International Middleware Conference). Association for Computing Machinery (ACM). https://doi.org/10.1145/3361525.3361545
Omranian Khorasani S, Rellermeyer J, Epema D. Self-adaptive Executors for Big Data Processing. in Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM). 2019. p. 176-188. (Proceedings of the 20th International Middleware Conference). doi: 10.1145/3361525.3361545
Omranian Khorasani, Sobhan ; Rellermeyer, Jan ; Epema, Dick. / Self-adaptive Executors for Big Data Processing. Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. pp. 176-188 (Proceedings of the 20th International Middleware Conference).
BibTeX
@inproceedings{987b5ac65d58481e9bc34bb267b8fbe1,
title = "Self-adaptive Executors for Big Data Processing",
abstract = "The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.",
keywords = "Apache Spark, Big Data, Self-Adaptive Executors",
author = "{Omranian Khorasani}, Sobhan and Jan Rellermeyer and Dick Epema",
note = "Publisher Copyright: {\textcopyright} 2019 Association for Computing Machinery.; ACM/IFIP 20th International Middleware Conference ; Conference date: 09-12-2019 Through 13-12-2019",
year = "2019",
month = sep,
day = "13",
doi = "10.1145/3361525.3361545",
language = "English",
isbn = "9781450370097",
series = "Proceedings of the 20th International Middleware Conference",
publisher = "Association for Computing Machinery (ACM)",
pages = "176--188",
booktitle = "Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference",
address = "United States",

}

RIS

TY - GEN

T1 - Self-adaptive Executors for Big Data Processing

AU - Omranian Khorasani, Sobhan

AU - Rellermeyer, Jan

AU - Epema, Dick

N1 - Publisher Copyright: © 2019 Association for Computing Machinery.

PY - 2019/9/13

Y1 - 2019/9/13

N2 - The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.

AB - The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.

KW - Apache Spark

KW - Big Data

KW - Self-Adaptive Executors

UR - http://www.scopus.com/inward/record.url?scp=85078060450&partnerID=8YFLogxK

U2 - 10.1145/3361525.3361545

DO - 10.1145/3361525.3361545

M3 - Conference contribution

SN - 9781450370097

T3 - Proceedings of the 20th International Middleware Conference

SP - 176

EP - 188

BT - Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference

PB - Association for Computing Machinery (ACM)

T2 - ACM/IFIP 20th International Middleware Conference

Y2 - 9 December 2019 through 13 December 2019

ER -
