Self-adaptive Executors for Big Data Processing

Publication: Contribution to book/report/anthology/conference proceedings › Paper in conference proceedings › Research › Peer review

Authors

  • Sobhan Omranian Khorasani
  • Jan Rellermeyer
  • Dick Epema

External organisations

  • Delft University of Technology

Details

Original language: English
Title of host publication: Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference
Publisher: Association for Computing Machinery (ACM)
Pages: 176-188
Number of pages: 13
ISBN (electronic): 9781450370097
ISBN (print): 9781450370097
Publication status: Published - 13 Sept. 2019
Published externally: Yes
Event: ACM/IFIP 20th International Middleware Conference - UC Davis, United States
Duration: 9 Dec. 2019 - 13 Dec. 2019

Publication series

Name: Proceedings of the 20th International Middleware Conference

Abstract

The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time, especially in I/O-intensive applications such as Terasort and PageRank, which see a 34% and a 54% reduction in runtime.
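The mechanism described in the abstract, an executor that continuously monitors system resources and resizes its own thread pool when it detects I/O contention, can be illustrated with a small self-contained sketch. The Java fragment below is not the Spark-based implementation evaluated in the paper: the class name AdaptiveExecutorSketch, the one-second monitoring interval, the 0.40/0.10 iowait thresholds, and the sampleIoWait() stub are all assumptions made purely for illustration.

import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only: a worker pool that shrinks when a (simulated) I/O-wait
// signal indicates contention and grows back when the contention subsides.
public class AdaptiveExecutorSketch {
    private static final int MAX_THREADS = Runtime.getRuntime().availableProcessors();
    private static final int MIN_THREADS = 1;
    private static final double HIGH_IOWAIT = 0.40; // assumed contention threshold
    private static final double LOW_IOWAIT = 0.10;  // assumed relaxation threshold

    // Worker pool whose size is adapted at runtime.
    private final ThreadPoolExecutor pool = new ThreadPoolExecutor(
            MAX_THREADS, MAX_THREADS, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>());

    // Separate single-threaded monitor that periodically samples resource usage.
    private final ScheduledExecutorService monitor = Executors.newSingleThreadScheduledExecutor();

    public void start() {
        monitor.scheduleAtFixedRate(this::adapt, 1, 1, TimeUnit.SECONDS);
    }

    public void submit(Runnable task) {
        pool.execute(task);
    }

    private void adapt() {
        double ioWait = sampleIoWait();
        int current = pool.getCorePoolSize();
        if (ioWait > HIGH_IOWAIT && current > MIN_THREADS) {
            // Under I/O contention: shrink by one thread (core first, then max,
            // so the invariant core <= max always holds).
            pool.setCorePoolSize(current - 1);
            pool.setMaximumPoolSize(current - 1);
        } else if (ioWait < LOW_IOWAIT && current < MAX_THREADS) {
            // Contention subsided: grow back by one thread (max first, then core).
            pool.setMaximumPoolSize(current + 1);
            pool.setCorePoolSize(current + 1);
        }
    }

    private double sampleIoWait() {
        // Placeholder probe: a real monitor would read the iowait fraction from
        // /proc/stat on Linux or an equivalent OS interface.
        return 0.0;
    }
}

Adjusting by a single thread per monitoring interval lets the pool size converge gradually rather than oscillating; the actual adaptation policy and the resource signals used in the paper may differ.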

ASJC Scopus subject areas

Cite this

Self-adaptive Executors for Big Data Processing. / Omranian Khorasani, Sobhan; Rellermeyer, Jan; Epema, Dick.
Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. pp. 176-188 (Proceedings of the 20th International Middleware Conference).


Omranian Khorasani, S, Rellermeyer, J & Epema, D 2019, Self-adaptive Executors for Big Data Processing. in Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Proceedings of the 20th International Middleware Conference, Association for Computing Machinery (ACM), pp. 176-188, ACM/IFIP 20th International Middleware Conference, United States, 9 Dec. 2019. https://doi.org/10.1145/3361525.3361545
Omranian Khorasani, S., Rellermeyer, J., & Epema, D. (2019). Self-adaptive Executors for Big Data Processing. In Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference (pp. 176-188). (Proceedings of the 20th International Middleware Conference). Association for Computing Machinery (ACM). https://doi.org/10.1145/3361525.3361545
Omranian Khorasani S, Rellermeyer J, Epema D. Self-adaptive Executors for Big Data Processing. in Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM). 2019. p. 176-188. (Proceedings of the 20th International Middleware Conference). doi: 10.1145/3361525.3361545
Omranian Khorasani, Sobhan ; Rellermeyer, Jan ; Epema, Dick. / Self-adaptive Executors for Big Data Processing. Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference. Association for Computing Machinery (ACM), 2019. pp. 176-188 (Proceedings of the 20th International Middleware Conference).
BibTeX
@inproceedings{987b5ac65d58481e9bc34bb267b8fbe1,
title = "Self-adaptive Executors for Big Data Processing",
abstract = "The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.",
keywords = "Apache Spark, Big Data, Self-Adaptive Executors",
author = "{Omranian Khorasani}, Sobhan and Jan Rellermeyer and Dick Epema",
note = "Publisher Copyright: {\textcopyright} 2019 Association for Computing Machinery.; ACM/IFIP 20th International Middleware Conference ; Conference date: 09-12-2019 Through 13-12-2019",
year = "2019",
month = sep,
day = "13",
doi = "10.1145/3361525.3361545",
language = "English",
isbn = "9781450370097",
series = "Proceedings of the 20th International Middleware Conference",
publisher = "Association for Computing Machinery (ACM)",
pages = "176--188",
booktitle = "Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference",
address = "United States",

}

RIS

TY - GEN

T1 - Self-adaptive Executors for Big Data Processing

AU - Omranian Khorasani, Sobhan

AU - Rellermeyer, Jan

AU - Epema, Dick

N1 - Publisher Copyright: © 2019 Association for Computing Machinery.

PY - 2019/9/13

Y1 - 2019/9/13

N2 - The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.

AB - The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors which are able to continuously monitor the underlying system resources and detect contentions. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time especially in I/O intensive applications such as Terasort and PageRank which see a 34% and 54% reduction in runtime.

KW - Apache Spark

KW - Big Data

KW - Self-Adaptive Executors

UR - http://www.scopus.com/inward/record.url?scp=85078060450&partnerID=8YFLogxK

U2 - 10.1145/3361525.3361545

DO - 10.1145/3361525.3361545

M3 - Conference contribution

SN - 9781450370097

T3 - Proceedings of the 20th International Middleware Conference

SP - 176

EP - 188

BT - Middleware 2019 - Proceedings of the 2019 20th International Middleware Conference

PB - Association for Computing Machinery (ACM)

T2 - ACM/IFIP 20th International Middleware Conference

Y2 - 9 December 2019 through 13 December 2019

ER -
