Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems

Publikation: Beitrag in Buch/Bericht/Sammelwerk/KonferenzbandAufsatz in KonferenzbandForschungPeer-Review

Autoren

Organisationseinheiten

Forschungs-netzwerk anzeigen

Details

OriginalspracheEnglisch
Titel des SammelwerksINFOCOM 2020
Untertitel IEEE Conference on Computer Communications
Herausgeber (Verlag)Institute of Electrical and Electronics Engineers Inc.
Seiten1063-1072
Seitenumfang10
ISBN (elektronisch)9781728164120
ISBN (Print)978-1-7281-6413-7
PublikationsstatusVeröffentlicht - 2020
Veranstaltung38th IEEE Conference on Computer Communications, INFOCOM 2020 - Toronto, Kanada
Dauer: 6 Juli 20209 Juli 2020

Publikationsreihe

NameProceedings - IEEE INFOCOM
Band2020-July
ISSN (Print)0743-166X

Abstract

Models of parallel processing systems typically assume that one has l servers and jobs are split into an equal number of k = l tasks. This seemingly simple approximation has surprisingly large consequences for the resulting stability and performance bounds. In reality, best practices for modern mapreduce systems indicate that a job's partitioning factor should be much larger than the number of servers available, with some researchers going to far as to advocate for a tiny tasks regime, where jobs are split into over 10,000 tasks. In this paper we use recent advances in stochastic network calculus to fundamentally understand the effects of task granularity on parallel systems' scaling, stability, and performance. For the split-merge model, we show that when one allows for tiny tasks, the stability region is actually much better than had previously been concluded. For the single-queue fork-join model, we show that sojourn times quickly approach the optimal case when l big tasks are subdivided into k≫ l tiny tasks. Our results are validated using extensive simulations, and the applicability of the models used is validated by experiments on an Apache Spark cluster.

ASJC Scopus Sachgebiete

Zitieren

Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems. / Fidler, Markus; Walker, Brenton; Bora, Stefan.
INFOCOM 2020 : IEEE Conference on Computer Communications. Institute of Electrical and Electronics Engineers Inc., 2020. S. 1063-1072 9155368 (Proceedings - IEEE INFOCOM; Band 2020-July).

Publikation: Beitrag in Buch/Bericht/Sammelwerk/KonferenzbandAufsatz in KonferenzbandForschungPeer-Review

Fidler, M, Walker, B & Bora, S 2020, Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems. in INFOCOM 2020 : IEEE Conference on Computer Communications., 9155368, Proceedings - IEEE INFOCOM, Bd. 2020-July, Institute of Electrical and Electronics Engineers Inc., S. 1063-1072, 38th IEEE Conference on Computer Communications, INFOCOM 2020, Toronto, Kanada, 6 Juli 2020. https://doi.org/10.1109/infocom41043.2020.9155368
Fidler, M., Walker, B., & Bora, S. (2020). Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems. In INFOCOM 2020 : IEEE Conference on Computer Communications (S. 1063-1072). Artikel 9155368 (Proceedings - IEEE INFOCOM; Band 2020-July). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/infocom41043.2020.9155368
Fidler M, Walker B, Bora S. Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems. in INFOCOM 2020 : IEEE Conference on Computer Communications. Institute of Electrical and Electronics Engineers Inc. 2020. S. 1063-1072. 9155368. (Proceedings - IEEE INFOCOM). doi: 10.1109/infocom41043.2020.9155368
Fidler, Markus ; Walker, Brenton ; Bora, Stefan. / Tiny Tasks : A Remedy for Synchronization Constraints in Multi-Server Systems. INFOCOM 2020 : IEEE Conference on Computer Communications. Institute of Electrical and Electronics Engineers Inc., 2020. S. 1063-1072 (Proceedings - IEEE INFOCOM).
Download
@inproceedings{62e97c90f40840ebaee7ebe5c55f8030,
title = "Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems",
abstract = "Models of parallel processing systems typically assume that one has l servers and jobs are split into an equal number of k = l tasks. This seemingly simple approximation has surprisingly large consequences for the resulting stability and performance bounds. In reality, best practices for modern mapreduce systems indicate that a job's partitioning factor should be much larger than the number of servers available, with some researchers going to far as to advocate for a tiny tasks regime, where jobs are split into over 10,000 tasks. In this paper we use recent advances in stochastic network calculus to fundamentally understand the effects of task granularity on parallel systems' scaling, stability, and performance. For the split-merge model, we show that when one allows for tiny tasks, the stability region is actually much better than had previously been concluded. For the single-queue fork-join model, we show that sojourn times quickly approach the optimal case when l big tasks are subdivided into k≫ l tiny tasks. Our results are validated using extensive simulations, and the applicability of the models used is validated by experiments on an Apache Spark cluster.",
author = "Markus Fidler and Brenton Walker and Stefan Bora",
note = "Funding information: This work was supported in part by the German Research Council (DFG) under Grant VaMoS (FI 1236/7-1).; 38th IEEE Conference on Computer Communications, INFOCOM 2020 ; Conference date: 06-07-2020 Through 09-07-2020",
year = "2020",
doi = "10.1109/infocom41043.2020.9155368",
language = "English",
isbn = "978-1-7281-6413-7",
series = "Proceedings - IEEE INFOCOM",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "1063--1072",
booktitle = "INFOCOM 2020",
address = "United States",

}

Download

TY - GEN

T1 - Tiny Tasks

T2 - 38th IEEE Conference on Computer Communications, INFOCOM 2020

AU - Fidler, Markus

AU - Walker, Brenton

AU - Bora, Stefan

N1 - Funding information: This work was supported in part by the German Research Council (DFG) under Grant VaMoS (FI 1236/7-1).

PY - 2020

Y1 - 2020

N2 - Models of parallel processing systems typically assume that one has l servers and jobs are split into an equal number of k = l tasks. This seemingly simple approximation has surprisingly large consequences for the resulting stability and performance bounds. In reality, best practices for modern mapreduce systems indicate that a job's partitioning factor should be much larger than the number of servers available, with some researchers going to far as to advocate for a tiny tasks regime, where jobs are split into over 10,000 tasks. In this paper we use recent advances in stochastic network calculus to fundamentally understand the effects of task granularity on parallel systems' scaling, stability, and performance. For the split-merge model, we show that when one allows for tiny tasks, the stability region is actually much better than had previously been concluded. For the single-queue fork-join model, we show that sojourn times quickly approach the optimal case when l big tasks are subdivided into k≫ l tiny tasks. Our results are validated using extensive simulations, and the applicability of the models used is validated by experiments on an Apache Spark cluster.

AB - Models of parallel processing systems typically assume that one has l servers and jobs are split into an equal number of k = l tasks. This seemingly simple approximation has surprisingly large consequences for the resulting stability and performance bounds. In reality, best practices for modern mapreduce systems indicate that a job's partitioning factor should be much larger than the number of servers available, with some researchers going to far as to advocate for a tiny tasks regime, where jobs are split into over 10,000 tasks. In this paper we use recent advances in stochastic network calculus to fundamentally understand the effects of task granularity on parallel systems' scaling, stability, and performance. For the split-merge model, we show that when one allows for tiny tasks, the stability region is actually much better than had previously been concluded. For the single-queue fork-join model, we show that sojourn times quickly approach the optimal case when l big tasks are subdivided into k≫ l tiny tasks. Our results are validated using extensive simulations, and the applicability of the models used is validated by experiments on an Apache Spark cluster.

UR - http://www.scopus.com/inward/record.url?scp=85090297693&partnerID=8YFLogxK

U2 - 10.1109/infocom41043.2020.9155368

DO - 10.1109/infocom41043.2020.9155368

M3 - Conference contribution

AN - SCOPUS:85090297693

SN - 978-1-7281-6413-7

T3 - Proceedings - IEEE INFOCOM

SP - 1063

EP - 1072

BT - INFOCOM 2020

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 6 July 2020 through 9 July 2020

ER -

Von denselben Autoren