Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems

Markus Fidler; Brenton Walker; Stefan Bora

doi:10.1109/infocom41043.2020.9155368

Details

Original language	English
Title of host publication	INFOCOM 2020
Subtitle of host publication	IEEE Conference on Computer Communications
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	1063-1072
Number of pages	10
ISBN (electronic)	9781728164120
ISBN (print)	978-1-7281-6413-7
Publication status	Published - 2020
Event	38th IEEE Conference on Computer Communications, INFOCOM 2020 - Toronto, Canada Duration: 6 Jul 2020 → 9 Jul 2020

Publication series

Name	Proceedings - IEEE INFOCOM
Volume	2020-July
ISSN (Print)	0743-166X

Abstract

Models of parallel processing systems typically assume that one has l servers and jobs are split into an equal number of k = l tasks. This seemingly simple approximation has surprisingly large consequences for the resulting stability and performance bounds. In reality, best practices for modern mapreduce systems indicate that a job's partitioning factor should be much larger than the number of servers available, with some researchers going to far as to advocate for a tiny tasks regime, where jobs are split into over 10,000 tasks. In this paper we use recent advances in stochastic network calculus to fundamentally understand the effects of task granularity on parallel systems' scaling, stability, and performance. For the split-merge model, we show that when one allows for tiny tasks, the stability region is actually much better than had previously been concluded. For the single-queue fork-join model, we show that sojourn times quickly approach the optimal case when l big tasks are subdivided into k≫ l tiny tasks. Our results are validated using extensive simulations, and the applicability of the models used is validated by experiments on an Apache Spark cluster.

ASJC Scopus subject areas

Computer Science(all)
Engineering(all)
Electrical and Electronic Engineering

Cite this

Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems. / Fidler, Markus; Walker, Brenton; Bora, Stefan.
INFOCOM 2020 : IEEE Conference on Computer Communications. Institute of Electrical and Electronics Engineers Inc., 2020. p. 1063-1072 9155368 (Proceedings - IEEE INFOCOM; Vol. 2020-July).

Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review

Fidler, M, Walker, B & Bora, S 2020, Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems. in INFOCOM 2020 : IEEE Conference on Computer Communications., 9155368, Proceedings - IEEE INFOCOM, vol. 2020-July, Institute of Electrical and Electronics Engineers Inc., pp. 1063-1072, 38th IEEE Conference on Computer Communications, INFOCOM 2020, Toronto, Canada, 6 Jul 2020. https://doi.org/10.1109/infocom41043.2020.9155368

Fidler, M., Walker, B., & Bora, S. (2020). Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems. In INFOCOM 2020 : IEEE Conference on Computer Communications (pp. 1063-1072). Article 9155368 (Proceedings - IEEE INFOCOM; Vol. 2020-July). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/infocom41043.2020.9155368

Fidler M, Walker B, Bora S. Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems. In INFOCOM 2020 : IEEE Conference on Computer Communications. Institute of Electrical and Electronics Engineers Inc. 2020. p. 1063-1072. 9155368. (Proceedings - IEEE INFOCOM). doi: 10.1109/infocom41043.2020.9155368

Fidler, Markus ; Walker, Brenton ; Bora, Stefan. / Tiny Tasks : A Remedy for Synchronization Constraints in Multi-Server Systems. INFOCOM 2020 : IEEE Conference on Computer Communications. Institute of Electrical and Electronics Engineers Inc., 2020. pp. 1063-1072 (Proceedings - IEEE INFOCOM).

Download

@inproceedings{62e97c90f40840ebaee7ebe5c55f8030,

title = "Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems",

abstract = "Models of parallel processing systems typically assume that one has l servers and jobs are split into an equal number of k = l tasks. This seemingly simple approximation has surprisingly large consequences for the resulting stability and performance bounds. In reality, best practices for modern mapreduce systems indicate that a job's partitioning factor should be much larger than the number of servers available, with some researchers going to far as to advocate for a tiny tasks regime, where jobs are split into over 10,000 tasks. In this paper we use recent advances in stochastic network calculus to fundamentally understand the effects of task granularity on parallel systems' scaling, stability, and performance. For the split-merge model, we show that when one allows for tiny tasks, the stability region is actually much better than had previously been concluded. For the single-queue fork-join model, we show that sojourn times quickly approach the optimal case when l big tasks are subdivided into k≫ l tiny tasks. Our results are validated using extensive simulations, and the applicability of the models used is validated by experiments on an Apache Spark cluster.",

author = "Markus Fidler and Brenton Walker and Stefan Bora",

note = "Funding information: This work was supported in part by the German Research Council (DFG) under Grant VaMoS (FI 1236/7-1).; 38th IEEE Conference on Computer Communications, INFOCOM 2020 ; Conference date: 06-07-2020 Through 09-07-2020",

year = "2020",

doi = "10.1109/infocom41043.2020.9155368",

language = "English",

isbn = "978-1-7281-6413-7",

series = "Proceedings - IEEE INFOCOM",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "1063--1072",

booktitle = "INFOCOM 2020",

address = "United States",

}

Download

TY - GEN

T1 - Tiny Tasks

T2 - 38th IEEE Conference on Computer Communications, INFOCOM 2020

AU - Fidler, Markus

AU - Walker, Brenton

AU - Bora, Stefan

N1 - Funding information: This work was supported in part by the German Research Council (DFG) under Grant VaMoS (FI 1236/7-1).

PY - 2020

Y1 - 2020

N2 - Models of parallel processing systems typically assume that one has l servers and jobs are split into an equal number of k = l tasks. This seemingly simple approximation has surprisingly large consequences for the resulting stability and performance bounds. In reality, best practices for modern mapreduce systems indicate that a job's partitioning factor should be much larger than the number of servers available, with some researchers going to far as to advocate for a tiny tasks regime, where jobs are split into over 10,000 tasks. In this paper we use recent advances in stochastic network calculus to fundamentally understand the effects of task granularity on parallel systems' scaling, stability, and performance. For the split-merge model, we show that when one allows for tiny tasks, the stability region is actually much better than had previously been concluded. For the single-queue fork-join model, we show that sojourn times quickly approach the optimal case when l big tasks are subdivided into k≫ l tiny tasks. Our results are validated using extensive simulations, and the applicability of the models used is validated by experiments on an Apache Spark cluster.

AB - Models of parallel processing systems typically assume that one has l servers and jobs are split into an equal number of k = l tasks. This seemingly simple approximation has surprisingly large consequences for the resulting stability and performance bounds. In reality, best practices for modern mapreduce systems indicate that a job's partitioning factor should be much larger than the number of servers available, with some researchers going to far as to advocate for a tiny tasks regime, where jobs are split into over 10,000 tasks. In this paper we use recent advances in stochastic network calculus to fundamentally understand the effects of task granularity on parallel systems' scaling, stability, and performance. For the split-merge model, we show that when one allows for tiny tasks, the stability region is actually much better than had previously been concluded. For the single-queue fork-join model, we show that sojourn times quickly approach the optimal case when l big tasks are subdivided into k≫ l tiny tasks. Our results are validated using extensive simulations, and the applicability of the models used is validated by experiments on an Apache Spark cluster.

UR - http://www.scopus.com/inward/record.url?scp=85090297693&partnerID=8YFLogxK

U2 - 10.1109/infocom41043.2020.9155368

DO - 10.1109/infocom41043.2020.9155368

M3 - Conference contribution

AN - SCOPUS:85090297693

SN - 978-1-7281-6413-7

T3 - Proceedings - IEEE INFOCOM

SP - 1063

EP - 1072

BT - INFOCOM 2020

PB - Institute of Electrical and Electronics Engineers Inc.

Y2 - 6 July 2020 through 9 July 2020

ER -

Research@Leibniz University

Tiny Tasks: A Remedy for Synchronization Constraints in Multi-Server Systems

Authors

Research Organisations

Details

Publication series

Abstract

ASJC Scopus subject areas

Cite this

By the same author(s)

Age- and deviation-of-information of hybrid time- and event-triggered systems: What matters more, determinism or resource conservation?

Statistical Age-of-Information Bounds for Parallel Systems: When Do Independent Channels Make a Difference?

The Tiny-Tasks Granularity Trade-Off: Balancing Overhead Versus Performance in Parallel Systems

A Min-plus Model of Age-of-Information with Worst-case and Statistical Bounds

Performance and Scaling of Parallel Systems with Blocking Start and/or Departure Barriers