Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints

Markus Fidler; Brenton Walker; Yuming Jiang

doi:10.48550/arXiv.1610.06309

Details

Original language	English
Pages (from-to)	1545-1559
Number of pages	15
Journal	IEEE Transactions on Parallel and Distributed Systems
Volume	29
Issue number	7
Early online date	1 Jan 2018
Publication status	Published - 1 Jul 2018

Abstract

Parallel computing has become a standard tool with architectures such as Google MapReduce, Hadoop, and Spark being broadly used in applications such as data processing and machine learning. Common to these systems are a fork operation, where jobs are first divided into tasks that are processed in parallel, and a join operation where completed tasks wait for the other tasks of the job before leaving the system. The synchronization constraint of the join operation makes the analysis of fork-join systems challenging, and few explicit results are known. In this work, we formulate a max-plus server model for parallel systems which allows us to derive performance bounds for a variety of systems in the GI GI and G G cases. We contribute end-to-end delay bounds for multi-stage fork-join networks. We perform a detailed comparison of different multi-server configurations, including an analysis of single-queue fork-join systems that achieve a fundamental performance gain. We compare these results to both simulation and a live Spark system.

Keywords

Hadoop, MapReduce, Parallel computing, performance analysis, spark, stochastic network calculus

ASJC Scopus subject areas

Computer Science(all)
Signal Processing
Computer Science(all)
Hardware and Architecture
Computer Science(all)
Computational Theory and Mathematics

Cite this

Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints. / Fidler, Markus; Walker, Brenton; Jiang, Yuming.
In: IEEE Transactions on Parallel and Distributed Systems, Vol. 29, No. 7, 01.07.2018, p. 1545-1559.

Research output: Contribution to journal › Article › Research › peer review

Fidler, M, Walker, B & Jiang, Y 2018, 'Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints', IEEE Transactions on Parallel and Distributed Systems, vol. 29, no. 7, pp. 1545-1559. https://doi.org/10.48550/arXiv.1610.06309, https://doi.org/10.1109/TPDS.2017.2779872

Fidler, M., Walker, B., & Jiang, Y. (2018). Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints. IEEE Transactions on Parallel and Distributed Systems, 29(7), 1545-1559. https://doi.org/10.48550/arXiv.1610.06309, https://doi.org/10.1109/TPDS.2017.2779872

Fidler M, Walker B, Jiang Y. Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints. IEEE Transactions on Parallel and Distributed Systems. 2018 Jul 1;29(7):1545-1559. Epub 2018 Jan 1. doi: 10.48550/arXiv.1610.06309, 10.1109/TPDS.2017.2779872

Fidler, Markus ; Walker, Brenton ; Jiang, Yuming. / Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints. In: IEEE Transactions on Parallel and Distributed Systems. 2018 ; Vol. 29, No. 7. pp. 1545-1559.

Download

@article{36efc69ae3a14ed69c29bf6a48c37984,

title = "Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints",

abstract = "Parallel computing has become a standard tool with architectures such as Google MapReduce, Hadoop, and Spark being broadly used in applications such as data processing and machine learning. Common to these systems are a fork operation, where jobs are first divided into tasks that are processed in parallel, and a join operation where completed tasks wait for the other tasks of the job before leaving the system. The synchronization constraint of the join operation makes the analysis of fork-join systems challenging, and few explicit results are known. In this work, we formulate a max-plus server model for parallel systems which allows us to derive performance bounds for a variety of systems in the GI GI and G G cases. We contribute end-to-end delay bounds for multi-stage fork-join networks. We perform a detailed comparison of different multi-server configurations, including an analysis of single-queue fork-join systems that achieve a fundamental performance gain. We compare these results to both simulation and a live Spark system.",

keywords = "Hadoop, MapReduce, Parallel computing, performance analysis, spark, stochastic network calculus",

author = "Markus Fidler and Brenton Walker and Yuming Jiang",

note = "Funding information: This work was supported in part by the European Research Council (ERC) under StG 306644.",

year = "2018",

month = jul,

day = "1",

doi = "10.48550/arXiv.1610.06309",

language = "English",

volume = "29",

pages = "1545--1559",

journal = "IEEE Transactions on Parallel and Distributed Systems",

issn = "1045-9219",

publisher = "IEEE Computer Society",

number = "7",

}

Download

TY - JOUR

T1 - Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints

AU - Fidler, Markus

AU - Walker, Brenton

AU - Jiang, Yuming

N1 - Funding information: This work was supported in part by the European Research Council (ERC) under StG 306644.

PY - 2018/7/1

Y1 - 2018/7/1

N2 - Parallel computing has become a standard tool with architectures such as Google MapReduce, Hadoop, and Spark being broadly used in applications such as data processing and machine learning. Common to these systems are a fork operation, where jobs are first divided into tasks that are processed in parallel, and a join operation where completed tasks wait for the other tasks of the job before leaving the system. The synchronization constraint of the join operation makes the analysis of fork-join systems challenging, and few explicit results are known. In this work, we formulate a max-plus server model for parallel systems which allows us to derive performance bounds for a variety of systems in the GI GI and G G cases. We contribute end-to-end delay bounds for multi-stage fork-join networks. We perform a detailed comparison of different multi-server configurations, including an analysis of single-queue fork-join systems that achieve a fundamental performance gain. We compare these results to both simulation and a live Spark system.

AB - Parallel computing has become a standard tool with architectures such as Google MapReduce, Hadoop, and Spark being broadly used in applications such as data processing and machine learning. Common to these systems are a fork operation, where jobs are first divided into tasks that are processed in parallel, and a join operation where completed tasks wait for the other tasks of the job before leaving the system. The synchronization constraint of the join operation makes the analysis of fork-join systems challenging, and few explicit results are known. In this work, we formulate a max-plus server model for parallel systems which allows us to derive performance bounds for a variety of systems in the GI GI and G G cases. We contribute end-to-end delay bounds for multi-stage fork-join networks. We perform a detailed comparison of different multi-server configurations, including an analysis of single-queue fork-join systems that achieve a fundamental performance gain. We compare these results to both simulation and a live Spark system.

KW - Hadoop

KW - MapReduce

KW - Parallel computing

KW - performance analysis

KW - spark

KW - stochastic network calculus

UR - http://www.scopus.com/inward/record.url?scp=85040042465&partnerID=8YFLogxK

U2 - 10.48550/arXiv.1610.06309

DO - 10.48550/arXiv.1610.06309

M3 - Article

AN - SCOPUS:85040042465

VL - 29

SP - 1545

EP - 1559

JO - IEEE Transactions on Parallel and Distributed Systems

JF - IEEE Transactions on Parallel and Distributed Systems

SN - 1045-9219

IS - 7

ER -

Research@Leibniz University

Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints

Authors

Research Organisations

External Research Organisations

Details

Abstract

Keywords

ASJC Scopus subject areas

Cite this

By the same author(s)

Age- and deviation-of-information of hybrid time- and event-triggered systems: What matters more, determinism or resource conservation?

Statistical Age-of-Information Bounds for Parallel Systems: When Do Independent Channels Make a Difference?

The Tiny-Tasks Granularity Trade-Off: Balancing Overhead Versus Performance in Parallel Systems

A Min-plus Model of Age-of-Information with Worst-case and Statistical Bounds

Performance and Scaling of Parallel Systems with Blocking Start and/or Departure Barriers