Details
Originalsprache | Englisch |
---|---|
Seiten (von - bis) | 1545-1559 |
Seitenumfang | 15 |
Fachzeitschrift | IEEE Transactions on Parallel and Distributed Systems |
Jahrgang | 29 |
Ausgabenummer | 7 |
Frühes Online-Datum | 1 Jan. 2018 |
Publikationsstatus | Veröffentlicht - 1 Juli 2018 |
Abstract
Parallel computing has become a standard tool with architectures such as Google MapReduce, Hadoop, and Spark being broadly used in applications such as data processing and machine learning. Common to these systems are a fork operation, where jobs are first divided into tasks that are processed in parallel, and a join operation where completed tasks wait for the other tasks of the job before leaving the system. The synchronization constraint of the join operation makes the analysis of fork-join systems challenging, and few explicit results are known. In this work, we formulate a max-plus server model for parallel systems which allows us to derive performance bounds for a variety of systems in the GI GI and G G cases. We contribute end-to-end delay bounds for multi-stage fork-join networks. We perform a detailed comparison of different multi-server configurations, including an analysis of single-queue fork-join systems that achieve a fundamental performance gain. We compare these results to both simulation and a live Spark system.
ASJC Scopus Sachgebiete
- Informatik (insg.)
- Signalverarbeitung
- Informatik (insg.)
- Hardware und Architektur
- Informatik (insg.)
- Theoretische Informatik und Mathematik
Zitieren
- Standard
- Harvard
- Apa
- Vancouver
- BibTex
- RIS
in: IEEE Transactions on Parallel and Distributed Systems, Jahrgang 29, Nr. 7, 01.07.2018, S. 1545-1559.
Publikation: Beitrag in Fachzeitschrift › Artikel › Forschung › Peer-Review
}
TY - JOUR
T1 - Non-Asymptotic Delay Bounds for Multi-Server Systems with Synchronization Constraints
AU - Fidler, Markus
AU - Walker, Brenton
AU - Jiang, Yuming
N1 - Funding information: This work was supported in part by the European Research Council (ERC) under StG 306644.
PY - 2018/7/1
Y1 - 2018/7/1
N2 - Parallel computing has become a standard tool with architectures such as Google MapReduce, Hadoop, and Spark being broadly used in applications such as data processing and machine learning. Common to these systems are a fork operation, where jobs are first divided into tasks that are processed in parallel, and a join operation where completed tasks wait for the other tasks of the job before leaving the system. The synchronization constraint of the join operation makes the analysis of fork-join systems challenging, and few explicit results are known. In this work, we formulate a max-plus server model for parallel systems which allows us to derive performance bounds for a variety of systems in the GI GI and G G cases. We contribute end-to-end delay bounds for multi-stage fork-join networks. We perform a detailed comparison of different multi-server configurations, including an analysis of single-queue fork-join systems that achieve a fundamental performance gain. We compare these results to both simulation and a live Spark system.
AB - Parallel computing has become a standard tool with architectures such as Google MapReduce, Hadoop, and Spark being broadly used in applications such as data processing and machine learning. Common to these systems are a fork operation, where jobs are first divided into tasks that are processed in parallel, and a join operation where completed tasks wait for the other tasks of the job before leaving the system. The synchronization constraint of the join operation makes the analysis of fork-join systems challenging, and few explicit results are known. In this work, we formulate a max-plus server model for parallel systems which allows us to derive performance bounds for a variety of systems in the GI GI and G G cases. We contribute end-to-end delay bounds for multi-stage fork-join networks. We perform a detailed comparison of different multi-server configurations, including an analysis of single-queue fork-join systems that achieve a fundamental performance gain. We compare these results to both simulation and a live Spark system.
KW - Hadoop
KW - MapReduce
KW - Parallel computing
KW - performance analysis
KW - spark
KW - stochastic network calculus
UR - http://www.scopus.com/inward/record.url?scp=85040042465&partnerID=8YFLogxK
U2 - 10.48550/arXiv.1610.06309
DO - 10.48550/arXiv.1610.06309
M3 - Article
AN - SCOPUS:85040042465
VL - 29
SP - 1545
EP - 1559
JO - IEEE Transactions on Parallel and Distributed Systems
JF - IEEE Transactions on Parallel and Distributed Systems
SN - 1045-9219
IS - 7
ER -