Details
Original language | English |
---|---|
Title of host publication | X10 2015: Proceedings of the ACM SIGPLAN Workshop on X10 |
Editors | Jose Nelson Amaral, Olivier Tardieu |
Pages | 13-18 |
Number of pages | 6 |
ISBN (electronic) | 9781450335867 |
Publication status | Published - June 2015 |
Published externally | Yes |
Event | 5th ACM SIGPLAN Workshop on X10, X10 2015 - Portland, United States. Duration: 14 June 2015 → … |
Abstract
In the X10 language, computations are modeled as lightweight threads called activities. Since most operating systems only offer relatively heavyweight kernel-level threads, the X10 runtime system implements a user-space scheduler to map activities to operating-system threads in a many-to-one fashion. This approach can lead to suboptimal scheduling decisions or synchronization overhead. In this paper, we present an alternative X10 runtime system that targets OctoPOS, an operating system designed from the ground up for highly parallel workloads on PGAS architectures. OctoPOS offers an unconventional execution model based on i-lets, lightweight self-contained units of computation with (mostly) run-to-completion semantics that can be dispatched very efficiently. We are able to do a 1-to-1 mapping of X10 activities to i-lets, which results in a slim runtime system, avoiding the need for user-level scheduling and its costs. We perform microbenchmarks on a prototype many-core hardware architecture and show that our system needs fewer than 2000 clock cycles to spawn local and remote activities.
ASJC Scopus subject areas
- General Computer Science
- Computer Graphics and Computer-Aided Design
- Computer Vision and Pattern Recognition
Cite this
X10 2015: Proceedings of the ACM SIGPLAN Workshop on X10. Ed. / Jose Nelson Amaral; Olivier Tardieu. 2015. pp. 13-18.
Publication: Chapter in book/report/anthology/conference proceedings › Conference paper › Research › Peer-reviewed
TY - GEN
T1 - Cutting Out the Middleman: OS-Level Support for X10 Activities
AU - Mohr, Manuel
AU - Buchwald, Sebastian
AU - Zwinkau, Andreas
AU - Erhardt, Christoph
AU - Oechslein, Benjamin
AU - Schedel, Jens
AU - Lohmann, Daniel
PY - 2015/6
Y1 - 2015/6
N2 - In the X10 language, computations are modeled as lightweight threads called activities. Since most operating systems only offer relatively heavyweight kernel-level threads, the X10 runtime system implements a user-space scheduler to map activities to operating-system threads in a many-to-one fashion. This approach can lead to suboptimal scheduling decisions or synchronization overhead. In this paper, we present an alternative X10 runtime system that targets OctoPOS, an operating system designed from the ground up for highly parallel workloads on PGAS architectures. OctoPOS offers an unconventional execution model based on i-lets, lightweight self-contained units of computation with (mostly) run-to-completion semantics that can be dispatched very efficiently. We are able to do a 1-to-1 mapping of X10 activities to i-lets, which results in a slim runtime system, avoiding the need for user-level scheduling and its costs. We perform microbenchmarks on a prototype many-core hardware architecture and show that our system needs fewer than 2000 clock cycles to spawn local and remote activities.
AB - In the X10 language, computations are modeled as lightweight threads called activities. Since most operating systems only offer relatively heavyweight kernel-level threads, the X10 runtime system implements a user-space scheduler to map activities to operating-system threads in a many-to-one fashion. This approach can lead to suboptimal scheduling decisions or synchronization overhead. In this paper, we present an alternative X10 runtime system that targets OctoPOS, an operating system designed from the ground up for highly parallel workloads on PGAS architectures. OctoPOS offers an unconventional execution model based on i-lets, lightweight self-contained units of computation with (mostly) run-to-completion semantics that can be dispatched very efficiently. We are able to do a 1-to-1 mapping of X10 activities to i-lets, which results in a slim runtime system, avoiding the need for user-level scheduling and its costs. We perform microbenchmarks on a prototype many-core hardware architecture and show that our system needs fewer than 2000 clock cycles to spawn local and remote activities.
KW - Invasive Computing
KW - Many-Core Architecture
KW - Operating System
KW - Runtime System
KW - Scheduling
KW - X10
UR - http://www.scopus.com/inward/record.url?scp=84982105822&partnerID=8YFLogxK
U2 - 10.1145/2771774.2771775
DO - 10.1145/2771774.2771775
M3 - Conference contribution
AN - SCOPUS:84982105822
SP - 13
EP - 18
BT - X10 2015: Proceedings of the ACM SIGPLAN Workshop on X10
A2 - Amaral, Jose Nelson
A2 - Tardieu, Olivier
T2 - 5th ACM SIGPLAN Workshop on X10, X10 2015
Y2 - 14 June 2015
ER -