Details
Original language | English |
---|---|
Title of host publication | Proceedings of the 2024 IEEE Conference on Games, CoG 2024 |
Publisher | IEEE Computer Society |
ISBN (electronic) | 9798350350678 |
ISBN (print) | 979-8-3503-5068-5 |
Publication status | Published - 5 Aug 2024 |
Event | 6th Annual IEEE Conference on Games, CoG 2024 - Milan, Italy. Duration: 5 Aug 2024 → 8 Aug 2024 |
Publication series
Name | IEEE Conference on Computational Intelligence and Games, CIG |
---|---|
ISSN (print) | 2325-4270 |
ISSN (electronic) | 2325-4289 |
Abstract
One of the notorious issues in Reinforcement Learning (RL) is poor sample efficiency. Compared to single-agent RL, sample efficiency in Multi-Agent Reinforcement Learning (MARL) is more challenging because of its inherent partial observability, non-stationary training, and enormous strategy space. While much effort has been devoted to developing new methods for enhancing sample efficiency, we instead examine the widely used episodic training mechanism: in each training step, tens of frames are collected, but only one gradient step is made. We argue that this episodic training could be a source of poor sample efficiency. To better exploit the data already collected, we propose to increase the frequency of gradient updates per environment interaction (a.k.a. Replay Ratio or Update-To-Data ratio). To show its generality, we evaluate 3 MARL methods on 6 SMAC tasks. The empirical results validate that a higher replay ratio significantly improves the sample efficiency of MARL algorithms. The code to reproduce the results presented in this paper is open-sourced at https://github.com/egg-west/rr-for-MARL.
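The replay-ratio idea described in the abstract can be illustrated with a minimal sketch: instead of one gradient update per environment interaction, perform `replay_ratio` updates on minibatches sampled from the buffer. All names below (`env_step`, `update`, the batch size) are illustrative placeholders, not taken from the paper's released code.

```python
# Minimal sketch of training with a replay ratio > 1: each environment
# interaction is followed by several gradient updates on replayed data.
import random

def train(env_step, update, buffer, total_steps, replay_ratio):
    """Run `total_steps` environment interactions; after each one,
    perform `replay_ratio` gradient updates on sampled minibatches.
    Returns the total number of gradient updates performed."""
    updates = 0
    for _ in range(total_steps):
        buffer.append(env_step())          # collect one transition
        for _ in range(replay_ratio):      # reuse collected data more often
            batch = random.sample(buffer, min(len(buffer), 32))
            update(batch)                  # one gradient step on the batch
            updates += 1
    return updates
```

For example, with `replay_ratio=4`, 100 interactions yield 400 gradient updates, so the learner extracts more signal from the same amount of collected experience.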
ASJC Scopus subject areas
- Computer Science (all)
- Artificial Intelligence
- Computer Science (all)
- Computer Graphics and Computer-Aided Design
- Computer Science (all)
- Computer Vision and Pattern Recognition
- Computer Science (all)
- Human-Computer Interaction
- Computer Science (all)
- Software
Cite
- Standard
- Harvard
- APA
- Vancouver
- BibTeX
- RIS
Proceedings of the 2024 IEEE Conference on Games, CoG 2024. IEEE Computer Society, 2024. (IEEE Conference on Computational Intelligence and Games, CIG).
Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer-review
TY - GEN
T1 - Higher Replay Ratio Empowers Sample-Efficient Multi-Agent Reinforcement Learning
AU - Xu, Linjie
AU - Liu, Zichuan
AU - Dockhorn, Alexander
AU - Perez-Liebana, Diego
AU - Wang, Jinyu
AU - Song, Lei
AU - Bian, Jiang
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024/8/5
Y1 - 2024/8/5
N2 - One of the notorious issues in Reinforcement Learning (RL) is poor sample efficiency. Compared to single-agent RL, sample efficiency in Multi-Agent Reinforcement Learning (MARL) is more challenging because of its inherent partial observability, non-stationary training, and enormous strategy space. While much effort has been devoted to developing new methods for enhancing sample efficiency, we instead examine the widely used episodic training mechanism: in each training step, tens of frames are collected, but only one gradient step is made. We argue that this episodic training could be a source of poor sample efficiency. To better exploit the data already collected, we propose to increase the frequency of gradient updates per environment interaction (a.k.a. Replay Ratio or Update-To-Data ratio). To show its generality, we evaluate 3 MARL methods on 6 SMAC tasks. The empirical results validate that a higher replay ratio significantly improves the sample efficiency of MARL algorithms. The code to reproduce the results presented in this paper is open-sourced at https://github.com/egg-west/rr-for-MARL.
AB - One of the notorious issues in Reinforcement Learning (RL) is poor sample efficiency. Compared to single-agent RL, sample efficiency in Multi-Agent Reinforcement Learning (MARL) is more challenging because of its inherent partial observability, non-stationary training, and enormous strategy space. While much effort has been devoted to developing new methods for enhancing sample efficiency, we instead examine the widely used episodic training mechanism: in each training step, tens of frames are collected, but only one gradient step is made. We argue that this episodic training could be a source of poor sample efficiency. To better exploit the data already collected, we propose to increase the frequency of gradient updates per environment interaction (a.k.a. Replay Ratio or Update-To-Data ratio). To show its generality, we evaluate 3 MARL methods on 6 SMAC tasks. The empirical results validate that a higher replay ratio significantly improves the sample efficiency of MARL algorithms. The code to reproduce the results presented in this paper is open-sourced at https://github.com/egg-west/rr-for-MARL.
KW - Multi-Agent Reinforcement Learning
KW - Reinforcement Learning
KW - Sample efficiency
KW - Starcraft II
UR - http://www.scopus.com/inward/record.url?scp=85203526586&partnerID=8YFLogxK
U2 - 10.48550/arXiv.2404.09715
DO - 10.48550/arXiv.2404.09715
M3 - Conference contribution
AN - SCOPUS:85203526586
SN - 979-8-3503-5068-5
T3 - IEEE Conference on Computational Intelligence and Games, CIG
BT - Proceedings of the 2024 IEEE Conference on Games, CoG 2024
PB - IEEE Computer Society
T2 - 6th Annual IEEE Conference on Games, CoG 2024
Y2 - 5 August 2024 through 8 August 2024
ER -