Details
| Original language | English |
|---|---|
| Pages (from-to) | 16931–16943 |
| Number of pages | 13 |
| Journal | Neural Computing and Applications |
| Volume | 35 |
| Issue number | 23 |
| Early online date | 5 Dec 2022 |
| Publication status | Published - Aug 2023 |
| Externally published | Yes |
Abstract
Reinforcement learning (RL) has become widely adopted in robot control. Despite many successes, one major persistent problem is its often very low data efficiency. One solution is interactive feedback, which has been shown to speed up RL considerably. As a result, there is an abundance of different strategies, which are, however, primarily tested in discrete grid-world and small-scale optimal control scenarios. In the literature, there is no consensus about which feedback frequency is optimal or at which time the feedback is most beneficial. To resolve these discrepancies, we isolate and quantify the effect of feedback frequency in robotic tasks with continuous state and action spaces. The experiments encompass inverse kinematics learning for robotic manipulator arms of different complexity. We show that seemingly contradictory phenomena reported in the literature occur at different levels of task complexity. Furthermore, our results suggest that no single ideal feedback frequency exists. Rather, the feedback frequency should be changed as the agent’s proficiency in the task increases.
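To make the studied quantity concrete, the sketch below shows a generic interactive RL episode in which a trainer overrides the agent's action with a given probability per step. This is a minimal illustration of the feedback-frequency idea, not the paper's implementation; `env`, `agent`, and `trainer` are hypothetical stand-ins for whatever environment, learner, and feedback source are used.

```python
import random

def interactive_episode(env, agent, trainer, feedback_frequency):
    """Run one episode; the trainer intervenes with probability
    `feedback_frequency` at each step. All interfaces are hypothetical."""
    state = env.reset()
    done = False
    while not done:
        action = agent.act(state)
        # Interactive feedback: with the given frequency, the trainer's
        # advised action replaces the agent's own (exploratory) action.
        if random.random() < feedback_frequency:
            action = trainer.advise(state)
        next_state, reward, done = env.step(action)
        agent.update(state, action, reward, next_state)
        state = next_state
```

In line with the paper's conclusion that no single ideal frequency exists, one plausible usage would decay `feedback_frequency` over episodes as the agent's proficiency grows, rather than holding it fixed.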
Keywords
- Guided exploration
- Human-aligned reinforcement learning
- Interactive reinforcement learning
- Intrinsic feedback homology
ASJC Scopus subject areas
- Computer Science (all)
- Software
- Artificial Intelligence
Cite this
Quantifying the Effect of Feedback Frequency in Interactive Reinforcement Learning for Robotic Tasks. / Navarro-Guerrero, Nicolás. In: Neural Computing and Applications, Vol. 35, No. 23, 08.2023, p. 16931–16943.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Quantifying the Effect of Feedback Frequency in Interactive Reinforcement Learning for Robotic Tasks
AU - Navarro-Guerrero, Nicolás
N1 - Publisher Copyright: © 2022, The Author(s).
PY - 2023/8
Y1 - 2023/8
N2 - Reinforcement learning (RL) has become widely adopted in robot control. Despite many successes, one major persistent problem is its often very low data efficiency. One solution is interactive feedback, which has been shown to speed up RL considerably. As a result, there is an abundance of different strategies, which are, however, primarily tested in discrete grid-world and small-scale optimal control scenarios. In the literature, there is no consensus about which feedback frequency is optimal or at which time the feedback is most beneficial. To resolve these discrepancies, we isolate and quantify the effect of feedback frequency in robotic tasks with continuous state and action spaces. The experiments encompass inverse kinematics learning for robotic manipulator arms of different complexity. We show that seemingly contradictory phenomena reported in the literature occur at different levels of task complexity. Furthermore, our results suggest that no single ideal feedback frequency exists. Rather, the feedback frequency should be changed as the agent’s proficiency in the task increases.
AB - Reinforcement learning (RL) has become widely adopted in robot control. Despite many successes, one major persistent problem is its often very low data efficiency. One solution is interactive feedback, which has been shown to speed up RL considerably. As a result, there is an abundance of different strategies, which are, however, primarily tested in discrete grid-world and small-scale optimal control scenarios. In the literature, there is no consensus about which feedback frequency is optimal or at which time the feedback is most beneficial. To resolve these discrepancies, we isolate and quantify the effect of feedback frequency in robotic tasks with continuous state and action spaces. The experiments encompass inverse kinematics learning for robotic manipulator arms of different complexity. We show that seemingly contradictory phenomena reported in the literature occur at different levels of task complexity. Furthermore, our results suggest that no single ideal feedback frequency exists. Rather, the feedback frequency should be changed as the agent’s proficiency in the task increases.
KW - Guided exploration
KW - Human-aligned reinforcement learning
KW - Interactive reinforcement learning
KW - Intrinsic feedback homology
UR - http://www.scopus.com/inward/record.url?scp=85143315822&partnerID=8YFLogxK
U2 - 10.1007/s00521-022-07949-0
DO - 10.1007/s00521-022-07949-0
M3 - Article
VL - 35
SP - 16931
EP - 16943
JO - Neural Computing and Applications
JF - Neural Computing and Applications
SN - 0941-0643
IS - 23
ER -