Details
Original language | English |
---|---|
Journal | Frontiers in neurorobotics |
Volume | 11 |
Issue number | APR |
Publication status | Published - 3 Apr 2017 |
Externally published | Yes |
Abstract
Both nociception and punishment signals have been used in robotics. However, the potential for using these negatively valenced types of reinforcement learning signals for robot learning has not been exploited in detail yet. Nociceptive signals are primarily used as triggers of preprogrammed action sequences. Punishment signals are typically disembodied, i.e., with little or no relation to the agent-intrinsic limitations, and they are often used to impose behavioral constraints. Here, we provide an alternative approach for nociceptive signals as drivers of learning rather than simple triggers of preprogrammed behavior. Explicitly, we use nociception to expand the state space while we use punishment as a negative reinforcement learning signal. We compare the performance (in terms of task error, the amount of perceived nociception, and length of learned action sequences) of different neural networks imbued with punishment-based reinforcement signals for inverse kinematic learning. We contrast the performance of a version of the neural network that receives nociceptive inputs to that without such a process. Furthermore, we provide evidence that nociception can improve learning, making the algorithm more robust against network initializations, as well as behavioral performance by reducing the task error, perceived nociception, and length of learned action sequences. Moreover, we provide evidence that punishment, at least as typically used within reinforcement learning applications, may be detrimental in all relevant metrics.
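The distinction the abstract draws, nociception as additional state input versus punishment as a negative term in the reinforcement signal, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the joint limits, the margin, the linear nociception ramp, the way punishment is derived from nociception, and every function name here are assumptions chosen only for illustration.

```python
# Hypothetical values, not taken from the paper.
JOINT_LIMITS = [(-1.5, 1.5), (-1.5, 1.5)]  # assumed joint ranges in radians
NOCI_MARGIN = 0.2  # assumed width of the zone near a limit that triggers nociception


def nociception(angles):
    """Return one value in [0, 1] per joint: 0 far from a limit, 1 at or beyond it."""
    signals = []
    for angle, (low, high) in zip(angles, JOINT_LIMITS):
        distance = min(angle - low, high - angle)  # distance to the nearest limit
        signals.append(min(1.0, max(0.0, 1.0 - distance / NOCI_MARGIN)))
    return signals


def build_state(angles, target, use_nociception):
    """State vector: joint angles and target, optionally expanded with nociceptive inputs."""
    state = list(angles) + list(target)
    if use_nociception:
        state += nociception(angles)  # nociception expands the state space
    return state


def reward(task_error, angles, punishment_weight):
    """Negative task error, minus an assumed punishment term scaled by punishment_weight."""
    return -task_error - punishment_weight * sum(nociception(angles))
```

With these assumed definitions, `build_state(angles, target, True)` yields a longer state vector than `build_state(angles, target, False)`, while `punishment_weight` controls only the reward, so the two mechanisms can be switched on and off independently when comparing network variants.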
Keywords
- Inverse kinematics, Nociception, Punishment, Reinforcement learning, Self-protective mechanisms
ASJC Scopus subject areas
- Engineering(all)
- Biomedical Engineering
- Computer Science(all)
- Artificial Intelligence
In: Frontiers in neurorobotics, Vol. 11, No. APR, 03.04.2017.
Research output: Contribution to journal › Article › Research › peer review
TY - JOUR
T1 - Improving robot motor learning with negatively valenced reinforcement signals
AU - Navarro-Guerrero, Nicolás
AU - Lowe, Robert J.
AU - Wermter, Stefan
N1 - Publisher Copyright: © 2017 Navarro-Guerrero, Lowe and Wermter.
PY - 2017/4/3
Y1 - 2017/4/3
KW - Inverse kinematics
KW - Nociception
KW - Punishment
KW - Reinforcement learning
KW - Self-protective mechanisms
UR - http://www.scopus.com/inward/record.url?scp=85018457189&partnerID=8YFLogxK
U2 - 10.3389/fnbot.2017.00010
DO - 10.3389/fnbot.2017.00010
M3 - Article
AN - SCOPUS:85018457189
VL - 11
JO - Frontiers in neurorobotics
JF - Frontiers in neurorobotics
SN - 1662-5218
IS - APR
ER -