Improving robot motor learning with negatively valenced reinforcement signals

Nicolás Navarro-Guerrero; Robert J. Lowe; Stefan Wermter

doi:10.3389/fnbot.2017.00010

Details

Original language	English
Journal	Frontiers in neurorobotics
Volume	11
Issue number	APR
Publication status	Published - 3 Apr 2017
Externally published	Yes

Abstract

Both nociception and punishment signals have been used in robotics. However, the potential for using these negatively valenced types of reinforcement learning signals for robot learning has not been exploited in detail yet. Nociceptive signals are primarily used as triggers of preprogrammed action sequences. Punishment signals are typically disembodied, i.e., with no or little relation to the agent-intrinsic limitations, and they are often used to impose behavioral constraints. Here, we provide an alternative approach for nociceptive signals as drivers of learning rather than simple triggers of preprogrammed behavior. Explicitly, we use nociception to expand the state space while we use punishment as a negative reinforcement learning signal. We compare the performance (in terms of task error, the amount of perceived nociception, and length of learned action sequences) of different neural networks imbued with punishment-based reinforcement signals for inverse kinematic learning. We contrast the performance of a version of the neural network that receives nociceptive inputs to that without such a process. Furthermore, we provide evidence that nociception can improve learning, making the algorithm more robust against network initializations, as well as behavioral performance by reducing the task error, perceived nociception, and length of learned action sequences. Moreover, we provide evidence that punishment, at least as typically used within reinforcement learning applications, may be detrimental in all relevant metrics.
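
The distinction the abstract draws, nociception as an extra input that expands the state versus punishment as a negative term in the reward, can be illustrated with a minimal sketch. This is not the authors' implementation: the joint limits, the margin, the linear ramp used for the nociceptive signal, and every function name below are assumptions chosen only to make the two roles concrete.

```python
import numpy as np

# Hypothetical joint limits in radians, one row per joint: [lower, upper].
JOINT_LIMITS = np.array([[-1.5, 1.5], [-2.0, 2.0]])
# Hypothetical distance from a limit at which the nociceptive signal starts.
NOCICEPTION_MARGIN = 0.2


def nociception(joints):
    """Per-joint signal in [0, 1] that grows as a joint nears either limit."""
    lower, upper = JOINT_LIMITS[:, 0], JOINT_LIMITS[:, 1]
    dist_to_limit = np.minimum(joints - lower, upper - joints)
    return np.clip(1.0 - dist_to_limit / NOCICEPTION_MARGIN, 0.0, 1.0)


def observe(joints, target, use_nociception):
    """Build the state vector; nociception optionally expands it."""
    state = np.concatenate([joints, target])
    if use_nociception:
        state = np.concatenate([state, nociception(joints)])
    return state


def reward(task_error, joints, punishment_weight):
    """Negative task error, minus an optional punishment term."""
    return -task_error - punishment_weight * nociception(joints).sum()


joints = np.array([1.4, 0.0])   # first joint is close to its upper limit
target = np.array([0.3, 0.1])   # hypothetical 2-D reaching target

plain_state = observe(joints, target, use_nociception=False)
expanded_state = observe(joints, target, use_nociception=True)
unpunished = reward(0.3, joints, punishment_weight=0.0)
punished = reward(0.3, joints, punishment_weight=1.0)
```

In this sketch the two mechanisms are independent switches: `use_nociception` changes only what the learner observes, while `punishment_weight` changes only what it is rewarded for, which mirrors the comparison of network variants described in the abstract.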

ASJC Scopus subject areas

Engineering (all)
Biomedical Engineering
Computer Science (all)
Artificial intelligence

Cite this

Improving robot motor learning with negatively valenced reinforcement signals. / Navarro-Guerrero, Nicolás; Lowe, Robert J.; Wermter, Stefan.
In: Frontiers in neurorobotics, Vol. 11, No. APR, 03.04.2017.

Research output: Contribution to journal › Article › Research › Peer review

Navarro-Guerrero, N, Lowe, RJ & Wermter, S 2017, 'Improving robot motor learning with negatively valenced reinforcement signals', Frontiers in neurorobotics, vol. 11, no. APR. https://doi.org/10.3389/fnbot.2017.00010

Navarro-Guerrero, N., Lowe, R. J., & Wermter, S. (2017). Improving robot motor learning with negatively valenced reinforcement signals. Frontiers in neurorobotics, 11(APR). https://doi.org/10.3389/fnbot.2017.00010

Navarro-Guerrero N, Lowe RJ, Wermter S. Improving robot motor learning with negatively valenced reinforcement signals. Frontiers in neurorobotics. 2017 Apr 3;11(APR). doi: 10.3389/fnbot.2017.00010

Navarro-Guerrero, Nicolás ; Lowe, Robert J. ; Wermter, Stefan. / Improving robot motor learning with negatively valenced reinforcement signals. In: Frontiers in neurorobotics. 2017 ; Vol. 11, No. APR.

@article{a00e6367b63f46d7b3561b366383452f,

title = "Improving robot motor learning with negatively valenced reinforcement signals",

keywords = "Inverse kinematics, Nociception, Punishment, Reinforcement learning, Self-protective mechanisms",

author = "Nicol{\'a}s Navarro-Guerrero and Lowe, {Robert J.} and Stefan Wermter",

note = "Publisher Copyright: {\textcopyright} 2017 Navarro-Guerrero, Lowe and Wermter.",

year = "2017",

month = apr,

day = "3",

doi = "10.3389/fnbot.2017.00010",

language = "English",

volume = "11",

journal = "Frontiers in neurorobotics",

issn = "1662-5218",

publisher = "Frontiers Media S.A.",

number = "APR",

}

TY - JOUR

T1 - Improving robot motor learning with negatively valenced reinforcement signals

AU - Navarro-Guerrero, Nicolás

AU - Lowe, Robert J.

AU - Wermter, Stefan

PY - 2017/4/3

Y1 - 2017/4/3

KW - Inverse kinematics

KW - Nociception

KW - Punishment

KW - Reinforcement learning

KW - Self-protective mechanisms

UR - http://www.scopus.com/inward/record.url?scp=85018457189&partnerID=8YFLogxK

U2 - 10.3389/fnbot.2017.00010

DO - 10.3389/fnbot.2017.00010

M3 - Article

AN - SCOPUS:85018457189

VL - 11

JO - Frontiers in neurorobotics

JF - Frontiers in neurorobotics

SN - 1662-5218

IS - APR

ER -
