Details
Original language | English |
---|---|
Title of host publication | 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 148-155 |
Number of pages | 8 |
ISBN (electronic) | 9781538637159 |
Publication status | Published - 2 Jul 2017 |
Externally published | Yes |
Event | 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017, Lisbon, Portugal; Duration: 18 Sept 2017 → 21 Sept 2017 |
Abstract
Reinforcement learning algorithms, particularly those based on temporal-difference learning, are widely adopted, have been successfully applied to a number of problems, and have been used to model animal learning. However, they are based on neural pathways involved in reward-seeking behaviour, since little is known about punishment-driven learning and less still about the combined effects of both types of reinforcement on learning. This may not only be a shortcoming for computational models of human and animal learning; we have recently shown that it may also have detrimental effects on machine learning applications with respect to task performance and convergence speed. Here, we further explore our original results and compare the effects on learning of different punishment functions, i.e. binary, linear, and exponential with different variances. Our experiments confirm the original finding that punishment signals reduce learning speed. This result appears to generalize across a number of different punishment functions.
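As a rough illustration of the kinds of punishment functions named in the abstract, the sketch below defines a binary, a linear, and an exponential (Gaussian-shaped) punishment signal and feeds one of them into a tabular TD(0) value update. The abstract does not give the functional forms, their inputs, or the learning architecture, so everything here is an assumption made for illustration: the input `d` (a hypothetical distance to an aversive stimulus), the parameters `threshold`, `magnitude` and `sigma`, and the choice of TD(0) itself. It should not be read as the authors' implementation.

```python
import math


def binary_punishment(d, threshold=1.0, magnitude=1.0):
    """Hypothetical binary signal: -magnitude when d is below threshold, else 0."""
    return -magnitude if d < threshold else 0.0


def linear_punishment(d, threshold=1.0, magnitude=1.0):
    """Hypothetical linear signal: ramps from -magnitude at d=0 to 0 at d=threshold."""
    if d >= threshold:
        return 0.0
    return -magnitude * (1.0 - d / threshold)


def exponential_punishment(d, sigma=0.5, magnitude=1.0):
    """Hypothetical Gaussian-shaped signal; sigma**2 plays the role of the variance."""
    return -magnitude * math.exp(-(d * d) / (2.0 * sigma * sigma))


def td0_update(values, state, next_state, reinforcement, alpha=0.1, gamma=0.9):
    """One tabular TD(0) update of values[state]; returns the TD error."""
    td_error = reinforcement + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


if __name__ == "__main__":
    # Compare the three assumed shapes at a few distances.
    for d in (0.0, 0.25, 0.5, 1.0):
        print(d, binary_punishment(d), linear_punishment(d), exponential_punishment(d))

    # Apply a punishment as the reinforcement in a single TD(0) step.
    values = {"s0": 0.0, "s1": 0.0}
    td0_update(values, "s0", "s1", linear_punishment(0.5))
    print(values)
```

In this sketch a wider `sigma` spreads the exponential punishment over a larger range of `d`, which is one plausible reading of "exponential with different variance".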
ASJC Scopus subject areas
- Computer Science(all)
- Artificial Intelligence
- Engineering(all)
- Mechanical Engineering
- Mathematics(all)
- Control and Optimization
- Neuroscience(all)
- Developmental Neuroscience
Cite this
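The effects on adaptive behaviour of negatively valenced signals in reinforcement learning. / Navarro-Guerrero, Nicolas; Lowe, Robert J.; Wermter, Stefan. 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017. Institute of Electrical and Electronics Engineers Inc., 2017. p. 148-155.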
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - The effects on adaptive behaviour of negatively valenced signals in reinforcement learning
AU - Navarro-Guerrero, Nicolas
AU - Lowe, Robert J.
AU - Wermter, Stefan
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
AB - Reinforcement learning algorithms and particularly those based on temporal-difference learning are widely adopted and have been successfully applied to a number of problems as well as used to model animal learning. However, they are based on neural pathways involved in reward-seeking behaviour since little is known about punishment-driven learning and less still about the combined effects of both types of reinforcement on learning. This may not only be a shortcoming for computational models of human and animal learning but we have recently shown that it may also carry detrimental effects for machine learning applications, with respect to task performance and convergence speed. Here, we further explore our original results and compare the effects of different functions, i.e. binary, linear, exponential with different variance, for punishment on learning. Our experiments confirm the original finding of punishment signals reducing learning speed. It appears this result generalizes across a number of different functions of punishment reinforcement.
UR - http://www.scopus.com/inward/record.url?scp=85050366417&partnerID=8YFLogxK
U2 - 10.1109/devlrn.2017.8329800
DO - 10.1109/devlrn.2017.8329800
M3 - Conference contribution
AN - SCOPUS:85050366417
SP - 148
EP - 155
BT - 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 7th Joint IEEE International Conference on Development and Learning and on Epigenetic Robotics, ICDL-EpiRob 2017
Y2 - 18 September 2017 through 21 September 2017
ER -