COLLABORATIVE Q(λ) REINFORCEMENT LEARNING ALGORITHM - A
PROMISING ROBOT LEARNING FRAMEWORK
Uri Kartoun*, Helman Stern*, Yael Edan*, Craig Feied**, Jonathan Handler**, Mark Smith**, Michael Gillam**
*Department of Industrial Engineering and Management, Ben-Gurion University of the Negev
Be’er-Sheva, 84105, Israel
{kartoun, helman, yael}***@
**Institute for Medical Informatics, Washington Hospital Center
110 Irving St., Washington DC, NW, 20010, U.S.A.
{cfeied}@
some dramatic condition, e.g., accomplishment of a
subtask (reward) or complete failure (punishment), and
the learner’s goal is to optimize its behavior based on
some performance measure (maximization of a cost
function). The learning agent learns the associations
between observed states and chosen actions that lead to
rewards or punishments, i.e., it learns how to assign credit
to past actions and states by correctly estimating costs

ABSTRACT
This paper presents the design and implementation of a
new reinforcement learning (RL) based algorithm. The
proposed algorithm, CQ(λ) (collaborative Q(λ)), allows
several learning agents to acquire knowledge from each
other. Acquiring knowledge learnt by an agent via
collaboration with another agent enables acceleration of
the entire learning system; therefore, learning can be
utilized more efficiently. By developing col