
Comparative Evaluation of Reinforcement Learning with Scalar Rewards and Linear Regression with Multidimensional Feedback

Petar Kormushev, Darwin G. Caldwell

This paper presents a comparative evaluation of two learning approaches. The first is a conventional reinforcement learning algorithm for direct policy search, which by definition uses scalar rewards. The second is a custom linear-regression-based algorithm that uses multidimensional feedback instead of a scalar reward. The two approaches are evaluated in simulation on a common benchmark problem: an aiming task in which the goal is to learn the aiming parameters that result in hitting as close as possible to a given target. The comparative evaluation shows that multidimensional feedback provides a significant advantage over the scalar reward, resulting in an order-of-magnitude speed-up in convergence. A real-world experiment with a humanoid robot confirms the simulation results and highlights the importance of multidimensional feedback for fast learning.
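The contrast between the two kinds of feedback can be illustrated with a toy sketch. This is not the authors' algorithm or benchmark: the affine "aiming" model (`A`, `b`), the target, the hill-climbing search standing in for direct policy search, and all function names are assumptions made purely for illustration. The scalar learner sees only the negative distance to the target, while the regression learner sees the full 2D hit position and fits a linear map from parameters to hits.

```python
import numpy as np

def make_task():
    """Hypothetical aiming task: the hit position is an affine function of theta."""
    A = np.array([[1.0, 0.3], [-0.2, 0.8]])  # assumed dynamics, unknown to the learners
    b = np.array([0.5, -0.4])
    target = np.array([2.0, 1.0])

    def shoot(theta):
        return A @ theta + b

    return shoot, target

def scalar_search(shoot, target, trials=200, sigma=0.3, seed=1):
    """Hill climbing on a scalar reward (a simple stand-in for direct policy search)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    best = -np.linalg.norm(shoot(theta) - target)
    for _ in range(trials):
        cand = theta + sigma * rng.standard_normal(2)
        reward = -np.linalg.norm(shoot(cand) - target)  # scalar feedback only
        if reward > best:
            theta, best = cand, reward
    return theta, -best

def regression_aim(shoot, target, trials=4, seed=1):
    """Fit hit = A_hat @ theta + b_hat from a few trials, then solve for the target."""
    rng = np.random.default_rng(seed)
    thetas = rng.standard_normal((trials, 2))
    hits = np.array([shoot(th) for th in thetas])  # multidimensional feedback
    X = np.hstack([thetas, np.ones((trials, 1))])  # add a bias column
    W, *_ = np.linalg.lstsq(X, hits, rcond=None)   # W has shape (3, 2)
    A_hat, b_hat = W[:2].T, W[2]
    theta = np.linalg.solve(A_hat, target - b_hat)
    return theta, np.linalg.norm(shoot(theta) - target)

if __name__ == "__main__":
    shoot, target = make_task()
    _, err_scalar = scalar_search(shoot, target)
    _, err_reg = regression_aim(shoot, target)
    print(f"scalar reward, 200 trials: error = {err_scalar:.4f}")
    print(f"regression, 4 trials:      error = {err_reg:.2e}")
```

In this noise-free toy setting the regression learner can recover the affine map from a handful of shots, because each trial returns a full vector rather than a single number. The real task in the paper is not claimed to be linear or noise-free, so the sketch only conveys the intuition behind the reported speed-up.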
