Reinforcement Learning for Continuous States and Actions Using Regularization Theory
深尾 隆則, 稲山 典克, 足立 紀彦
pp. 593-599
DOI: 10.5687/iscie.11.593
Abstract
Reinforcement learning is the problem of learning to act optimally in an unknown environment. It requires only a scalar reinforcement signal from the environment as performance feedback. Q-learning is one of the best-known reinforcement learning algorithms. This paper presents a new method that allows Q-learning to handle continuous states and actions by smoothly approximating the Q-function using regularization theory.
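As a rough illustration of what a smooth approximation of a Q-function over continuous states and actions can look like, the sketch below fits a Gaussian-kernel regularization network to sampled Q-values and picks a greedy action from it. It is not the authors' algorithm: the kernel choice, the ridge-style penalty `lam`, the kernel `width`, the toy targets, and all function names are assumptions made for illustration only.

```python
import numpy as np

def gaussian_kernel(A, B, width):
    """Gaussian kernel matrix between the rows of A and the rows of B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def fit_q(samples, targets, width=0.5, lam=1e-3):
    """Solve (G + lam * I) c = targets for the kernel coefficients c.

    lam is an assumed regularization weight that trades data fit
    against smoothness of the fitted function.
    """
    G = gaussian_kernel(samples, samples, width)
    return np.linalg.solve(G + lam * np.eye(len(samples)), targets)

def predict_q(coeffs, samples, queries, width=0.5):
    """Evaluate the fitted Q-function at (state, action) query rows."""
    return gaussian_kernel(queries, samples, width) @ coeffs

def greedy_action(coeffs, samples, state, candidates, width=0.5):
    """Pick the candidate action with the highest approximated Q-value."""
    queries = np.column_stack([np.full(len(candidates), state), candidates])
    values = predict_q(coeffs, samples, queries, width)
    return candidates[np.argmax(values)]

# Toy data (hypothetical): 1-D state, 1-D action, Q peaks where action == state.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(200, 2))      # columns: state, action
targets = -(samples[:, 1] - samples[:, 0]) ** 2      # stand-in for learned Q targets

coeffs = fit_q(samples, targets)
candidates = np.linspace(-1.0, 1.0, 201)
best = greedy_action(coeffs, samples, 0.3, candidates)
print("greedy action at state 0.3:", best)
```

In a full Q-learning loop the `targets` would come from temporal-difference updates rather than a fixed formula; here they are fixed so that the smoothing step can be seen in isolation.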