Architecture for Hierarchical Systems with Individual Learning Mechanisms
菊谷 彪文, 定本 知徳
pp. 289-299
DOI: 10.5687/iscie.35.289
Abstract
In this paper, we propose an architecture for distributed reinforcement learning of distributed controllers for a class of unknown hierarchical systems in which homogeneous subsystems are interconnected through a complete graph. Each controller consists of two sub-controllers, one for the average dynamics and one for the difference dynamics of the system. First, we show that the optimal sub-controllers can be trained individually by applying a reinforcement learning (RL) method to the average and difference data. Because these data are of smaller scale, the learning time of the proposed method can be drastically reduced compared to existing RL methods. However, computing the average data requires all-to-all communication among subsystems, which is undesirable in terms of communication cost and security. Hence, by exploiting a distributed consensus observer, we propose an architecture that learns distributed optimal controllers in a distributed manner. The control performance of the trained controller is shown to be ideally optimal. Moreover, the proposed architecture is completely scalable, i.e., its computational cost is independent of the number of subsystems. The effectiveness of the architecture is shown through numerical simulations.
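The average/difference split mentioned in the abstract can be illustrated with a small numerical check. The sketch below is not the paper's model or learning method: it assumes hypothetical scalar, discrete-time subsystems of the form x_i(t+1) = a x_i(t) + g Σ_{j≠i} (x_j(t) - x_i(t)) + b u_i(t), where `a`, `g`, `b` and all function names are illustrative choices. Under that assumption, the average state evolves with gain `a` and each difference from the average evolves with gain `a - n*g`, so the two parts can be advanced (and, in principle, learned from) separately.

```python
def step_full(x, u, a, g, b):
    """One step of n scalar subsystems coupled over a complete graph (assumed model)."""
    n = len(x)
    total = sum(x)
    # For a complete graph: sum_{j != i} (x_j - x_i) = total - n * x_i
    return [a * xi + g * (total - n * xi) + b * ui for xi, ui in zip(x, u)]


def split(v):
    """Return (average, list of differences from the average)."""
    avg = sum(v) / len(v)
    return avg, [vi - avg for vi in v]


def step_reduced(avg, diff, u_avg, u_diff, a, g, b, n):
    """Advance the average and the difference dynamics independently."""
    new_avg = a * avg + b * u_avg  # coupling terms cancel in the average
    new_diff = [(a - n * g) * di + b * udi for di, udi in zip(diff, u_diff)]
    return new_avg, new_diff


# Hypothetical parameters and data, chosen only for illustration.
a, g, b = 0.9, 0.05, 1.0
x = [1.0, -2.0, 0.5, 3.0]
u = [0.1, 0.0, -0.3, 0.2]
n = len(x)

# Step the full interconnected system.
x_next = step_full(x, u, a, g, b)

# Step the average and difference parts separately, then recombine.
avg, diff = split(x)
u_avg, u_diff = split(u)
avg_next, diff_next = step_reduced(avg, diff, u_avg, u_diff, a, g, b, n)
x_rebuilt = [avg_next + di for di in diff_next]

# The two routes should agree up to floating-point rounding.
max_error = max(abs(p - q) for p, q in zip(x_next, x_rebuilt))
print(max_error)
```

In this toy setting the reduced dynamics involve only one scalar for the average and one per difference, regardless of how the subsystems interact in the full model, which is the intuition behind training on smaller-scale data. Note that `split` needs every subsystem's value, which mirrors the all-to-all communication issue the abstract raises.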