The convergence time of reinforcement learning grows exponentially with the dimension of the state-action space, so on large state spaces reinforcement learning algorithms converge too slowly to meet application requirements. In many application settings where the agents cooperate, distributed learning with multiple agents can partly alleviate this problem. Using an evolutionary algorithm, we design agent reproduction and extinction operations so that child agents inherit their parents' directional information in the state space and therefore find effective updates of the state-action space more quickly. Simulation experiments show that the algorithm achieves higher search efficiency and faster convergence than existing reinforcement learning methods.
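The abstract gives no algorithmic details, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical 6x6 grid task, cooperating agents that share one tabular Q-function, a scalar "direction" (a preferred action) as the inherited information, and a simple rule in which the less fit half of the population dies out and each survivor produces one child. All names (`step`, `new_agent`, `train`) and all parameter values are illustrative assumptions.

```python
import random

# Hypothetical task: a SIZE x SIZE grid with the goal in the far corner.
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
SIZE, GOAL = 6, (5, 5)

def step(state, a):
    """Move inside the grid; reward 1.0 at the goal, a small step cost otherwise."""
    x = min(max(state[0] + ACTIONS[a][0], 0), SIZE - 1)
    y = min(max(state[1] + ACTIONS[a][1], 0), SIZE - 1)
    nxt = (x, y)
    done = nxt == GOAL
    return nxt, (1.0 if done else -0.01), done

def new_agent(direction):
    """An agent carries a preferred direction (assumed form of inherited information)."""
    return {"state": (0, 0), "dir": direction, "fitness": 0.0}

def train(generations=60, pop_size=8, steps=40,
          alpha=0.5, gamma=0.95, eps=0.2, mutation=0.1, seed=0):
    # pop_size is assumed even so the population size stays constant.
    rng = random.Random(seed)
    q = {}  # shared Q-table: (state, action) -> value, updated by all agents
    pop = [new_agent(rng.randrange(4)) for _ in range(pop_size)]
    for _ in range(generations):
        for ag in pop:
            ag["state"], ag["fitness"] = (0, 0), 0.0
            for _ in range(steps):
                s = ag["state"]
                if rng.random() < eps:
                    a = ag["dir"]  # explore along the inherited direction
                else:
                    a = max(range(4), key=lambda b: q.get((s, b), 0.0))
                s2, r, done = step(s, a)
                best_next = 0.0 if done else max(q.get((s2, b), 0.0) for b in range(4))
                old = q.get((s, a), 0.0)
                # Standard one-step Q-learning update on the shared table.
                q[(s, a)] = old + alpha * (r + gamma * best_next - old)
                ag["state"] = s2
                ag["fitness"] += r
                if done:
                    break
        # Extinction: the less fit half of the population is removed.
        pop.sort(key=lambda ag: ag["fitness"], reverse=True)
        survivors = pop[: pop_size // 2]
        # Reproduction: each survivor spawns a child that inherits its direction,
        # with a small chance of mutation to a random direction.
        children = []
        for parent in survivors:
            d = rng.randrange(4) if rng.random() < mutation else parent["dir"]
            children.append(new_agent(d))
        pop = survivors + children
    return q, pop
```

Calling `q, pop = train()` returns the shared Q-table and the final population. In this sketch the inherited direction only biases exploration, while the value estimates themselves are pooled across agents; other ways of encoding and inheriting directional information are equally plausible readings of the abstract.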