Vehicle emissions at intersections are complex, and an explicit mathematical model is especially difficult to establish when the initial queue length is taken into account. Q-learning is a model-free reinforcement learning algorithm that learns an optimal control policy through trial-and-error interaction with the environment. This paper proposes a Q-learning-based traffic signal control scheme for reducing emissions. Using the simulation platform USTCMTS2.0, the optimal signal timing under different phase queue lengths is found through repeated trial-and-error learning. A fuzzy initialization of the Q-function is added to Q-learning to improve its convergence speed and accelerate the learning process. Simulation results show that the reinforcement learning algorithm performs well: compared with Hideki's method, average vehicle emissions are reduced by 13.9% under high traffic volumes, and the fuzzy initialization of Q-function values substantially accelerates the convergence of the Q-function.
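The core idea described above, tabular Q-learning whose Q-table is seeded by a fuzzy prior rather than zeros, can be sketched as follows. This is a minimal illustration only: the state discretization, candidate green times, membership functions, and all identifiers here are assumptions for exposition, not the paper's actual design.

```python
# Hypothetical sketch: tabular Q-learning with a fuzzy-initialized Q-table.
# State = discretized queue length of a phase; action = green-time choice.
# All constants and the fuzzy prior below are illustrative assumptions.

N_QUEUE_BINS = 5          # discretized queue-length levels (assumed)
ACTIONS = [10, 20, 30]    # candidate green times in seconds (assumed)
ALPHA, GAMMA = 0.1, 0.9   # learning rate and discount factor

def triangular(x, a, b, c):
    """Triangular membership function on [a, c] with its peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def fuzzy_init_q():
    """Build a Q-table seeded by a fuzzy heuristic instead of zeros.

    The prior encodes 'longer queues prefer longer green times': each
    (state, action) pair gets the membership degree of the green time
    around a preferred value that grows with the queue level. This biases
    early exploration toward plausible timings, which is the mechanism
    the abstract credits for faster convergence."""
    q = {}
    for s in range(N_QUEUE_BINS):
        preferred = 10 + 5 * s  # preferred green scales with queue (assumed)
        for a in ACTIONS:
            q[(s, a)] = triangular(a, preferred - 15, preferred, preferred + 15)
    return q

def q_update(q, s, a, reward, s_next):
    """Standard one-step Q-learning update toward the bootstrapped target."""
    best_next = max(q[(s_next, a2)] for a2 in ACTIONS)
    q[(s, a)] += ALPHA * (reward + GAMMA * best_next - q[(s, a)])
```

With this prior, the initial greedy action for an empty queue is the shortest green and for a full queue the longest, so the agent starts from a reasonable policy and refines it through the usual Q-learning updates driven by an emission-based reward.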