TIIS (Korean Society for Internet Information)
Korean Title |
PGA: An Efficient Adaptive Traffic Signal Timing Optimization Scheme Using Actor-Critic Reinforcement Learning Algorithm |
English Title |
PGA: An Efficient Adaptive Traffic Signal Timing Optimization Scheme Using Actor-Critic Reinforcement Learning Algorithm |
Author |
Si Shen
Guojiang Shen
Yang Shen
Duanyang Liu
Xi Yang
Xiangjie Kong
|
Citation |
Vol. 14, No. 11, pp. 4268-4289 (Nov. 2020) |
Korean Abstract |
|
English Abstract |
Advanced traffic signal timing methods play a very important role in reducing road congestion and air pollution. Many recent studies consider reinforcement learning a superior approach to building traffic light timing schemes. It achieves truly adaptive control by taking real-time traffic information as the state and adjustments to the traffic light scheme as the action. However, existing works behave inefficiently at complex intersections and lack feasibility, because most of them adopt traffic light schemes whose phase sequence is flexible. To address these issues, a novel adaptive traffic signal timing scheme is proposed. It is based on the actor-critic reinforcement learning algorithm and integrates two advanced techniques, proximal policy optimization and generalized advantage estimation. In particular, a new kind of reward function and a simplified form of state representation are carefully defined; they help improve the learning efficiency and reduce the computational complexity, respectively. Meanwhile, a fixed-phase-sequence signal scheme is derived, and a constraint on the variation of successive phase durations is introduced, which enhances the scheme's feasibility and robustness in field applications. The proposed scheme is verified through field-data-based experiments in both medium and high traffic density scenarios. Simulation results show remarkable improvement in traffic performance as well as learning efficiency compared with existing reinforcement learning-based methods such as 3DQN and DDQN.
|
Keyword |
Traffic signal timing
reinforcement learning
actor-critic
proximal policy optimization
generalized advantage estimation
|