PGA: An Efficient Adaptive Traffic Signal Timing Optimization Scheme Using Actor-Critic Reinforcement Learning Algorithm

Si Shen; Guojiang Shen; Yang Shen; Duanyang Liu; Xi Yang; Xiangjie Kong

TIIS (Korean Society for Internet Information)

Citation	Vol. 14, No. 11, pp. 4268-4289 (November 2020)
English Abstract	Advanced traffic signal timing methods play an important role in reducing road congestion and air pollution. Many recent studies consider reinforcement learning a superior approach to building traffic light timing schemes. It achieves truly adaptive control by taking real-time traffic information as the state and adjustments to the traffic light scheme as the action. However, existing works are inefficient at complex intersections and lack feasibility, because most of them adopt traffic light schemes with a flexible phase sequence. To address these issues, a novel adaptive traffic signal timing scheme is proposed. It is based on the actor-critic reinforcement learning algorithm and integrates two advanced techniques, proximal policy optimization and generalized advantage estimation. In particular, a new reward function and a simplified state representation are carefully defined; they improve learning efficiency and reduce computational complexity, respectively. Meanwhile, a fixed-phase-sequence signal scheme is derived, and a constraint on the variation between successive phase durations is introduced, which enhances feasibility and robustness in field applications. The proposed scheme is verified through field-data-based experiments in both medium and high traffic density scenarios. Simulation results show a remarkable improvement in both traffic performance and learning efficiency compared with existing reinforcement learning-based methods such as 3DQN and DDQN.
Keywords	Traffic signal timing, reinforcement learning, actor-critic, proximal policy optimization, generalized advantage estimation
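
The abstract names two mechanisms that a short sketch may make more concrete: generalized advantage estimation and the constraint on the variation between successive phase durations. The Python below is a minimal illustration, not the authors' implementation. `gae_advantages` follows the standard GAE recursion. `clamp_phase_duration` is a hypothetical helper: the abstract does not say how the constraint is enforced, so simple clamping is an assumption, as are all parameter names, the default values of `gamma` and `lam`, and the use of seconds as the unit.

```python
from typing import List, Sequence


def gae_advantages(
    rewards: Sequence[float],
    values: Sequence[float],
    gamma: float = 0.99,  # discount factor (assumed default)
    lam: float = 0.95,    # GAE smoothing parameter (assumed default)
) -> List[float]:
    """Standard generalized advantage estimation over one trajectory.

    `values` must hold one more entry than `rewards`: the last entry is
    the critic's bootstrap value for the state after the final step.
    """
    if len(values) != len(rewards) + 1:
        raise ValueError("values must have len(rewards) + 1 entries")

    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the discounted sum of later TD errors.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # one-step TD error
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clamp_phase_duration(
    previous: float,
    proposed: float,
    max_change: float,
    min_duration: float,
    max_duration: float,
) -> float:
    """Hypothetical constraint: limit how far a phase duration may move
    from its previous value, then keep it inside absolute bounds."""
    bounded = max(previous - max_change, min(previous + max_change, proposed))
    return max(min_duration, min(max_duration, bounded))
```

With `lam=0` the estimator reduces to one-step temporal-difference errors, and with `lam=1` it becomes the discounted return minus the value baseline. Intermediate values trade bias against variance in the actor's gradient signal.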