Journal of KIISE

Korean Title: 심층 강화학습기반 연속상태공간 제어를 위한 보상 함수 분석
English Title: Analysis of Reward Functions in Deep Reinforcement Learning for Continuous State Space Control
Authors: MinKu Kang (강민구), Kee-Eung Kim (김기응)
Citation: Vol. 47, No. 1, pp. 78-87 (January 2020)
Korean Abstract (translated into English)
Deep reinforcement learning (DRL) algorithms, which use deep neural networks to approximate the value function and the policy function for controlling a given task in a continuous state space, have recently shown promising results. However, the non-convexity of the deep neural networks used for function approximation has often made theoretical analysis of the optimization algorithm difficult, and as a result DRL algorithms lack theoretical guarantees such as asymptotic convergence to a global optimum. Because the reward function is one of the key elements that determine the overall characteristics of a learning agent, this paper takes up a question that is smaller than the theoretical convergence of the non-convex optimization process but still important: it analyzes the structure of the reward functions widely used in DRL and their effect on the learning algorithm. Given that reward functions in DRL have commonly been designed by trial and error, the authors expect the proposed analysis to serve as a useful guide for reward function design in DRL.
English Abstract
Deep reinforcement learning (DRL), which uses deep neural networks to approximate the value function and the policy, has recently shown promising results in continuous state-space control tasks. However, the non-convexity of deep neural networks used as function approximators has often made DRL algorithms intractable to analyze, so theoretical guarantees such as asymptotic global convergence of the learning algorithm are lacking. Because the reward function in reinforcement learning is one of the key entities that determine the overall characteristics of a learning agent, we focus on a smaller but important aspect of the analysis: the structure of widely used reward functions in DRL tasks and their possible effects on the learning algorithm. The proposed analysis may help identify appropriate reward functions for DRL tasks, a step that has often been carried out by trial and error.
Keywords: reinforcement learning, reward function, model-free reinforcement learning, reward structure, data-driven control