
KIISE Transactions on Computing Practices

Current Result Document:

Korean Title: ViStoryNet: 비디오 스토리 재현을 위한 연속 이벤트 임베딩 및 BiLSTM 기반 신경망
English Title: ViStoryNet: Neural Networks with Successive Event Order Embedding and BiLSTMs for Video Story Regeneration
Author(s): Min-Oh Heo (허민오), Kyung-Min Kim (김경민), Byoung-Tak Zhang (장병탁)
Citation: Vol. 24, No. 3, pp. 138-144 (March 2018)
Korean Abstract (translated):
In this paper, we propose a story learning/regeneration framework that learns a coherent story from videos and can regenerate the video story. To this end, the successive order of events is used as supervision, inducing each episode to form a trajectory in the latent space and building a composite representation space that captures both ordering and semantic information. A kids video series was used as training data; such videos have several advantages in terms of story composition, narrative order, and complexity. On top of this, we built an encoder-decoder structure that incorporates successive event order embedding and trained bidirectional LSTMs to model sequences in the latent space, taking multi-step sequence generation into account. Experimental results are reported on approximately 200 episodes extracted from the 'Pororo the Little Penguin' video series. The experiments show that episodes take trajectory-like forms in the latent space and that the model can be applied to regenerating a story when partial cues are given.
English Abstract:
A video is a vivid medium similar to human visual-linguistic experience, since it can convey a sequence of situations, actions, or dialogues that can be told as a story. In this study, we propose story learning/regeneration frameworks from videos with successive event order supervision for contextual coherence. The supervision induces each episode to take the form of a trajectory in the latent space, which constructs a composite representation of ordering and semantics. We used kids videos as training data; their advantages include an omnibus style, short storylines that are simple and explicit, a chronological narrative order, and a relatively limited number of characters and spatial environments. We build an encoder-decoder structure with successive event order embedding (SEOE) and train bidirectional LSTMs as sequence models that account for multi-step sequence prediction. Using approximately 200 episodes of the kids video series 'Pororo the Little Penguin', we give empirical results for story regeneration tasks and SEOE. In addition, each episode shows a trajectory-like shape in the latent space of the model, which provides geometric information for the sequence models.
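As a rough illustration of the architecture the abstracts describe (an encoder-decoder over video-segment features, a bidirectional LSTM sequence model on the latent space, and order supervision that pushes each episode toward a trajectory-like shape), here is a minimal PyTorch-style sketch. The module names, dimensions, and the hinge-style order loss are illustrative assumptions, not details taken from the paper or its code.

```python
# Minimal sketch of an encoder-decoder with a BiLSTM sequence model and a
# trajectory-style order penalty, assuming pre-extracted per-segment features.
# This is NOT the authors' implementation; the dimensions and the order loss
# below are assumptions made for illustration only.
import torch
import torch.nn as nn


class ViStoryNetSketch(nn.Module):
    def __init__(self, feat_dim=2048, latent_dim=256, hidden_dim=512):
        super().__init__()
        # Encoder: per-segment video features -> latent event vectors
        self.encoder = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, latent_dim),
        )
        # Bidirectional LSTM models the sequence of latent events of an episode
        self.bilstm = nn.LSTM(latent_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        # Decoder: BiLSTM states -> reconstructed segment features
        self.decoder = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, feat_dim),
        )

    def forward(self, segments):           # segments: (batch, steps, feat_dim)
        z = self.encoder(segments)         # (batch, steps, latent_dim)
        h, _ = self.bilstm(z)              # (batch, steps, 2 * hidden_dim)
        recon = self.decoder(h)            # (batch, steps, feat_dim)
        return z, recon


def successive_order_loss(z, margin=0.2):
    """Hinge penalty (an assumed stand-in for the paper's event order
    supervision): the latent point two steps ahead should lie farther from the
    current point than the next step does, so the episode keeps advancing
    along a trajectory instead of folding back on itself."""
    d_next = (z[:, 1:-1] - z[:, :-2]).norm(dim=-1)   # distance t -> t+1
    d_skip = (z[:, 2:] - z[:, :-2]).norm(dim=-1)     # distance t -> t+2
    return torch.relu(margin + d_next - d_skip).mean()


if __name__ == "__main__":
    model = ViStoryNetSketch()
    episodes = torch.randn(4, 16, 2048)    # 4 toy episodes, 16 segments each
    z, recon = model(episodes)
    loss = nn.functional.mse_loss(recon, episodes) + successive_order_loss(z)
    loss.backward()
    print(z.shape, recon.shape, float(loss))
```

In the paper, the ordering signal comes from the successive event order embedding itself rather than from a separate margin loss, so the penalty above should be read only as a placeholder for the general idea of trajectory-shaped latent sequences.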
Keywords: video story learning, video story regeneration, successive event order embedding, kids video dataset