JIPS (Korea Information Processing Society)

Korean Title: Video Captioning with Visual and Semantic Features
English Title: Video Captioning with Visual and Semantic Features
Authors: Sujin Lee, Incheol Kim
Citation: Vol. 14, No. 6, pp. 1318-1330 (December 2018)
English Abstract:
Video captioning refers to the process of extracting features from a video and generating video captions using the extracted features. This paper introduces a deep neural network model and its learning method for effective video captioning. In addition to visual features, the model uses semantic features that effectively express the content of the video. The visual features are extracted using convolutional neural networks, such as C3D and ResNet, while the semantic features are extracted using a semantic feature extraction network proposed in this paper. Further, an attention-based caption generation network is proposed for generating video captions from the extracted features. The performance and effectiveness of the proposed model are verified through various experiments on two large-scale video benchmarks: the Microsoft Video Description (MSVD) and the Microsoft Research Video-to-Text (MSR-VTT) datasets.
Keywords: Attention-Based Caption Generation, Deep Neural Networks, Semantic Feature, Video Captioning
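
The abstract mentions an attention-based caption generation network but does not describe its scoring function. As a rough illustration of the general idea only, the sketch below applies generic dot-product soft attention to a few per-frame feature vectors. It is not the authors' network: the function names `softmax` and `attend`, the toy 4-dimensional features, and the choice of dot-product scoring are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(frame_features, decoder_state):
    """Generic dot-product soft attention (assumed form, not from the paper).

    frame_features: list of per-frame feature vectors (stand-ins for CNN outputs).
    decoder_state:  current state vector of a hypothetical caption decoder.
    Returns the attention weights and the weighted-sum context vector.
    """
    # Score each frame by its dot product with the decoder state.
    scores = [
        sum(f * h for f, h in zip(frame, decoder_state))
        for frame in frame_features
    ]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the frame features.
    dim = len(frame_features[0])
    context = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, context


# Toy input: three frames with hypothetical 4-dimensional features.
frames = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
state = [0.0, 2.0, 0.0, 0.0]  # hypothetical decoder state aligned with frame 2

weights, context = attend(frames, state)
print("attention weights:", weights)
print("context vector:", context)
```

In this toy setup the second frame receives the largest weight because it aligns with the decoder state. A caption generator of this general kind would recompute the weights at every decoding step, so each generated word can draw on different frames.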