Improving Parallelism for Video Action Recognition Model Using One-dimensional Convolutional Neural Network

Su-Jin Seong; Jeong-Won Cha

KIISE Transactions on Computing Practices

Title	Improving Parallelism for Video Action Recognition Model Using One-dimensional Convolutional Neural Network
Author	Su-Jin Seong, Jeong-Won Cha
Citation	Vol. 27, No. 4, pp. 216-220 (April 2021)
Abstract	Deep learning frameworks have shown remarkable results on numerous computer vision tasks, and many studies have applied deep learning models to video action recognition. One previous work proposed a model architecture in which spatial features are learned by 2D Convolutional Neural Networks (CNNs) and then passed to Recurrent Neural Networks (RNNs) to learn the temporal dependencies among them. In this paper, we study an improved architecture in which the temporal relationships among spatial features are learned by a 1D CNN instead of an RNN. Because this modification removes the sequential operations of the RNN, better GPU utilization can be expected. We support this claim with experimental results showing that the modified model reduces computation time while maintaining similar classification accuracy.
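The parallelism argument in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it is a toy in plain Python, and every name in it (`conv_step`, `temporal_conv1d`, `rnn_forward`, the weight layouts) is a hypothetical choice made for illustration. It assumes per-frame feature vectors have already been produced by some 2D CNN. Each output step of the 1D temporal convolution reads only a window of input frames, so the steps can be evaluated in any order, whereas the recurrent update needs the previous hidden state.

```python
import math

def conv_step(features, kernel, bias, t):
    """Output at time t of a 1D temporal convolution.

    features: T x D list of per-frame feature vectors (assumed 2D CNN output)
    kernel:   K x D x C weights, bias: C values
    Reads only features[t : t + K], never another step's output.
    """
    out = list(bias)
    for k, weights in enumerate(kernel):
        for d, x in enumerate(features[t + k]):
            for c, w in enumerate(weights[d]):
                out[c] += x * w
    return out

def temporal_conv1d(features, kernel, bias):
    """Valid convolution over the time axis; steps are mutually independent."""
    steps = range(len(features) - len(kernel) + 1)
    return [conv_step(features, kernel, bias, t) for t in steps]

def rnn_forward(features, w_in, w_rec):
    """Elman-style recurrence for contrast: h_t depends on h_{t-1}.

    w_in: D x H input weights, w_rec: H x H recurrent weights.
    The loop over frames cannot be reordered.
    """
    h = [0.0] * len(w_rec)
    states = []
    for x in features:
        h = [math.tanh(sum(xi * w_in[i][j] for i, xi in enumerate(x))
                       + sum(hi * w_rec[i][j] for i, hi in enumerate(h)))
             for j in range(len(h))]
        states.append(h)
    return states

# Toy input: 4 frames, 2 features per frame, kernel size 2, 1 output channel.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
kernel = [[[1.0], [0.0]], [[0.0], [1.0]]]
bias = [0.0]

conv_out = temporal_conv1d(features, kernel, bias)
# Evaluating the steps in reverse order yields the same outputs.
reversed_out = [conv_step(features, kernel, bias, t) for t in (2, 1, 0)][::-1]
rnn_states = rnn_forward(features, [[0.5], [0.5]], [[0.1]])
```

On a GPU, the independent convolution steps map to work that can run concurrently, which is the source of the utilization gain the abstract expects. The sketch shows only the dependency structure, not actual timing.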
Keyword	transformer; encoder-decoder model; automatic title generation; word loss function; repeat penalty; video classification; video action recognition; 1D convolutional neural network; deep learning