Automatic Annotation Generation System Using Deep Learning-Based Subject Gaze Following

Ji Eun Jeong; Yong Suk Choi (정지은; 최용석)

KIISE Transactions on Computing Practices (정보과학회 컴퓨팅의 실제 논문지)

Korean Title	딥러닝 기반 피사체 시선 추적을 통한 자동 주석 생성 시스템 (translated: Automatic Annotation Generation System Using Deep Learning-Based Subject Gaze Following)
English Title
Author	Ji Eun Jeong, Yong Suk Choi (정지은, 최용석)
Citation	Vol. 27, No. 3, pp. 157-162 (March 2021)
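Korean Abstract (translated)	Gaze following detects the point in a single image at which a subject is looking. Existing deep learning-based studies either simply estimate the gaze angle or estimate the gaze point within a device screen such as a smartphone, so they cannot tell which object is being looked at. This paper proposes, for the first time, a system that uses a deep learning model to follow a subject's gaze and automatically generates annotations such as 'J is looking at the tie.' The system consists of a pre-processing module, a gaze following module, and a post-processing module. The pre-processing module recognizes the persons in the image and generates the input for the deep learning model. In the gaze following module, the deep learning model generates a heatmap that marks the subject's gaze point. In the post-processing module, our proposed object selection algorithm recognizes the object at the gaze point and generates the annotation. The proposed system can be used to efficiently generate large-scale metadata for retailing and academic purposes.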
English Abstract	Gaze following is the task of detecting the point in a single image at which a subject is looking. Existing deep learning methods are limited in that they cannot determine which specific object is the target of the gaze, because they either simply estimate the gaze angle or estimate the gaze point within the screen of a device such as a smartphone. In this paper, we propose a novel system that uses a deep learning model to infer where a subject is looking and automatically generates annotations such as 'J is looking at the tie.' The proposed system consists of a pre-processing module, a gaze following module, and a post-processing module. In the pre-processing module, the system recognizes faces and generates the inputs for the deep learning model. In the gaze following module, the deep neural network generates a heatmap of the point at which the subject is looking. In the post-processing module, the proposed object selection algorithm determines the object at the gaze point, and the annotation is then generated. The proposed system can be used to generate large-scale metadata for retailing and academic purposes.
Keywords	deep learning; gaze following; gaze estimation; annotation generation
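
The abstract describes a post-processing step that maps the gaze heatmap to an object and then to a sentence, but it does not give the details of the authors' object selection algorithm. The following is only a minimal sketch of one plausible approach, under stated assumptions: the heatmap is a 2D grid of scores, an object detector has already produced labeled bounding boxes in the same coordinates, and the selected object is the smallest box that contains the heatmap peak. The names `DetectedObject`, `peak_of`, `select_object`, and `annotate` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class DetectedObject:
    """Hypothetical detector output: a label and an inclusive box
    (x0, y0, x1, y1) given in heatmap coordinates."""
    label: str
    x0: int
    y0: int
    x1: int
    y1: int

    def area(self) -> int:
        return (self.x1 - self.x0 + 1) * (self.y1 - self.y0 + 1)

    def contains(self, row: int, col: int) -> bool:
        return self.x0 <= col <= self.x1 and self.y0 <= row <= self.y1


def peak_of(heatmap: Sequence[Sequence[float]]) -> Tuple[int, int]:
    """Return (row, col) of the largest value in the heatmap."""
    best_row, best_col = 0, 0
    best_value = heatmap[0][0]
    for r, row in enumerate(heatmap):
        for c, value in enumerate(row):
            if value > best_value:
                best_row, best_col, best_value = r, c, value
    return best_row, best_col


def select_object(
    heatmap: Sequence[Sequence[float]],
    objects: List[DetectedObject],
) -> Optional[DetectedObject]:
    """Assumed rule (not the paper's): among boxes that contain the
    heatmap peak, pick the one with the smallest area."""
    row, col = peak_of(heatmap)
    hits = [obj for obj in objects if obj.contains(row, col)]
    if not hits:
        return None
    return min(hits, key=lambda obj: obj.area())


def annotate(
    subject: str,
    heatmap: Sequence[Sequence[float]],
    objects: List[DetectedObject],
) -> Optional[str]:
    """Build an annotation sentence, or None if no object is at the peak."""
    target = select_object(heatmap, objects)
    if target is None:
        return None
    return f"{subject} is looking at the {target.label}."


# Toy example: the peak is at row 2, column 1.
heatmap = [
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.2, 0.0],
    [0.0, 0.3, 0.1, 0.0],
]
objects = [
    DetectedObject("shirt", 0, 0, 3, 3),
    DetectedObject("tie", 0, 2, 2, 3),
    DetectedObject("cup", 3, 0, 3, 1),
]
print(annotate("J", heatmap, objects))  # J is looking at the tie.
```

Preferring the smallest enclosing box is a design choice made here so that a nested object (the tie) wins over the larger object that surrounds it (the shirt); a real system might instead weight boxes by the heatmap mass they cover.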