CNN-Based Novelty Detection with Effectively Incorporating Document-Level Information

조성웅; 오흥선; 임상훈; 김선호; Seongung Jo; Heung-Seon Oh; Sanghun Im; Seonho Kim

KIPS Transactions on Computer and Communication Systems (정보처리학회 논문지 컴퓨터 및 통신시스템)

Korean Title	효과적인 문서 수준의 정보를 이용한 합성곱 신경망 기반의 신규성 탐지
English Title	CNN-Based Novelty Detection with Effectively Incorporating Document-Level Information
Author	조성웅 오흥선 임상훈 김선호 (Seongung Jo, Heung-Seon Oh, Sanghun Im, Seonho Kim)
Citation	Vol. 9, No. 10, pp. 231-238 (October 2020)
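Korean Abstract (English translation)	As vast numbers of documents appear on the web, document-level novelty detection has become important because it reduces the effort of finding new documents by identifying and excluding documents whose content duplicates existing ones. A recent study proposed a convolutional neural network (CNN)-based novelty detection model that showed substantial performance improvements. In this paper, we observe that the existing CNN-based model makes only limited use of document-level information, and we hypothesize that this limited use is a problem because document-level information is important when determining a document's novelty. As a solution, we propose two methods that improve the CNN-based novelty detection model architecture so that it uses document-level information effectively. The proposed methods focus on constructing the feature vector of the target document, whose novelty is to be classified, by extracting relative information between the target document and the source documents given as evidence. Through several experiments on the standard benchmark dataset TAP-DLND 1.0, we demonstrate the superiority of the proposed methods.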
English Abstract	With large numbers of documents appearing on the web, document-level novelty detection has become important because it reduces the effort of finding novel documents by discarding those that repeat information already seen. A recent work proposed a convolutional neural network (CNN)-based novelty detection model that achieved significant performance improvements. We observed that this model makes only limited use of document-level information, and we hypothesized that this is a weakness because document-level information is important in determining novelty. As a solution, this paper proposes two methods that effectively incorporate document-level information into a CNN-based novelty detection model. Both methods focus on constructing the feature vector of the target document to be classified by extracting relative information between the target document and the source documents given as evidence. A series of experiments on a standard benchmark collection, TAP-DLND 1.0, showed the superiority of our methods.
Keywords	Deep Learning, Convolutional Neural Network (CNN), Novelty Detection
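
The abstract describes building the target document's feature vector from relative information between the target and its source documents, but it does not say how that relative information is computed. The following is a minimal sketch of one common way to do it, not the construction used in the paper. Everything in it is an assumption for illustration: documents are taken to be already encoded as fixed-length vectors (in the paper this encoding would come from the CNN), the source documents are summarized by their mean, and the hypothetical function `relative_features` concatenates the target vector with its element-wise absolute difference from, and product with, that mean.

```python
from typing import List, Sequence

Vector = List[float]


def mean_vector(vectors: Sequence[Sequence[float]]) -> Vector:
    """Average a non-empty collection of equal-length vectors."""
    if not vectors:
        raise ValueError("at least one source vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def relative_features(target: Sequence[float],
                      sources: Sequence[Sequence[float]]) -> Vector:
    """Hypothetical relative feature vector: [t, |t - s|, t * s].

    `s` is the mean of the source document vectors. This combination is an
    assumption made for illustration, not the method from the paper.
    """
    s = mean_vector(sources)
    if len(target) != len(s):
        raise ValueError("target and source vectors must have the same length")
    abs_diff = [abs(t - x) for t, x in zip(target, s)]
    product = [t * x for t, x in zip(target, s)]
    return list(target) + abs_diff + product


if __name__ == "__main__":
    # Toy 3-dimensional document vectors (made-up values).
    target_doc = [1.0, 0.0, 2.0]
    source_docs = [[1.0, 1.0, 0.0], [1.0, 3.0, 2.0]]
    print(relative_features(target_doc, source_docs))
```

In a setup like this, the resulting vector would be passed to a binary classifier that labels the target document as novel or non-novel. The difference and product terms give the classifier a direct signal of where the target departs from the evidence, rather than leaving it to infer that from the two documents encoded separately.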