정보처리학회 논문지 소프트웨어 및 데이터 공학 (KIPS Transactions on Software and Data Engineering)


한글제목(Korean Title) 파라메트릭 활성함수를 이용한 기울기 소실 문제의 완화
영문제목(English Title) Alleviation of Vanishing Gradient Problem Using Parametric Activation Functions
저자(Author) 고영민 (Young Min Ko), 고선우 (Sun Woo Ko)
원문수록처(Citation) Vol. 10, No. 10, pp. 407-420 (Oct. 2021)
한글내용(Korean Abstract)
심층신경망은 다양한 문제를 해결하는 데 널리 사용되고 있다. 하지만 은닉층이 깊은 심층신경망을 학습하는 동안 빈번히 발생하는 기울기 소실 또는 폭주 문제는 심층신경망 학습의 큰 걸림돌이 되고 있다. 본 연구에서는 기울기 소실이 발생하는 원인 중 비선형 활성함수에 의해 발생할 수 있는 기울기 소실 문제를 완화하기 위해 파라메트릭 활성함수를 제안한다. 제안된 파라메트릭 활성함수는 입력 데이터의 특성에 따라 활성함수의 크기 및 위치를 변환시킬 수 있는 파라미터를 적용하여 얻을 수 있으며, 역전파 과정을 통해 활성함수의 미분 크기에 제한 없이 손실함수가 최소화되도록 학습시킬 수 있다. 은닉층 수가 10개인 XOR 문제와 은닉층 수가 8개인 MNIST 분류 문제를 통하여 기존 비선형 활성함수와 파라메트릭 활성함수의 성능을 비교하였고, 제안한 파라메트릭 활성함수가 기울기 소실 완화에 우월한 성능을 가짐을 확인하였다.
영문내용(English Abstract)
Deep neural networks are widely used to solve a variety of problems. However, a deep neural network with many hidden layers frequently suffers from vanishing or exploding gradients, which are a major obstacle to training it. In this paper, we propose a parametric activation function to alleviate the vanishing gradient problem that can be caused by a nonlinear activation function. The proposed parametric activation function is obtained by applying parameters that can scale and shift the activation function according to the characteristics of the input data, and these parameters are trained through backpropagation so that the loss function is minimized without the magnitude of the activation function's derivative being constrained. On an XOR problem with 10 hidden layers and an MNIST classification problem with 8 hidden layers, we compared the performance of the original nonlinear activation functions and the parametric activation functions, and confirmed that the proposed parametric activation function is superior at alleviating the vanishing gradient.
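The abstract does not spell out the functional form of the proposed activation function, so the following is only a minimal PyTorch sketch of the idea it describes: a sigmoid base function given learnable scale and location parameters that are trained by backpropagation along with the network weights. The parameter names a, b, c, the per-layer scalar parameterization, the hidden width of the XOR network, and the optimizer settings are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

class ParametricSigmoid(nn.Module):
    """Sigmoid with learnable scale and location parameters (a sketch)."""

    def __init__(self):
        super().__init__()
        # a: output scale, b: input scale, c: location shift.
        # Names and initial values are illustrative assumptions.
        self.a = nn.Parameter(torch.ones(1))
        self.b = nn.Parameter(torch.ones(1))
        self.c = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        # a * sigmoid(b * (x - c)): because a and b are unconstrained,
        # the derivative a * b * s * (1 - s) is not capped at 0.25 the
        # way the plain sigmoid's derivative is, which is the mechanism
        # the abstract credits for alleviating vanishing gradients.
        return self.a * torch.sigmoid(self.b * (x - self.c))

# The 10-hidden-layer XOR experiment described in the abstract;
# only the depth is given there, so the width of 8 is assumed.
dims = [2] + [8] * 10 + [1]
layers = []
for i in range(len(dims) - 2):
    layers += [nn.Linear(dims[i], dims[i + 1]), ParametricSigmoid()]
layers.append(nn.Linear(dims[-2], dims[-1]))
net = nn.Sequential(*layers)

x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])
opt = torch.optim.SGD(net.parameters(), lr=0.1)
for _ in range(5000):
    opt.zero_grad()
    loss = nn.functional.binary_cross_entropy_with_logits(net(x), y)
    loss.backward()
    opt.step()
```

Swapping ParametricSigmoid for nn.Sigmoid in the same network reproduces, in spirit, the fixed nonlinear-activation baseline against which the abstract reports its comparison on the deep XOR and MNIST problems.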
키워드(Keyword) 심층신경망 (Deep Neural Network), 기울기 소실 문제 (Vanishing Gradient Problem), 파라메트릭 활성함수 (Parametric Activation Function), 역전파 (Backpropagation), 학습 (Learning)