

JIPS (Korea Information Processing Society)


Title: Audio and Video Bimodal Emotion Recognition in Social Networks Based on Improved AlexNet Network and Attention Mechanism
Author(s): Min Liu, Jun Tang
Citation: Vol. 17, No. 4, pp. 754-771 (Aug. 2021)
Korean Abstract: (not provided)
English Abstract:
In continuous dimensional emotion recognition, the parts of a signal that highlight emotional expression differ across modalities, and each modality influences the estimated emotional state to a different degree. This paper therefore studies the fusion of the two most informative modalities for emotion recognition, voice and visual (facial) expression, and proposes a bimodal emotion recognition method that combines an improved AlexNet network with an attention mechanism. After simple preprocessing of the audio and video signals, audio features are first extracted using prior knowledge. Facial expression features are then extracted by the improved AlexNet network. Finally, a multimodal attention mechanism fuses the facial expression and audio features, and an improved loss function mitigates the missing-modality problem, improving both the robustness of the model and its emotion recognition performance. Experimental results show that the proposed model achieves concordance correlation coefficients of 0.729 and 0.718 in the arousal and valence dimensions, respectively, outperforming several comparison algorithms.
Keywords: AlexNet Networks, Attention Mechanism, Concordance Correlation Coefficient, Deep Learning, Feature Layer Fusion, Multimodal Emotion Recognition, Social Networks
Attachment: PDF download
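Since the abstract only describes the method at a high level, the following is a minimal sketch, not the authors' code, of two pieces it mentions: an attention-weighted fusion of audio and facial-expression feature vectors, and the concordance correlation coefficient (CCC) used as the evaluation metric. The module name, feature dimensions (an eGeMAPS-sized audio vector and an AlexNet fc7-sized visual vector), and layer choices are assumptions for illustration only.

# Hypothetical sketch: modality-level attention fusion + CCC metric.
import torch
import torch.nn as nn


class BimodalAttentionFusion(nn.Module):
    """Fuse audio and video feature vectors with learned modality attention weights."""

    def __init__(self, audio_dim: int = 88, video_dim: int = 4096, fused_dim: int = 256):
        super().__init__()
        self.audio_proj = nn.Linear(audio_dim, fused_dim)  # project audio features
        self.video_proj = nn.Linear(video_dim, fused_dim)  # project AlexNet-style visual features
        self.attn = nn.Linear(fused_dim, 1)                # one attention score per modality
        self.head = nn.Linear(fused_dim, 2)                # predict (arousal, valence)

    def forward(self, audio_feat: torch.Tensor, video_feat: torch.Tensor) -> torch.Tensor:
        a = torch.tanh(self.audio_proj(audio_feat))        # (batch, fused_dim)
        v = torch.tanh(self.video_proj(video_feat))        # (batch, fused_dim)
        stacked = torch.stack([a, v], dim=1)               # (batch, 2, fused_dim)
        weights = torch.softmax(self.attn(stacked), dim=1) # (batch, 2, 1): modality attention
        fused = (weights * stacked).sum(dim=1)             # attention-weighted sum of modalities
        return self.head(fused)                            # continuous arousal/valence estimates


def concordance_correlation_coefficient(pred: torch.Tensor, gold: torch.Tensor) -> torch.Tensor:
    """CCC = 2*cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))**2)."""
    pred_mean, gold_mean = pred.mean(), gold.mean()
    covariance = ((pred - pred_mean) * (gold - gold_mean)).mean()
    return 2 * covariance / (pred.var(unbiased=False) + gold.var(unbiased=False)
                             + (pred_mean - gold_mean) ** 2)


if __name__ == "__main__":
    model = BimodalAttentionFusion()
    audio = torch.randn(8, 88)    # assumed audio descriptor size
    video = torch.randn(8, 4096)  # assumed AlexNet fc7 feature size
    arousal_pred = model(audio, video)[:, 0]
    arousal_gold = torch.rand(8)
    print(concordance_correlation_coefficient(arousal_pred, arousal_gold))

Here the softmax over the two projected modality vectors stands in for the modality-level attention weighting described in the abstract; the paper's actual fusion architecture and its improved loss for missing modalities may differ.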