TIIS (Korean Society for Internet Information)

Title: Robustness of Differentiable Neural Computer Using Limited Retention Vector-based Memory Deallocation in Language Model
Authors: Donghyun Lee, Hosung Park, Soonshin Seo, Hyunsoo Son, Gyujin Kim, and Ji-Hwan Kim
Citation: Vol. 15, No. 3, pp. 837-852 (Mar. 2021)
Abstract (English):
Recurrent neural network (RNN) architectures have been used for language modeling (LM) tasks that require learning long-range word or character sequences. However, the RNN architecture still suffers from unstable gradients on long-range sequences. To address this issue, attention mechanisms have been used, showing state-of-the-art (SOTA) performance in LM tasks. A differentiable neural computer (DNC) is a deep learning architecture that uses an attention mechanism. The DNC architecture is a neural network augmented with a content-addressable external memory. However, in the write operation, some information unrelated to the input word remains in memory. Moreover, DNCs have been found to perform poorly when the number of weight parameters is low. Therefore, we propose a robust memory deallocation method using a limited retention vector. The limited retention vector determines whether the network increases or decreases its usage of information in external memory according to a threshold. We experimentally evaluate the robustness of a DNC implementing the proposed approach with respect to the size of the controller and external memory on the enwik8 LM task. When we decreased the number of weight parameters by 32.47%, the proposed DNC showed a low bits-per-character (BPC) degradation of 4.30%, demonstrating the effectiveness of our approach in language modeling tasks.
Keywords: Differentiable Neural Computer (DNC), Language Model (LM), Memory Deallocation, Retention Vector, Robustness
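The abstract describes the limited retention mechanism only in words, so the following is a minimal NumPy sketch of the idea. The retention vector and usage update follow the standard DNC equations (Graves et al., 2016); the thresholded "limited" retention rule, the function and argument names, and the threshold value are assumptions inferred from the abstract, not the authors' exact formulation.

import numpy as np

def limited_retention_usage(u_prev, write_w_prev, free_gates, read_w_prev, threshold=0.5):
    # Standard DNC retention vector: psi[j] = prod_i (1 - f[i] * read_w_prev[i, j]).
    # psi[j] in [0, 1] says how strongly memory slot j should be kept.
    psi = np.prod(1.0 - free_gates[:, None] * read_w_prev, axis=0)
    # Standard usage update before deallocation.
    u = u_prev + write_w_prev - u_prev * write_w_prev
    # Limited retention (hypothetical reading of the abstract): deallocate
    # only slots whose retention falls below the threshold; leave the
    # usage of all other slots unchanged.
    limited_psi = np.where(psi < threshold, psi, 1.0)
    return u * limited_psi

# Toy example: 4 memory slots, 2 read heads. Slot 0 was heavily read and
# freed, so its retention (0.1) falls below the threshold and its usage is
# deallocated; the remaining slots keep their usage.
u_prev = np.array([0.9, 0.2, 0.5, 0.0])
write_w_prev = np.array([0.0, 0.1, 0.0, 0.8])
free_gates = np.array([1.0, 0.5])
read_w_prev = np.array([[0.9, 0.0, 0.1, 0.0],
                        [0.0, 0.2, 0.0, 0.0]])
print(limited_retention_usage(u_prev, write_w_prev, free_gates, read_w_prev))
# -> [0.09 0.28 0.5  0.8 ]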