KIISE Transactions on Computing Practices
Korean Title |
Korean Machine Reading Comprehension using RoBERTa |
English Title |
RoBERTa for Korean Machine Reading Comprehension |
Author |
Yun-Su Choi
Hye-Woo Lee
Tae-Hyeong Kim
Du-Seong Chang
Young-Hoon Lee
Seung-Hoon Na
|
Citation |
Vol. 27, No. 4, pp. 198-203 (Apr. 2021) |
Korean Abstract |
Machine reading comprehension is a natural language processing task of finding the answer to a given question within a paragraph. Recently, research on using language models trained on large amounts of data, such as BERT, for natural language processing has been ongoing. In this paper, we varied the tokenization scheme, e.g., to a form combining morpheme and grapheme units, trained and evaluated RoBERTa, and observed the performance changes according to the tokenization method. We also train RoBERTa, a modification of BERT, and propose a model that combines it with MCAF (Multi-level Co-Attention Fusion) for machine reading comprehension. In experiments using KorQuAD, a Korean machine reading comprehension dataset, the model achieved EM 87.62% and F1 94.61% on the dev set.
|
English Abstract |
Machine reading comprehension is a natural language processing task that finds the answer to a given question in a given paragraph. Recently, studies using language models trained on large amounts of data, such as BERT, for natural language processing have been in progress. In this paper, we adapted a tokenizer capable of analyzing text at the morpheme and grapheme level, trained RoBERTa, and evaluated the changes in benchmark scores depending on the tokenization method. In addition, we trained a RoBERTa model, a modified BERT, and propose a model that combines it with MCAF (Multi-level Co-Attention Fusion) for machine reading comprehension. Experiments with KorQuAD, a Korean machine reading comprehension dataset, showed an EM of 87.62% and an F1 of 94.61% on the dev set.
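The morpheme-plus-grapheme tokenization described in the abstract can be illustrated with a minimal sketch (the function names and the toy vocabulary below are hypothetical, not from the paper): units found in a vocabulary, standing in for a morpheme lexicon, are kept whole, while out-of-vocabulary Hangul syllables are decomposed into their constituent jamo (graphemes) using standard Unicode arithmetic for precomposed syllables.

```python
# Hypothetical sketch of morpheme + grapheme (jamo) tokenization.
# A precomposed Hangul syllable at code point S satisfies
#   S = 0xAC00 + (initial * 21 + medial) * 28 + final.

CHOSEONG = [chr(0x1100 + i) for i in range(19)]          # initial consonants
JUNGSEONG = [chr(0x1161 + i) for i in range(21)]         # medial vowels
JONGSEONG = [""] + [chr(0x11A8 + i) for i in range(27)]  # optional final consonants

def to_jamo(char: str) -> list[str]:
    """Decompose one precomposed Hangul syllable into its jamo;
    pass any other character through unchanged."""
    code = ord(char) - 0xAC00
    if not 0 <= code < 11172:  # outside the Hangul Syllables block
        return [char]
    cho, rest = divmod(code, 21 * 28)
    jung, jong = divmod(rest, 28)
    jamo = [CHOSEONG[cho], JUNGSEONG[jung]]
    if JONGSEONG[jong]:
        jamo.append(JONGSEONG[jong])
    return jamo

def tokenize(text: str, vocab: set[str]) -> list[str]:
    """Keep whitespace-separated units found in `vocab` (a stand-in for a
    morpheme vocabulary) whole; decompose everything else to graphemes."""
    tokens: list[str] = []
    for unit in text.split():
        if unit in vocab:
            tokens.append(unit)
        else:
            for ch in unit:
                tokens.extend(to_jamo(ch))
    return tokens
```

A real system would use a trained morpheme analyzer and subword vocabulary rather than exact whole-word lookup, but the grapheme fallback shown here is the standard jamo decomposition.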
|
Keywords |
machine reading comprehension
language model
tokenizing
RoBERTa
|