Korean Title: 데이터 증강을 이용한 한국표준산업분류 다국어 분류
English Title: Korean Standard Industry Classification Multilingual Classification Using Data Augmentation
Author: 우찬균, 오지은, 황정윤, 박재현, 김지우, 박진영 (Chankyun Woo, Jieun Oh, Jeongyun Hwang, Jaehyeon Park, Jiwoo Kim, Jinyong Pak)
Citation: VOL 49 NO. 02 PP. 0696 ~ 0698 (2022. 12)
English Abstract
In this paper, we build a model that automatically classifies the industry classification items collected through the foreign-language questionnaires given to foreign residents in the Census, which Statistics Korea conducts every five years. We use a pre-trained language model, an approach that has recently become widely used, and build the classifier on XLM-R to support multilingual classification. Because the available training data is in Korean, we first trained a model on Korean and tested it in 6 languages. We then machine-translated the Korean training data into English and compared the performance of the classification model by training language. In this comparison, the model trained on all 6 languages performed best across languages, with an average of 75%.
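
The augmentation and per-language comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `augment_by_translation`, `accuracy_by_language`, `macro_average`, and `fake_translate` are hypothetical names, the translator is a toy stand-in for a real machine translation system, and the labels are toy values. The abstract does not name the metric behind the 75% figure, so averaging per-language accuracy is an assumption made here for illustration. The XLM-R classifier itself is left out and represented by any `predict` callable.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# (text, language code, industry label); the labels used here are toy values.
Example = Tuple[str, str, str]


def augment_by_translation(
    korean_examples: Iterable[Tuple[str, str]],
    target_langs: List[str],
    translate: Callable[[str, str], str],
) -> List[Example]:
    """Keep each Korean example and add one translated copy per target language."""
    augmented: List[Example] = []
    for text, label in korean_examples:
        augmented.append((text, "ko", label))
        for lang in target_langs:
            # The label is unchanged: translation alters only the surface text.
            augmented.append((translate(text, lang), lang, label))
    return augmented


def accuracy_by_language(
    test_examples: Iterable[Example],
    predict: Callable[[str], str],
) -> Dict[str, float]:
    """Compute accuracy separately for each language in the test set."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for text, lang, label in test_examples:
        total[lang] += 1
        if predict(text) == label:
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


def macro_average(scores: Dict[str, float]) -> float:
    """Unweighted mean of the per-language scores (assumed averaging scheme)."""
    return sum(scores.values()) / len(scores)


def fake_translate(text: str, lang: str) -> str:
    """Toy stand-in for a machine translation system (hypothetical)."""
    return f"[{lang}] {text}"
```

For example, `augment_by_translation([("빵 제조", "C")], ["en", "vi"], fake_translate)` yields three training examples that share the label `"C"`: the Korean original plus one copy per target language. Models trained on different language subsets could then be compared by passing each model's `predict` function to `accuracy_by_language` and averaging the result.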