정보과학회논문지 (Journal of KIISE)
Korean Title |
위키피디아 기반 개체명 사전 반자동 구축 방법 |
English Title |
A Semi-automatic Construction method of a Named Entity Dictionary Based on Wikipedia |
Authors |
송영길
정석원
김학수
Yeongkil Song
Seokwon Jeong
Harksoo Kim
|
Citation |
Vol. 42, No. 11, pp. 1397-1403 (Nov. 2015) |
Çѱ۳»¿ë (Korean Abstract) |
°³Ã¼¸íÀº ´Ù¾çÇÑ ÀÚ¿¬¾îó¸® ¿¬±¸ ¹× ¼ºñ½º¿¡ Áß¿äÇÑ Á¤º¸·Î ÀÌ¿ëµÈ´Ù. °³Ã¼¸í ÀνÄÀÇ ¼º´ÉÀ» Çâ»ó½ÃÅ°±â À§ÇÑ ¿©·¯ ¿¬±¸¿¡¼ °³Ã¼¸í »çÀüÀ» ÀÌ¿ëÇÑ ÀÚÁúÀÌ °³Ã¼¸í ÀÎ½Ä ¼º´É¿¡ Å« ¿µÇâÀ» Áشٴ °ÍÀ» º¸ÀÌ°í ÀÖ´Ù. ±×·¯³ª °³Ã¼¸í »çÀüÀ» ±¸ÃàÇÏ´Â °ÍÀº ¸Å¿ì ½Ã°£ ¼Ò¸ðÀûÀÌ°í, Àη ¼Ò¸ðÀûÀÎ ÀÛ¾÷ÀÌ´Ù. À̸¦ ¿ÏÈÇϱâ À§Çؼ º» ³í¹®¿¡¼´Â °³Ã¼¸í »çÀüÀ» ¹ÝÀÚµ¿À¸·Î ±¸ÃàÇÏ´Â ¹æ¹ýÀ» Á¦¾ÈÇÑ´Ù. Á¦¾È ½Ã½ºÅÛÀº ´Éµ¿ÇнÀÀ» ÀÌ¿ëÇÏ¿© À§Å°Çǵð¾Æ ºÐ·ùÁ¤º¸·Î ±¸¼ºµÈ °¡»ó ¹®¼¸¦ °³Ã¼¸í ¹üÁÖ ´ç Çϳª¾¿ »ý¼ºÇÑ´Ù. ±×¸®°í Àß ¾Ë·ÁÁø Á¤º¸°Ë»ö ¸ðµ¨ÀÎ BM25¸¦ ÀÌ¿ëÇÏ¿© À§Å°Çǵð¾Æ ¿£Æ®¸®¿Í °¡»ó¹®¼ »çÀÌÀÇ À¯»çµµ¸¦ °è»êÇÑ´Ù. ¸¶Áö¸·À¸·Î À¯»çµµ¸¦ ¹ÙÅÁÀ¸·Î °¢ À§Å°Çǵð¾Æ ¿£Æ®¸®¸¦ °³Ã¼¸í ¹üÁÖ·Î ºÐ·ùÇÑ´Ù. ¼·Î ´Ù¸¥ 3Á¾·ùÀÇ °³Ã¼¸í¹üÁÖ ÁýÇÕ¿¡¼ ½ÇÇèÇÑ °á°ú, Á¦¾È ½Ã½ºÅÛÀº ¸ÅÅ©·Î Æò±Õ F1-Á¡¼ö 0.9028, ¸¶ÀÌÅ©·Î Æò±Õ F1-Á¡¼ö 0.9554À̶ó´Â ³ôÀº ¼º´ÉÀ» º¸¿´´Ù.
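The two reported scores use different averaging schemes, and a small worked example may make the distinction concrete. The sketch below is illustrative only: the class names and the per-class counts are hypothetical and are not taken from the paper. Macro averaging takes the mean of the per-class F1-scores, while micro averaging pools the counts over all classes before computing a single F1-score.

```python
def f1(tp, fp, fn):
    """F1-score from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_micro_f1(counts):
    """counts maps a class name to a (tp, fp, fn) tuple."""
    # Macro: average the per-class F1-scores, so every class weighs the same.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    # Micro: pool the counts first, so frequent classes dominate the score.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    return macro, micro


# Hypothetical counts: one large, well-classified class and one small, harder one.
example_counts = {"PERSON": (90, 5, 5), "LOCATION": (8, 4, 4)}
macro, micro = macro_micro_f1(example_counts)
print(f"macro F1 = {macro:.4f}, micro F1 = {micro:.4f}")
```

In this toy setting the micro average comes out higher than the macro average because the large class is classified more accurately than the small one. A gap in the same direction appears in the reported scores, although the abstract does not say what causes it.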
|
English Abstract |
A named entity (NE) dictionary is an important resource for the performance of NE recognition. However, it is not easy to construct an NE dictionary manually, since human annotation is time-consuming and labor-intensive. To save construction time and reduce human labor, we propose a semi-automatic system for the construction of an NE dictionary. The proposed system constructs one pseudo-document of Wiki-categories per NE class by using an active learning technique. Then, it calculates similarities between Wiki entries and pseudo-documents using the BM25 model, a well-known information retrieval model. Finally, it classifies each Wiki entry into an NE class based on these similarities. In experiments with three different types of NE class sets, the proposed system showed high performance (macro-average F1-score of 0.9028 and micro-average F1-score of 0.9554).
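The similarity and classification steps can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes each Wiki entry is represented by its list of category names, treats every category name as a single term, uses a common BM25 variant with the conventional defaults k1 = 1.2 and b = 0.75, and assigns the class whose pseudo-document scores highest. The function names, the class labels and the category lists are all hypothetical, and the active learning step that builds the pseudo-documents is not shown.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score every document in `docs` (name -> list of terms) against the query."""
    n_docs = len(docs)
    avg_len = sum(len(terms) for terms in docs.values()) / n_docs
    # Document frequency: the number of pseudo-documents containing each term.
    df = Counter(term for terms in docs.values() for term in set(terms))
    scores = {}
    for name, terms in docs.items():
        tf = Counter(terms)
        score = 0.0
        for term in set(query_terms):
            if tf[term] == 0:
                continue  # the term does not occur in this pseudo-document
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(terms) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores[name] = score
    return scores


def classify(entry_categories, pseudo_docs):
    """Return the NE class whose pseudo-document is most similar to the entry."""
    scores = bm25_scores(entry_categories, pseudo_docs)
    return max(scores, key=scores.get)


# Hypothetical pseudo-documents: one bag of Wiki-categories per NE class.
pseudo_docs = {
    "PERSON": ["living people", "1970 births", "american actors"],
    "LOCATION": ["cities in korea", "capitals in asia", "populated places"],
    "ORGANIZATION": ["companies of korea", "universities", "companies listed"],
}

print(classify(["capitals in asia", "cities in korea"], pseudo_docs))
```

If an entry shares no category with any pseudo-document, every score is zero and `classify` simply returns the first class, so a practical system would likely need a minimum-score threshold or a "none" class. Whether the proposed system does this is not stated in the abstract.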
|
Keywords |
named entity dictionary construction
Wikipedia
information retrieval technique
active learning
|