Journal of KIISE B: Software and Applications
Korean Title |
Recognizing Page Titles and Infobox Attributes for Answer Extraction from Wikipedia Infoboxes in Question Answering |
English Title |
Recognizing Page Title and Infobox Attribute for Answer Extraction from Wikipedia Infobox |
Author |
허정
류법모
김현기
박상규
옥철영
Jeong Heo
Pum Mo Ryu
Hyun Ki Kim
Sang Kyu Park
Cheol Young Ock
|
Citation |
VOL 40 NO. 09 PP. 0544 ~ 0557 (2013. 09) |
Korean Abstract |
This paper proposes methods for page title recognition and Infobox attribute restriction for question analysis in Wikipedia Infobox question answering. Wikipedia is a semi-structured knowledge source that contains page titles, body text, Infoboxes, and other information. The Infobox in particular describes important information related to the page title in a structured, tabular form. For Wikipedia Infobox question answering, it is therefore very important to recognize the Wikipedia page title and the Infobox attribute information contained in a question. To recognize page titles and Infobox attributes, this paper proposes a noun-based variable-length sliding window method and a method using lexico-semantic patterns. It additionally proposes a syllable-based variable-length sliding window method to improve page title recognition. To restrict Infobox attributes, it proposes a restriction method based on answer types. As evaluation data, 398 questions targeting Wikipedia Infoboxes were constructed manually. In the experiments, the precision of recognizing page title and Infobox attribute pairs in the questions was 60.05%. This means that for about 60% of the questions targeting Wikipedia Infoboxes, the answer can be obtained without page or passage retrieval and answer extraction. |
English Abstract |
For question analysis in Wikipedia Infobox Q&A, this paper proposes a method for recognizing the title of a Wikipedia page and restricting the Infobox attributes. Wikipedia is a semi-structured knowledge source that incorporates a variety of information, such as titles, contents, and Infoboxes. The Infobox is especially significant because it describes title-related information in a structured fashion using tables. Therefore, to perform Wikipedia Infobox Q&A successfully, it is essential to recognize the titles and Infobox attributes included in the queries. This paper proposes a noun-based variable-length sliding window method and a lexico-semantic pattern method for the respective recognition tasks. To further improve title recognition, we additionally use a syllable-based variable-length sliding window method. To restrict the space of Infobox attributes, we apply a method based on answer types. For evaluation, 398 Infobox-related questions were constructed manually. Experiments showed that the precision for recognizing titles and Infobox attributes in the questions was 60.05%. This suggests that approximately 60% of the Infobox-related questions could be answered without searching the page contents and extracting answers from them. |
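The abstract does not spell out the algorithms, so the following is only a minimal sketch of how a noun-based variable-length sliding window and an answer-type restriction could look. It assumes the question has already been reduced to a list of nouns (morphological analysis is out of scope), that titles are looked up by exact match in a set, and that each Infobox attribute carries a single answer-type label. The names `find_titles`, `restrict_attributes`, `max_len`, and the sample data are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set, Tuple


def find_titles(nouns: List[str], titles: Set[str],
                max_len: int = 4) -> List[Tuple[int, int, str]]:
    """Scan the noun sequence with a window whose length varies from
    max_len down to 1 and keep the longest title match at each position.
    Greedy longest-first matching is an assumption of this sketch."""
    matches = []
    i = 0
    while i < len(nouns):
        found = None
        # Try the widest window first, then shrink it.
        for n in range(min(max_len, len(nouns) - i), 0, -1):
            candidate = " ".join(nouns[i:i + n])
            if candidate in titles:
                found = (i, i + n, candidate)
                break
        if found:
            matches.append(found)
            i = found[1]  # continue after the matched span
        else:
            i += 1
    return matches


def restrict_attributes(attributes: Dict[str, str],
                        answer_type: str) -> List[str]:
    """Keep only the attributes whose (hypothetical) type label equals
    the answer type expected by the question."""
    return [name for name, typ in attributes.items() if typ == answer_type]


# Hypothetical data for illustration only.
titles = {"seoul", "seoul national university"}
nouns = ["seoul", "national", "university", "founding", "year"]
attributes = {"established": "DATE", "president": "PERSON",
              "location": "LOCATION"}

title_spans = find_titles(nouns, titles)
candidates = restrict_attributes(attributes, "DATE")
```

Under these assumptions, the longer title wins over its prefix "seoul", and a question expecting a date is matched only against date-typed attributes. A syllable-based variant would apply the same window over syllables instead of nouns.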
Keyword |
Wikipedia question answering
question analysis
Wikipedia Infobox
Infobox attribute restriction
question and answering for wikipedia
question analysis
wikipedia infobox
restriction of infobox attribute
|