Database Research Society Journal (SIGDB)

Korean Title: 주제 기반 뉴스 기사 수집을 위한 메타 속성 융합형 기계학습 아키텍처
English Title: A Machine Learning Architecture Incorporating Meta-features for Topical News Filtering
Authors: 김태준 (Tae-jun Kim), 김한준 (Han-joon Kim)
Citation: Vol. 33, No. 1, pp. 3-14 (April 2017)
Korean Abstract (English translation)
Existing topical crawling techniques based on keyword matching suffer from collecting many documents that fall outside the given topic. To filter out news articles unrelated to fire incidents, this paper proposes an effective machine learning architecture that performs an ensemble process over a feature set fusing conventional bag-of-words features with meta-feature data. The learned models obtained by applying the two feature types to various machine learning algorithms pass through an appropriate ensemble process and contribute to effective filtering for topical crawling. The ensemble model of the proposed method outperformed the classification models of existing methods. Specifically, compared with the naive Bayes model, the best-performing existing model, it recorded a precision of 93.9%, which is 8.1% higher, and an F1 measure of 91.1%, which is 1% higher. In addition, the learned model obtained with the proposed method showed a precision-recall curve better suited to filtering.
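The precision and F1 figures above also determine a recall value, which the abstract does not state. As a rough check, assuming the reported 93.9% and 91.1% are the usual precision and balanced F1 measured on the same test set, the sketch below recovers the implied recall. The function names are illustrative and do not come from the paper.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of articles kept by the filter that are truly on-topic."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of truly on-topic articles that the filter keeps."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Balanced F1: harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


def recall_from_precision_f1(p: float, f1: float) -> float:
    """Solve F1 = 2pr / (p + r) for r. Requires 2p > F1."""
    return f1 * p / (2 * p - f1)


# Figures taken from the abstract; the recall is derived here, not reported.
reported_precision = 0.939
reported_f1 = 0.911
implied_recall = recall_from_precision_f1(reported_precision, reported_f1)
print(f"implied recall: {implied_recall:.3f}")  # roughly 0.885
```

A precision well above the recall is consistent with a filter tuned to keep off-topic articles out of the crawl, at the cost of missing some relevant ones.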
English Abstract
Topical crawling based on keyword matching tends to collect many documents that deviate from the given topic. This paper proposes a machine learning architecture that filters out news articles unrelated to fire events. The architecture runs an ensemble process over a feature set that combines conventional bag-of-words features with meta-feature data. Several models, obtained by applying the two feature types to various machine learning algorithms, are combined through an appropriate ensemble process to filter documents for topical crawling. The proposed ensemble model outperforms the conventional classification models: compared with the best-performing baseline, a naive Bayes model, it is 8.1% higher in precision and 1% higher in F1-score. The learned model also shows a precision-recall curve better suited to filtering.
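The abstract does not list the meta-features, the learning algorithms, or the ensemble rule, so the following is only a minimal sketch of the general idea: one model trained on bag-of-words features, one on meta-features, combined by weighted soft voting. Everything specific here is an assumption made for illustration: the `section` meta-attribute, the use of naive Bayes for both models, the equal weights, the 0.5 threshold, and the toy articles.

```python
import math
from collections import Counter


class BowNaiveBayes:
    """Multinomial naive Bayes over whitespace tokens, Laplace-smoothed."""

    def fit(self, docs, labels):
        labels = list(labels)
        self.classes = sorted(set(labels))
        self.prior = {c: math.log(labels.count(c) / len(labels))
                      for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc.lower().split())
        self.vocab = set().union(*self.counts.values())
        return self

    def predict_proba(self, doc):
        tokens = [w for w in doc.lower().split() if w in self.vocab]
        scores = {}
        for c in self.classes:
            total = sum(self.counts[c].values()) + len(self.vocab)
            scores[c] = self.prior[c] + sum(
                math.log((self.counts[c][w] + 1) / total) for w in tokens)
        # Normalise log scores into probabilities (numerically stable softmax).
        top = max(scores.values())
        exp = {c: math.exp(s - top) for c, s in scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def meta_tokens(meta):
    """Encode meta-attributes as 'key=value' tokens (values without spaces)."""
    return " ".join(f"{k}={v}" for k, v in sorted(meta.items()))


class MetaFusionEnsemble:
    """Soft-voting ensemble of a text model and a meta-attribute model."""

    def __init__(self, weight_text=0.5):
        self.weight_text = weight_text
        self.text_model = BowNaiveBayes()
        self.meta_model = BowNaiveBayes()

    def fit(self, articles, labels):
        self.text_model.fit([a["text"] for a in articles], labels)
        self.meta_model.fit([meta_tokens(a["meta"]) for a in articles], labels)
        return self

    def relevance(self, article, positive="fire"):
        p_text = self.text_model.predict_proba(article["text"])[positive]
        p_meta = self.meta_model.predict_proba(
            meta_tokens(article["meta"]))[positive]
        return self.weight_text * p_text + (1 - self.weight_text) * p_meta

    def keep(self, article, threshold=0.5):
        return self.relevance(article) >= threshold


# Toy training data (invented): "fire" is ambiguous without context.
train = [
    ({"text": "fire breaks out in apartment building firefighters respond",
      "meta": {"section": "society"}}, "fire"),
    ({"text": "warehouse fire injures two workers",
      "meta": {"section": "society"}}, "fire"),
    ({"text": "striker on fire scores hat trick",
      "meta": {"section": "sports"}}, "other"),
    ({"text": "company to fire hundreds of employees",
      "meta": {"section": "business"}}, "other"),
]
articles, labels = zip(*train)
model = MetaFusionEnsemble().fit(list(articles), list(labels))

on_topic = {"text": "factory fire spreads as firefighters respond",
            "meta": {"section": "society"}}
off_topic = {"text": "striker on fire after late goal",
             "meta": {"section": "sports"}}
print(model.keep(on_topic), model.keep(off_topic))
```

Both toy test articles contain the keyword "fire", so keyword matching alone would collect both. The ensemble separates them because the surrounding words and the meta-attribute point in different directions. Raising `threshold` trades recall for precision, which is the trade-off a precision-recall curve summarises.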
Keywords: machine learning, text classification, feature engineering, ensemble, web crawling, bag-of-words