Korean Title |
Meta-Feature Engineering for Machine Learning-Based Visualization Recommendation |
English Title |
Meta-Feature Engineering for Machine Learning-based Visualization Recommendation |
Author |
최희원
Hee-won Choi
김한준
Han-joon Kim
|
Citation |
VOL 36 NO. 02 PP. 0069 ~ 0083 (2020. 08) |
Korean Abstract |
This paper confirms the feasibility of an automatic data visualization system based on feature engineering and introduces the process of designing the metadata that underlies a visualization recommendation system. Before building the automatic visualization system, we extract meta-level feature variables that determine how meaningful a visualization produced from the given input data is. In this process, we propose meta-features for learning the patterns of bar charts and pie charts, which are ambiguous to tell apart. We also propose mean-shift clustering, an automatic discretization technique, so that numerical attributes can be visualized as summary information. The importance of the generated meta-features was evaluated with SHAP (SHapley Additive exPlanations), and the top 5 important features were derived from the 48 meta-features. We further showed that numerical feature values converted into summary data by mean-shift clustering produce meaningful visualizations. In experiments on various kinds of data, the recommendation models generated with the decision tree and random forest algorithms using the proposed meta-features produced the most meaningful visualization results.
|
English Abstract |
This paper demonstrates the feasibility of an automated data visualization system built on feature engineering and describes the meta-feature design process that underlies a visualization recommendation system. Before building the automated visualization system, we extract meta-features that determine whether the visualization produced from given input data is meaningful. We propose a set of meta-features that help a model learn the ambiguous patterns that distinguish bar plots from pie plots. We also propose using mean-shift clustering so that numerical features can be expressed as summary data and visualized more meaningfully. The proposed meta-features were evaluated with SHAP (SHapley Additive exPlanations) in terms of feature importance in the learned model, and the top 5 most important features among the 48 meta-features were identified. In addition, we show that numeric feature values converted into summary data by mean-shift clustering yield meaningful visualization results. In experiments on various types of data, the visualization recommendation models built with the decision tree and random forest algorithms produced the most meaningful visualization results.
|
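The abstract describes using mean-shift clustering to turn a numeric attribute into summary data that a bar or pie plot can show. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a flat (uniform) kernel, a hand-picked bandwidth, and a one-dimensional column, none of which the abstract specifies. The names `mean_shift_1d`, `discretize`, and the `ages` data are hypothetical.

```python
from collections import Counter


def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-6):
    """Move each point to the mean of its neighbours until it stops moving.

    Assumption: flat kernel, i.e. neighbours are the original values that lie
    within `bandwidth` of the current position. Returns one mode per value.
    """
    modes = []
    for start in values:
        x = float(start)
        for _ in range(max_iter):
            neighbours = [v for v in values if abs(v - x) <= bandwidth]
            if not neighbours:  # defensive guard against rounding at the window edge
                break
            new_x = sum(neighbours) / len(neighbours)
            if abs(new_x - x) < tol:
                break
            x = new_x
        modes.append(x)
    return modes


def discretize(values, bandwidth):
    """Assign a cluster label to each value by grouping nearby modes."""
    modes = mean_shift_1d(values, bandwidth)
    centers = []  # one representative mode per cluster, in order of discovery
    labels = []
    for m in modes:
        for i, c in enumerate(centers):
            if abs(m - c) <= bandwidth:
                labels.append(i)
                break
        else:
            # No existing center is close enough, so start a new cluster.
            centers.append(m)
            labels.append(len(centers) - 1)
    return labels, centers


# Hypothetical numeric column with three visible groups.
ages = [21, 22, 23, 24, 40, 41, 42, 43, 70, 71, 72]
labels, centers = discretize(ages, bandwidth=5)

# Cluster label -> number of rows: summary data suitable for a bar or pie plot.
summary = Counter(labels)
```

Unlike fixed-width binning, the number of categories here is not chosen in advance; it follows from the density of the data and the bandwidth. This is one plausible reason to treat mean-shift as an automatic discretization step, though the bandwidth still has to be set or estimated.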
Keyword |
Machine Learning
Data visualization
Feature Engineering
Metadata
Mean-shift clustering
|