정보과학회 컴퓨팅의 실제 논문지 (KIISE Transactions on Computing Practices)
Korean Title |
맵리듀스를 이용한 다중 중심점 집합 기반의 효율적인 클러스터링 방법 |
English Title |
An Efficient Clustering Method based on Multi Centroid Set using MapReduce |
Author |
강성민
이석주
민준기
Sungmin Kang
Seokjoo Lee
Jun-ki Min
|
Citation |
Vol. 21, No. 7, pp. 494-499 (July 2015) |
Korean Abstract |
As data sizes grow, analyzing large volumes of data to understand their characteristics has become very important. This paper proposes the MCSK-Means (Multi centroid set k-Means) algorithm, an effective clustering technique based on k-Means clustering that uses MapReduce, a distributed parallel processing framework. The k-Means algorithm has the problem that the accuracy of its clustering result is strongly affected by the positions of the k randomly chosen initial centroids. To solve this problem, the proposed MCSK-Means algorithm uses m centroid sets, each consisting of k centroids, which reduces the dependence on randomly generated initial centroids. In addition, a hierarchical clustering algorithm is applied directly to the centroids belonging to the m centroid sets produced by the clustering phase, generating k cluster centroids. The MCSK-Means algorithm was developed in a MapReduce framework environment so that it can process large volumes of data efficiently.
|
English Abstract |
As the size of data increases, it becomes important to identify the properties of big data by analyzing it. In this paper, we propose an efficient k-Means based clustering technique, called MCSK-Means (Multi centroid set k-Means), using MapReduce, a distributed parallel processing framework. A problem with the k-Means algorithm is that the accuracy of clustering depends on the randomly created initial centroids. To alleviate this problem, the MCSK-Means algorithm reduces the dependency on the initial centroids by using m centroid sets, each consisting of k centroids. In addition, we apply an agglomerative hierarchical clustering technique to create k centroids from the centroids in the m centroid sets that result from the clustering phase. We implemented MCSK-Means on the MapReduce framework to process big data efficiently.
|
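The procedure outlined in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' MapReduce implementation: the function names (`lloyd_step`, `agglomerate`, `mcsk_means`), the fixed iteration count, the use of centroid linkage for merging, and the handling of empty clusters are all assumptions made for illustration. In a MapReduce setting, the per-point assignment would plausibly correspond to the map phase and the averaging to the reduce phase, but the abstract does not specify this.

```python
import random
from math import dist  # Euclidean distance, available since Python 3.8


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def lloyd_step(points, centroids):
    """One k-Means iteration: assign each point to its nearest centroid,
    then recompute each centroid as the mean of its assigned points."""
    groups = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
        groups[nearest].append(p)
    # Assumption of this sketch: an empty cluster keeps its previous centroid.
    return [mean(g) if g else c for g, c in zip(groups, centroids)]


def agglomerate(centroids, k):
    """Repeatedly merge the two closest clusters until k remain.
    Closeness is measured between cluster means (centroid linkage, assumed)."""
    clusters = [[c] for c in centroids]
    while len(clusters) > k:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: dist(mean(clusters[ij[0]]),
                                              mean(clusters[ij[1]])))
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return [mean(c) for c in clusters]


def mcsk_means(points, k, m, iterations=10, seed=0):
    """Refine m centroid sets of k centroids each, then reduce the pooled
    m * k centroids to k final centroids by agglomerative merging."""
    rng = random.Random(seed)
    # m independent centroid sets, each initialized with k sampled points.
    centroid_sets = [rng.sample(points, k) for _ in range(m)]
    for _ in range(iterations):
        centroid_sets = [lloyd_step(points, cs) for cs in centroid_sets]
    pooled = [c for cs in centroid_sets for c in cs]
    return agglomerate(pooled, k)
```

For example, `mcsk_means(points, k=2, m=3)` on a list of 2-D tuples returns two final centroids. Because several independently initialized centroid sets are pooled before merging, a single poor random initialization has less influence on the final result, which is the motivation the abstract gives.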
Keyword |
data mining
MapReduce
k-Means algorithm
clustering
big data
|