KIPS Transactions on Computer and Communication Systems
Korean Title |
GPGPU를 활용한 스파크 기반 공간 연산 |
English Title |
Spatial Computation on Spark Using GPGPU |
Author |
손찬승
김대희
박능수
Chanseung Son
Daehee Kim
Neungsoo Park
|
Citation |
VOL 05 NO. 08 PP. 0181 ~ 0188 (2016. 08) |
Korean Abstract (English translation) |
Many studies are under way to process the rapidly growing volume of spatial data efficiently. Spatial database systems that extend existing relational database systems have scalability problems, and SpatialHadoop, which extends the distributed processing platform Hadoop, writes intermediate results to disk, so file I/O overhead degrades its performance. This paper proposes Spatial Computation Spark, which extends Spark, an in-memory distributed processing framework. A model that combines it with GPGPU was also developed to improve the performance of Spatial Computation Spark. Spatial Computation Spark retains Spark's property of keeping intermediate results in memory, and GPGPU-based Spatial Computation Spark performs spatial operations efficiently because it processes them in parallel on a large number of PEs. Both Spatial Computation Spark and GPGPU-based Spatial Computation Spark were implemented on a single AMD system. To evaluate their performance, Point-in-Polygon and Spatial Join operations were run, and a performance improvement of up to 8 times over SpatialHadoop was confirmed. |
English Abstract |
Recently, as the volume of spatial data has grown, interest in spatial information processing has increased. Spatial database systems that extend traditional relational database systems have difficulty handling large data sets because of limited scalability. SpatialHadoop, which extends Hadoop, suffers degraded performance because its spatial computations write many intermediate results to disk. This paper proposes Spatial Computation Spark (SC-Spark), which extends Spark, an in-memory distributed processing framework, to perform spatial operations efficiently on large-scale data. In addition, a GPGPU-based SC-Spark is developed to improve the performance of SC-Spark. SC-Spark exploits Spark's ability to hold intermediate results in memory, and GPGPU-based SC-Spark performs spatial operations in parallel using the many processing elements of a GPU. To verify the proposed work, experiments were performed on a single AMD system using SC-Spark and GPGPU-based SC-Spark for Point-in-Polygon and spatial join operations. The experimental results showed that SC-Spark and GPGPU-based SC-Spark were up to 8 times faster than SpatialHadoop. |
Keyword |
Spark
OpenCL
Big Data
Spatial Data
GPGPU
|