정보과학회논문지 (Journal of KIISE)
Korean Title |
Cache Performance Benefit Prediction for In-Memory Analytics Frameworks |
English Title |
Predicting the Cache Performance Benefits for In-memory Data Analytics Frameworks |
Author |
정민섭
한환수
Minseop Jeong
Hwansoo Han
|
Citation |
VOL 48 NO. 05 PP. 0479 ~ 0485 (2021. 05) |
Korean Abstract |
In-memory data analytics frameworks provide a facility for caching computed intermediate values to improve performance. To cache more effectively in an application, the resulting performance benefit must be taken into account. Because existing frameworks measure execution time only at the distributed-task level, they are limited in predicting an application's cache performance benefit. This paper proposes an operator-level time measurement method that incorporates the existing task-level execution time measurement, together with a model that predicts function cost according to input data size. It also proposes a method for predicting the performance benefit from cached datasets, based on the proposed model and the application's execution flow. The proposed model and prediction method provide opportunities for caching optimization that takes cache performance benefits into account. The proposed computing cost model showed an average error of 7.3% on 10x input data, and the performance benefit predicted with the model differed from the actual performance benefit by no more than 24%.
|
English Abstract |
In-memory data analytics frameworks provide facilities for caching intermediate results to improve performance. For effective caching, the actual performance benefits from cached data should be taken into consideration. Because existing frameworks measure execution times only at the distributed task level, they are limited in how accurately they can predict cache performance benefits. In this paper, we propose an operator-level time measurement method that incorporates the existing task-level execution time measurement, along with a model that predicts operator cost according to input data size. Based on the proposed model and the execution flow of the application, we also propose a method for predicting the performance benefits of data caching. The proposed model and prediction method provide opportunities for cache optimization guided by predicted performance benefits. Our operator cost model showed an average prediction error of 7.3% when measured with 10x input data. The predicted performance benefits differed from the actual performance benefits by no more than 24%.
|
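The abstract describes two pieces: a per-operator cost model driven by input data size, and a benefit estimate derived from that model and the application's execution flow. The record does not state the model's form, so the following is only a minimal sketch under stated assumptions: a linear cost model fitted by least squares, a single shared input size for every operator in the lineage, and a cache read cost of zero. All names (`LinearCostModel`, `fit_linear`, `predicted_cache_benefit`) are hypothetical and are not taken from the paper or from Apache Spark.

```python
# Illustrative sketch only: the linear model and every name here are
# assumptions, not the method described in the paper.
from dataclasses import dataclass


@dataclass
class LinearCostModel:
    """Predicts an operator's time as slope * input_size + intercept."""
    slope: float
    intercept: float

    def predict(self, input_size: float) -> float:
        return self.slope * input_size + self.intercept


def fit_linear(sizes, times):
    """Ordinary least-squares fit of measured time against input size."""
    n = len(sizes)
    if n < 2 or n != len(times):
        raise ValueError("need at least two (size, time) pairs")
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("input sizes must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    slope = sxy / sxx
    return LinearCostModel(slope, mean_y - slope * mean_x)


def predicted_cache_benefit(lineage_models, input_size, reuse_count):
    """Estimated time saved by caching a dataset read reuse_count times.

    Without caching, every read recomputes each operator in the lineage;
    with caching, the lineage runs once. Cache read cost is ignored.
    """
    recompute_cost = sum(m.predict(input_size) for m in lineage_models)
    return max(reuse_count - 1, 0) * recompute_cost
```

For example, fitting each operator on small-input measurements and then calling `predicted_cache_benefit(models, input_size=10 * base_size, reuse_count=3)` would extrapolate the saving to a 10x input, which mirrors the evaluation setting mentioned in the abstract without reproducing its method.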
Keyword |
caching
computing cost model
performance benefit prediction
distributed data processing
parallel data anal
Apache Spark
system optimization
|