KIISE Transactions on Computing Practices
ÇѱÛÁ¦¸ñ(Korean Title) |
OpenMP µð¹ÙÀ̽º ÄÁ½ºÆ®·°Æ®ÀÇ CUDA ¼Ò½º ÄÚµå·ÎÀÇ º¯È¯ ¹× ·±Å¸ÀÓ ÃÖÀûÈ ±â¹ý |
English Title |
Source-level Translation of OpenMP Device Constructs to CUDA and Runtime Optimization Methods |
Author |
박대영
이재진
Daeyoung Park
Jaejin Lee
|
Citation |
VOL 27 NO. 02 PP. 0110 ~ 0115 (2021. 02) |
Çѱ۳»¿ë (Korean Abstract) |
º» ³í¹®Àº OpenMP 4.5 device construct¸¦ ÀÌ¿ëÇÏ¿© °³¹ßµÈ C ¼Ò½º Äڵ带 ´ëÀÀÇÏ´Â CUDA ¼Ò½º ÄÚµå·Î º¯È¯ÇÏ´Â ÄÄÆÄÀÏ·¯¿Í À̸¦ Áö¿øÇÏ´Â ·±Å¸ÀÓ ½Ã½ºÅÛÀ» Á¦¾ÈÇÑ´Ù. ¸ÕÀú, OpenMPÀÇ ½ÇÇà ¸ðµ¨, ¸Þ¸ð¸® ¸ðµ¨ ¹× µ¿±âÈ °úÁ¤À» »ìÆ캸°í, source-level º¯È¯ÀÇ ¹æ¹ýÀ» ¼³¸íÇÑ´Ù. ¶ÇÇÑ, ¼º´É Çâ»óÀ» À§ÇØ °í¾ÈµÈ ¹öµð ÇÒ´çÀÚ, UDTE¿Í °°Àº ·±Å¸ÀÓ ½Ã½ºÅÛ ÃÖÀûÈ ±â¼úÀ» ¼Ò°³ÇÑ´Ù. ½ÇÇèÀº SPEC-ACCEL 1.2 º¥Ä¡¸¶Å©¸¦ ÀÌ¿ëÇÑ´Ù. ½ÇÇè °á°ú ºñ±³ ´ë»óÀÎ gcc7 ´ëºñ 6¹è ÀÌ»ó, mriq¸¦ Á¦¿ÜÇÑ °æ¿ì¿¡µµ 2¹è ÀÌ»óÀÇ ¼º´É Çâ»óÀ» °¡Á®¿Ô´Ù. º» ³í¹®ÀÇ ÇÁ·¹ÀÓ¿öÅ©¸¦ ¹ÙÅÁÀ¸·Î ÇâÈÄ ÄÄÆÄÀÏ·¯ ¹× ·±Å¸ÀÓ ÃÖÀûÈ ±â¼úÀ» Ãß°¡ÀûÀ¸·Î °³¹ßÇÒ ¼ö ÀÖÀ» °ÍÀ¸·Î ±â´ëµÈ´Ù
|
English Abstract |
This paper presents an OpenMP framework for GPU offloading. The framework consists of a compiler and a runtime system that together convert C programs written with OpenMP 4.5 device constructs into CUDA programs. First, we review the execution model, memory model, and synchronization process of OpenMP and explain how the translation is performed at the source level. We then describe runtime optimization techniques, such as a buddy allocator and UDTE, that improve execution performance. On the SPEC-ACCEL 1.2 benchmark suite, the framework performs up to 6 times better than the gcc7 framework. We expect that additional runtime and compiler optimization techniques can be built on the framework presented in this paper.
|
Keyword |
OpenMP
device offloading
CUDA
source-level translation
runtime optimization methods
|