KIISE Transactions on Computing Practices (정보과학회 컴퓨팅의 실제 논문지)

Korean Title: OpenMP 디바이스 컨스트럭트의 CUDA 소스 코드로의 변환 및 런타임 최적화 기법
English Title: Source-level Translation of OpenMP Device Constructs to CUDA and Runtime Optimization Methods
Authors: 박대영 (Daeyoung Park), 이재진 (Jaejin Lee)
Citation: Vol. 27, No. 2, pp. 110-115 (February 2021)
Korean Abstract (English translation)
This paper proposes a compiler that translates C source code written with OpenMP 4.5 device constructs into the corresponding CUDA source code, together with a runtime system that supports it. We first review the execution model, memory model, and synchronization process of OpenMP, and then describe the method of source-level translation. We also introduce runtime system optimization techniques devised to improve performance, such as a buddy allocator and UDTE. The experiments use the SPEC-ACCEL 1.2 benchmark. The results show a performance improvement of more than 6 times over gcc7, the comparison target, and of more than 2 times even when mriq is excluded. We expect that further compiler and runtime optimization techniques can be developed on top of the framework presented in this paper.
English Abstract
This paper presents an OpenMP framework for GPU offloading. The framework consists of a compiler that converts C programs written with OpenMP 4.5 device constructs into CUDA programs and a runtime system that supports them. First, we review the execution model, memory model, and synchronization process of OpenMP, and explain how the translation is performed at the source level. We then apply runtime optimization techniques, such as a buddy allocator and UDTE, to improve execution performance. On the SPEC-ACCEL 1.2 benchmark suite, the framework achieves up to 6 times better performance than the gcc7 framework. We expect that additional runtime and compiler optimization techniques can be applied on top of the framework presented in this paper.
Keywords: OpenMP, device offloading, CUDA, source-level translation, runtime optimization methods
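
The abstract describes translating C programs that use OpenMP 4.5 device constructs into CUDA. As a rough illustration of the kind of input such a translator takes, the following is a minimal sketch of a loop offloaded with a `target` construct. The function name `saxpy_offload` and the computation are hypothetical and are not taken from the paper. A source-level translator would typically turn the loop body into a GPU kernel and the `map` clauses into device allocations and host/device copies; how this framework does so is not shown in the record above.

```c
/* Hypothetical example input, not code from the paper:
 * y[i] = a * x[i] + y[i], offloaded with an OpenMP 4.5 device construct. */
void saxpy_offload(int n, float a, const float *x, float *y)
{
    /* map(to: ...) copies x to the device before the region runs.
     * map(tofrom: ...) copies y to the device and back afterwards.
     * A compiler without OpenMP support ignores the pragma, so the
     * loop then simply runs serially on the host. */
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Calling `saxpy_offload(4, 2.0f, x, y)` on two four-element arrays updates `y` in place, whether or not an offload device is available.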