
KIISE Transactions on Computing Practices


Korean Title: 이기종 프로세서에서의 딥 러닝 응용 성능 향상을 위한 병렬화 및 파이프라이닝 기법
English Title: Accelerating a Deep Learning Application by Parallelization and Pipelining on Heterogeneous Processors
Author(s): Samnieng Tan, EunJin Jeong, Jangryul Kim, Jaeseong Lee, Soonhoi Ha
Citation: Vol. 27, No. 10, pp. 497-502 (October 2021)
Korean Abstract
As the need for deep learning applications in embedded systems grows, processing elements other than the CPU are being included in embedded devices to accelerate these applications. The NVIDIA Jetson AGX Xavier is a representative example: in addition to an 8-core CPU, it provides a GPU and two deep learning accelerators, which are used to raise the performance of deep learning applications in resource-constrained environments. Even when an embedded device offers heterogeneous processing elements, exploiting these diverse elements together to improve performance requires considerable effort. In this paper, we propose a technique that combines several existing methods with our network pipelining method to maximize the throughput of deep learning applications on the Xavier with its heterogeneous processing elements. Through several image classification examples and an object detection example, we confirmed performance improvements of up to 355% over the conventional approach that uses a single GPU.
English Abstract
Since the need for deep learning applications in embedded systems is increasing, processing elements other than the CPU are being added to embedded devices to accelerate those applications. The NVIDIA Jetson AGX Xavier (Xavier) is a representative example: besides an octa-core CPU, it has one powerful GPU and two deep learning accelerators to enhance the performance of deep learning inference in resource-constrained environments. Although such a device provides heterogeneous processing elements, using these diverse compute units together to increase performance takes considerable effort. In this paper, we propose a technique that combines several existing methods with our network pipelining method, designed to exploit the heterogeneous processing elements of the Xavier, to maximize the throughput of deep learning applications. Experiments with image classification and object detection examples showed up to a 355% improvement in frames per second (FPS) over the baseline using a single GPU.
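The throughput gain described in the abstract comes from overlapping the stages of a network across different processing elements instead of running the whole network on one device. The sketch below is a minimal illustration of that idea, not the authors' implementation: it assumes the network has already been partitioned into two sub-networks, represented here by the hypothetical stand-in functions stage_gpu and stage_dla, maps each to a worker thread standing in for a distinct processing element, and connects them with a bounded queue so that frame N+1 can start on the first stage while frame N is still on the second.

```python
# Minimal sketch of network pipelining across two processing elements.
# Assumptions: the model is already split into two sub-networks;
# stage_gpu() and stage_dla() are hypothetical placeholders for running
# each partition on the GPU and on a deep learning accelerator (DLA).

import queue
import threading
import time

def stage_gpu(frame):
    # Placeholder for the first sub-network (e.g., the backbone) on the GPU.
    time.sleep(0.010)                  # pretend inference latency: 10 ms
    return f"features({frame})"

def stage_dla(features):
    # Placeholder for the second sub-network (e.g., the head) on a DLA.
    time.sleep(0.010)                  # pretend inference latency: 10 ms
    return f"result({features})"

def pipelined(frames):
    """Overlap the two stages: while the DLA stage works on frame N,
    the GPU stage already processes frame N+1."""
    q = queue.Queue(maxsize=4)         # bounded buffer between the stages
    results = []

    def producer():
        for frame in frames:
            q.put(stage_gpu(frame))
        q.put(None)                    # sentinel: no more frames

    def consumer():
        while True:
            features = q.get()
            if features is None:
                break
            results.append(stage_dla(features))

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return results

if __name__ == "__main__":
    frames = [f"frame{i}" for i in range(100)]

    start = time.time()
    serial = [stage_dla(stage_gpu(f)) for f in frames]   # single-device baseline
    serial_time = time.time() - start

    start = time.time()
    piped = pipelined(frames)
    piped_time = time.time() - start

    print(f"serial:    {len(frames) / serial_time:.1f} FPS")
    print(f"pipelined: {len(frames) / piped_time:.1f} FPS")
```

In this toy setting the two stages have equal latency, so pipelining roughly doubles the throughput; in general the throughput is bounded by the slowest stage, which is why how the network is partitioned and which sub-network is assigned to which processing element matter on a platform like the Xavier, where the GPU, the DLAs, and the CPU cores have very different speeds.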
Keywords: deep learning, heterogeneous processors, pipelining, parallelization