
KIPS Transactions on Computer and Communication Systems


Korean Title (translated): A Hybrid All-Reduce Technique that Overlaps Layers to Reduce Communication Overhead in Distributed Deep Learning
English Title: Hybrid All-Reduce Strategy with Layer Overlapping for Reducing Communication Overhead in Distributed Deep Learning
Author(s): Daehyun Kim, Sangho Yeo, Sangyoon Oh
Citation: Vol. 10, No. 7, pp. 191-198 (July 2021)
Korean Abstract (translated):
Distributed deep learning requires a process that synchronizes the parameters each node has updated locally. For effective parameter synchronization in distributed deep learning, this study proposes a technique that overlaps all-reduce communication with computation while taking the characteristics of each layer into account. Synchronization of the upper layers' parameters can be overlapped with communication/computation (training) time up to the point just before the next propagation step of the lower layers. In a typical deep learning model for image classification, the upper layers are convolution layers and the lower layers are fully-connected layers. Because convolution layers have fewer parameters than fully-connected layers and sit in the upper part of the model, the time available for overlapping their communication is short; considering this, it is effective to use butterfly all-reduce, which shortens network latency. When the allowed overlap time is longer, on the other hand, ring all-reduce, which makes better use of network bandwidth, is used. To verify the effectiveness of the proposed method, we applied it to the PyTorch platform, built an experimental environment on top of it, and evaluated performance across batch sizes. The experiments confirmed that the training time of the proposed technique was reduced by up to 33% compared to the existing PyTorch approach.
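The overlapping described above can be pictured as firing each layer's all-reduce asynchronously the moment that layer's gradients are ready during back-propagation, so that the upper layers' communication hides behind the lower layers' backward computation. The sketch below shows one way to wire this up in PyTorch; the function name attach_overlapped_allreduce, the hook-based structure, and the PyTorch >= 2.1 requirement (for register_post_accumulate_grad_hook) are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.distributed as dist
import torch.nn as nn

def attach_overlapped_allreduce(model: nn.Module):
    """Launch an asynchronous all-reduce for each parameter as soon as its
    gradient has been accumulated during backward(), so communication for the
    upper layers overlaps with the backward computation of the lower layers.
    Illustrative sketch; assumes torch.distributed is already initialized and
    PyTorch >= 2.1 (for register_post_accumulate_grad_hook)."""
    pending = []  # in-flight (work handle, parameter) pairs

    def hook(param: torch.Tensor) -> None:
        # async_op=True returns a work handle immediately, letting autograd
        # continue with the next (lower) layer's backward pass.
        work = dist.all_reduce(param.grad, op=dist.ReduceOp.SUM, async_op=True)
        pending.append((work, param))

    for param in model.parameters():
        if param.requires_grad:
            param.register_post_accumulate_grad_hook(hook)

    def finish() -> None:
        # Call after loss.backward() and before optimizer.step(): wait for
        # every outstanding all-reduce, then turn the sums into averages.
        world = dist.get_world_size()
        for work, param in pending:
            work.wait()
            param.grad.div_(world)
        pending.clear()

    return finish
```

A training loop would call finish = attach_overlapped_allreduce(model) once after building the model, and then finish() between loss.backward() and optimizer.step() in every iteration.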
English Abstract:
Since training datasets have become large and models are getting deeper to achieve high accuracy in deep learning, training a deep neural network requires a great deal of computation and takes too much time on a single node. Distributed deep learning has therefore been proposed to reduce the training time by spreading the computation across multiple nodes. In this study, we propose a hybrid all-reduce strategy that considers the characteristics of each layer, together with a communication and computation overlapping technique, for the synchronization of distributed deep learning. Since the convolution layers have fewer parameters than the fully-connected layers and are located in the upper part of the model, only a short overlapping time is allowed, so butterfly all-reduce is used to synchronize them. The fully-connected layers, on the other hand, are synchronized using ring all-reduce. Empirical results on PyTorch show that the proposed scheme reduces the training time by up to 33% compared to the PyTorch baseline.
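The hybrid choice the abstract describes is made per layer type: a latency-oriented butterfly exchange for the small convolution-layer gradients that must finish quickly, and a bandwidth-oriented ring all-reduce for the large fully-connected gradients. Below is a minimal sketch of a textbook recursive-doubling (butterfly) all-reduce built on torch.distributed point-to-point calls, plus a simple per-layer dispatch rule; the power-of-two world-size assumption, the function names, and the fallback to the backend's default collective for the ring case are illustrative assumptions, not the paper's code.

```python
import torch
import torch.distributed as dist
import torch.nn as nn

def butterfly_allreduce_(tensor: torch.Tensor) -> torch.Tensor:
    """Textbook recursive-doubling (butterfly) all-reduce, in place.
    Completes in log2(P) exchange steps, so its latency term is small,
    which suits the small convolution-layer gradients.
    Assumes the world size is a power of two."""
    rank, world = dist.get_rank(), dist.get_world_size()
    buf = torch.empty_like(tensor)
    step = 1
    while step < world:
        partner = rank ^ step              # peer whose rank differs in this bit
        send_req = dist.isend(tensor, partner)
        dist.recv(buf, partner)            # both sides posted isend first, so no deadlock
        send_req.wait()
        tensor.add_(buf)                   # fold the partner's partial sum in
        step <<= 1
    return tensor

def hybrid_allreduce_grads(model: nn.Module) -> None:
    """Dispatch each layer's gradient to an algorithm chosen by layer type:
    butterfly for convolution layers (latency-bound), a ring-style collective
    for fully-connected layers (bandwidth-bound). Here the ring case simply
    falls back to the backend's default all_reduce."""
    world = dist.get_world_size()
    for module in model.modules():
        for param in module.parameters(recurse=False):
            if param.grad is None:
                continue
            if isinstance(module, nn.Conv2d):
                butterfly_allreduce_(param.grad)
            else:
                dist.all_reduce(param.grad, op=dist.ReduceOp.SUM)
            param.grad.div_(world)
```

A full ring implementation would chunk each fully-connected gradient and circulate the chunks around a logical ring to saturate bandwidth; the default all_reduce stands in for it here to keep the sketch short.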
Keywords: Distributed Deep Learning, Synchronization, Layer Overlapping, All-Reduce
File Attachment: PDF download