Journal of KIISE (정보과학회논문지)

Korean Title: 음향신호 압축을 위한 심층망 구성과 종단간 학습 (Deep Network Construction and End-to-End Learning for Audio Signal Compression)
English Title: Deep Neural Networks and End-to-End Learning for Audio Compression
Authors: Daniela N. Rim (다니엘라 림), Inseon Jang (장인선), Heeyoul Choi (최희열)
Citation: Vol. 48, No. 8, pp. 940-946 (August 2021)
Korean Abstract (translated)
Recent results with single end-to-end deep learning models have made it possible to handle highly structured data with one unified model. However, training a single deep learning model to compress audio signals has been difficult because it requires a discrete internal representation of the signal. This paper presents a single-model deep network and training method that combines recurrent neural networks (RNNs) within the training strategy of a variational autoencoder whose latent space has a discrete representation. The proposed method uses a reparameterization technique for the Bernoulli distribution to enable backpropagation through the discrete representation. As a result, the encoder and decoder can be separated, which is essential for practical audio compression. To the best of our knowledge, the proposed model is the first implementation of single-model learning with RNNs for audio compression, and it achieves a signal-to-distortion ratio (SDR) of 20.53 dB.
English Abstract
Recent advances in end-to-end deep learning have encouraged the exploration of tasks dealing with highly structured data using unified deep network models. Designing such models for compressing audio signals has been a challenge because compression requires discrete representations, which are not easy to train with end-to-end backpropagation. In this paper, we present an end-to-end deep learning approach that combines recurrent neural networks (RNNs) within the training strategy of variational autoencoders (VAEs) with a binary representation of the latent space. We apply a reparameterization trick for the Bernoulli distribution to the discrete representations, which allows smooth backpropagation. In addition, our approach enables the separation of the encoder and decoder, which is necessary for compression tasks. To the best of our knowledge, this is the first end-to-end learning approach for a single audio compression model with RNNs, and our model achieves a signal-to-distortion ratio (SDR) of 20.53 dB.
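
As an illustration of two ideas named in the abstract, the following minimal Python sketch shows one commonly used continuous relaxation of a Bernoulli sample (the binary Concrete, or logistic-noise, relaxation) and a common energy-ratio definition of SDR. Both are assumptions made for illustration: the abstract does not state which reparameterization or which SDR definition the authors use, and the names `relaxed_bernoulli`, `sdr_db`, and `temperature` are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    ex = math.exp(x)
    return ex / (1.0 + ex)


def relaxed_bernoulli(p, temperature, rng):
    """Draw a relaxed Bernoulli(p) sample in [0, 1].

    Assumption: this is the binary Concrete relaxation, not necessarily
    the trick used in the paper. The randomness comes from u, so the
    output is a smooth function of p, which is what lets gradients flow.
    """
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep logs finite
    logistic_noise = math.log(u) - math.log1p(-u)
    logit_p = math.log(p) - math.log1p(-p)
    return sigmoid((logit_p + logistic_noise) / temperature)


def sdr_db(reference, estimate):
    """SDR in dB as 10*log10(signal energy / error energy).

    Assumption: a plain energy-ratio definition; the paper may use a
    different variant.
    """
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf
    return 10.0 * math.log10(signal_energy / error_energy)


if __name__ == "__main__":
    rng = random.Random(0)
    # Low temperatures push samples toward 0 or 1, approximating hard bits.
    print([round(relaxed_bernoulli(0.9, 0.1, rng), 3) for _ in range(5)])
    print(sdr_db([1.0, 0.0, -1.0], [0.9, 0.0, -0.9]))
```

At inference time a codec of this kind would typically threshold the relaxed values to hard bits, so the decoder can run separately on the transmitted bit string; that thresholding step is likewise an assumption here, not a detail taken from the abstract.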
Keywords: audio compression, end-to-end learning, discrete latent space, variational autoencoder