KIISE Transactions on Computing Practices

Title: Pitch Classification Based on Bidirectional LSTM with Probabilistic Attention for Speech Segregation from Speech-Music Mixtures
Authors: Han-Gyu Kim, Gil-Jin Jang, Jeong-Sik Park, Yung-Hwan Oh, Ho-Jin Choi
Citation: Vol. 25, No. 4, pp. 223-230 (April 2019)
Çѱ۳»¿ë
(Korean Abstract)
Sub-band masking ±â¹Ý ´ÜÀÏä³Î À½¼ººÐ¸®¿¡¼­´Â À½¼ºÇÇÄ¡¸¦ ÃßÁ¤ÇÏ¿© ÃßÁ¤µÈ ÇÇÄ¡¿Í ÀÏÄ¡ÇÏ´Â ÁÖÆļö ¿¡³ÊÁö¸¸ Åë°ú½ÃÅ°´Â ÇÊÅ͸¦ »ç¿ëÇÏ¿© ¹è°æ ÀâÀ½À¸·ÎºÎÅÍ À½¼ºÀ» ºÐ¸®ÇÑ´Ù. À½¼º°ú À½¾ÇÀº ºñ½ÁÇÑ Çϸð´Ð ±¸Á¶¸¦ °¡Áö°í ÀÖ¾î, À½¾ÇÀÌ ÀâÀ½À¸·Î ÀÔ·ÂµÉ °æ¿ì ÃßÁ¤µÈ ÇÇÄ¡¿¡ À½¼º ÇÇÄ¡¿Í À½¾Ç ÇÇÄ¡°¡ °øÁ¸ÇÏ°Ô µÇ¸ç, ÀÌ´Â À½¼ººÐ¸®ÀÇ ¼º´ÉÇ϶ôÀ¸·Î ¿¬°áµÈ´Ù. µû¶ó¼­ À½¼º-À½¾Ç È¥Àç µ¥ÀÌÅÍ¿¡¼­ÀÇ È¿°úÀûÀÎ À½¼ººÐ¸®¸¦ À§ÇØ À½¼º ÇÇÄ¡¿Í À½¾Ç ÇÇÄ¡¸¦ ºÐ·ùÇØ¾ß ÇÑ´Ù. º» ¿¬±¸¿¡¼­´Â ¾ç¹æÇâ LSTMÀ» »ç¿ëÇÏ´Â À½¼º/À½¾Ç ÇÇÄ¡ ºÐ·ù ¹æ¹ýÀ» Á¦¾ÈÇÏ¿´À¸¸ç, ¾ç¹æÇâ LSTMÀÇ ¼º´ÉÀ» Çâ»ó½ÃÅ°±â À§Çؼ­ È®·üÀû ¾îÅÙ¼Ç ·¹ÀÌ¾î ±¸Á¶¸¦ Á¦¾ÈÇÏ¿´´Ù. ¶ÇÇÑ ÇÇÄ¡ ºÐ·ù °á°ú·ÎºÎÅÍ ÀÚ¿¬½º·¯¿î À½¼ººÐ¸® °á°ú¸¦ ¾ò±â À§ÇØ À½¾Ç ¿¡³ÊÁö°¡ Á¦°ÅµÈ À½¼ººÐ¸® ¸¶½ºÅ© »ý¼º ±â¹ýÀ» Á¦¾ÈÇÏ¿´´Ù. ½ÇÇè°á°ú È®·üÀû ¾îÅÙ¼Ç ±â¹Ý ¾ç¹æÇâ LSTMÀÌ ´Ù¸¥ ¹æ¹ý¿¡ ºñÇØ ´õ ÁÁÀº À½¼ººÐ¸® ¼º´ÉÀ» º¸¿©ÁÖ¾ú´Ù.
English Abstract
Speech segregation based on sub-band masking extracts speech from an audio mixture by estimating the speech pitch and retaining only the signal components consistent with that pitch. Because speech and music exhibit similar harmonic structures, the estimated pitch contains both speech pitch and music pitch when the input is a speech-music mixture, which degrades segregation performance. To overcome this limitation, we propose pitch classification using a bidirectional LSTM, along with a probabilistic attention layer that improves the bidirectional LSTM. We also propose removing musical energy during segregation mask generation, so that pitch classification yields natural-sounding segregated speech. Experimental results show that the proposed pitch classification using a bidirectional LSTM with probabilistic attention outperforms other speech segregation methods.
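
The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the general idea it describes: discard pitch estimates attributed to music, then build a mask that passes only frequency bins near harmonics of the remaining speech pitch. The function names, the 0.5 threshold, and the tolerance in Hz are illustrative assumptions, and the per-frame speech probabilities merely stand in for the output of the paper's bidirectional LSTM classifier, which is not reproduced here.

```python
import numpy as np


def keep_speech_pitch(f0_hz, p_speech, threshold=0.5):
    """Zero out pitch estimates that a classifier attributes to music.

    p_speech is a hypothetical per-frame probability that the pitch belongs
    to speech; it stands in for a classifier output (assumption).
    """
    f0 = np.asarray(f0_hz, dtype=float)
    p = np.asarray(p_speech, dtype=float)
    return np.where(p >= threshold, f0, 0.0)


def harmonic_mask(f0_hz, n_bins, sample_rate, tol_hz=40.0):
    """Binary time-frequency mask that keeps bins near harmonics of f0.

    f0_hz: per-frame pitch in Hz; 0 marks frames with no speech pitch.
    Returns an array of shape (n_frames, n_bins) holding 0.0 or 1.0.
    """
    f0 = np.asarray(f0_hz, dtype=float)
    # Centre frequency of each bin, from 0 Hz up to the Nyquist frequency.
    bin_hz = np.linspace(0.0, sample_rate / 2.0, n_bins)
    mask = np.zeros((f0.size, n_bins))
    for t, pitch in enumerate(f0):
        if pitch <= 0.0:
            continue  # no speech pitch in this frame: pass nothing
        # Index of the nearest harmonic (k >= 1) for every bin.
        k = np.maximum(np.round(bin_hz / pitch), 1.0)
        mask[t] = np.abs(bin_hz - k * pitch) <= tol_hz
    return mask


# Toy input: three frames, the middle one classified as music.
f0 = keep_speech_pitch([200.0, 330.0, 200.0], [0.9, 0.2, 0.8])
mask = harmonic_mask(f0, n_bins=41, sample_rate=8000, tol_hz=10.0)
```

Multiplying such a mask element-wise with a magnitude spectrogram of the same shape would suppress energy away from the speech harmonics. A binary mask is the simplest choice for illustration; the mask generation with musical energy removal proposed in the paper is a separate technique that this sketch does not attempt to reproduce.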
Keywords: speech segregation, pitch classification, bidirectional LSTM, probabilistic attention