KIPS Transactions on Software and Data Engineering
Korean Title |
A Deep Learning-Based Over-Sampling Technique for Imbalanced Data Classification |
English Title |
A Deep Learning Based Over-Sampling Scheme for Imbalanced Data Classification |
Author |
손민재
정승원
황인준
Son Min Jae
Jung Seung Won
Hwang Een Jun
|
Citation |
VOL 08 NO. 07 PP. 0311 ~ 0316 (2019. 07) |
Çѱ۳»¿ë (Korean Abstract) |
ºÐ·ù ¹®Á¦´Â ÁÖ¾îÁø ÀÔ·Â µ¥ÀÌÅÍ¿¡ ´ëÇØ ÇØ´ç µ¥ÀÌÅÍÀÇ Å¬·¡½º¸¦ ¿¹ÃøÇÏ´Â ¹®Á¦·Î, ÀÚÁÖ ¾²ÀÌ´Â ¹æ¹ý ÁßÀÇ Çϳª´Â ÁÖ¾îÁø µ¥ÀÌÅͼÂÀ» »ç¿ëÇÏ¿© ±â°èÇнÀ ¾Ë°í¸®ÁòÀ» ÇнÀ½ÃÅ°´Â °ÍÀÌ´Ù. ÀÌ·± °æ¿ì ºÐ·ùÇÏ°íÀÚ Çϴ Ŭ·¡½º¿¡ µû¸¥ µ¥ÀÌÅÍÀÇ ºÐÆ÷°¡ ±ÕÀÏÇÑ µ¥ÀÌÅͼÂÀÌ ÀÌ»óÀûÀÌÁö¸¸, ºÒ±ÕÇüÇÑ ºÐÆ÷¸¦ °¡Áö°í °æ¿ì Á¦´ë·Î ºÐ·ùÇÏÁö ¸øÇÏ´Â ¹®Á¦°¡ ¹ß»ýÇÑ´Ù. ÀÌ·¯ÇÑ ¹®Á¦¸¦ ÇØ°áÇϱâ À§ÇØ º» ³í¹®¿¡¼´Â Conditional Generative Adversarial Networks(CGAN)À» È°¿ëÇÏ¿© µ¥ÀÌÅÍ ¼öÀÇ ±ÕÇüÀ» ¸ÂÃß´Â ¿À¹ö»ùÇøµ ±â¹ýÀ» Á¦¾ÈÇÑ´Ù. CGANÀº Generative Adversarial Networks(GAN)¿¡¼ ÆÄ»ýµÈ »ý¼º ¸ðµ¨·Î, µ¥ÀÌÅÍÀÇ Æ¯Â¡À» ÇнÀÇÏ¿© ½ÇÁ¦ µ¥ÀÌÅÍ¿Í À¯»çÇÑ µ¥ÀÌÅ͸¦ »ý¼ºÇÒ ¼ö ÀÖ´Ù. µû¶ó¼ CGANÀÌ µ¥ÀÌÅͼö°¡ ÀûÀº Ŭ·¡½ºÀÇ µ¥ÀÌÅ͸¦ ÇнÀÇÏ°í »ý¼ºÇÔÀ¸·Î½á ºÒ±ÕÇüÇÑ Å¬·¡½º ºñÀ²À» ¸ÂÃß¾î ÁÙ ¼ö ÀÖÀ¸¸ç, ±×¿¡ µû¶ó ºÐ·ù ¼º´ÉÀ» ³ôÀÏ ¼ö ÀÖ´Ù. ½ÇÁ¦ ¼öÁýµÈ µ¥ÀÌÅ͸¦ ÀÌ¿ëÇÑ ½ÇÇèÀ» ÅëÇØ CGANÀ» È°¿ëÇÑ ¿À¹ö»ùÇøµ ±â¹ýÀÌ È¿°ú°¡ ÀÖÀ½À» º¸ÀÌ°í ±âÁ¸ ¿À¹ö»ùÇøµ ±â¹ýµé°ú ºñ±³ÇÏ¿© ±âÁ¸±â¹ýµéº¸´Ù ¿ì¼öÇÔÀ» ÀÔÁõÇÏ¿´´Ù.
|
English Abstract |
Classification is the task of predicting the class to which an input sample belongs. One of the most popular approaches is to train a machine learning algorithm on a given dataset. Such training works best when the dataset has a well-balanced class distribution; when the class distribution is imbalanced, classification performance can be very poor. To overcome this problem, we propose an over-sampling scheme that balances the number of samples per class by using Conditional Generative Adversarial Networks (CGAN). CGAN is a generative model derived from Generative Adversarial Networks (GAN) that can learn the characteristics of data and generate data similar to real data. CGAN can therefore generate data for a class that has few samples, which mitigates the problem caused by the imbalanced class distribution and improves classification performance. Experiments on real collected data show that over-sampling with CGAN is effective and that it outperforms existing over-sampling techniques.
|
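The balancing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `oversample_with_generator`, the `generate(label, n)` interface, and the choice to top up every class to the size of the largest class are all assumptions made for illustration. In the paper's setting, `generate` would wrap a trained CGAN generator conditioned on the class label; here a trivial placeholder stands in for it.

```python
from collections import Counter
from typing import Callable, Hashable, List, Sequence, Tuple

Sample = Sequence[float]


def oversample_with_generator(
    samples: List[Sample],
    labels: List[Hashable],
    generate: Callable[[Hashable, int], List[Sample]],
) -> Tuple[List[Sample], List[Hashable]]:
    """Top up every class to the size of the largest class.

    `generate(label, n)` is a hypothetical interface: it must return n
    synthetic samples for the given class label (for example, by sampling
    a trained conditional generator).
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    if not labels:
        return [], []

    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class

    balanced_samples = list(samples)
    balanced_labels = list(labels)
    for label, count in counts.items():
        deficit = target - count
        if deficit == 0:
            continue  # already at the majority-class size
        synthetic = generate(label, deficit)
        if len(synthetic) != deficit:
            raise ValueError("generator returned the wrong number of samples")
        balanced_samples.extend(synthetic)
        balanced_labels.extend([label] * deficit)
    return balanced_samples, balanced_labels


def placeholder_generate(label: Hashable, n: int) -> List[Sample]:
    """Placeholder only: returns constant vectors instead of CGAN output."""
    return [[0.0, 0.0] for _ in range(n)]


if __name__ == "__main__":
    X = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.9, 0.8]]
    y = ["majority", "majority", "majority", "minority"]
    Xb, yb = oversample_with_generator(X, y, placeholder_generate)
    print(Counter(yb))
```

Keeping the generator behind a plain callable means the same balancing logic can be reused with any over-sampling source, which is convenient when comparing a CGAN against other over-sampling techniques.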
Keyword |
Imbalanced Data
CGAN
Deep Learning
Over-Sampling
|