TIIS (Korean Society for Internet Information)

Title: A Novel Feature Selection Method in the Categorization of Imbalanced Textual Data
Authors: Jafar Pouramini, Behrouze Minaei-Bidgoli, Mahdi Esmaeili
Citation: Vol. 12, No. 8, pp. 3725-3748 (August 2018)
Abstract
Text data is often imbalanced across classes. Class imbalance is one of the main challenges in text classification because it degrades classifier performance. Many studies have addressed this problem, and the proposed solutions fall into several general categories, including sampling-based and algorithm-based methods. Recent studies have also considered feature selection as a solution to the imbalance problem. This paper presents a novel one-sided feature selection method, probabilistic feature selection (PFS), for imbalanced text classification. PFS is a probabilistic method computed from the distribution of features. Compared with similar methods, PFS has more parameters. To evaluate the proposed method, the feature selection methods Gini, MI, FAST and DFS were implemented for comparison, and the C4.5 decision tree and Naive Bayes classifiers were used. Tests on the Reuters-21578 and WebKB datasets, measured by F-measure, suggest that the proposed feature selection method significantly improves classifier performance.
Keywords: Feature selection, Imbalanced class, High dimensionality, Text classification
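
The abstract names MI (mutual information) as one of the baseline feature selection methods but does not give the PFS formula, so PFS itself is not reproduced here. As a rough illustration of how a baseline such as MI scores terms on an imbalanced corpus, the following is a minimal sketch. The function names, the toy terms and all document counts are assumptions made up for illustration and are not taken from the paper.

```python
import math


def mutual_information(n11, n10, n01, n00):
    """Mutual information (in bits) between term presence and class membership.

    n11: documents in the class that contain the term
    n10: documents outside the class that contain the term
    n01: documents in the class that lack the term
    n00: documents outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    # Each cell is paired with its term marginal and its class marginal.
    cells = [
        (n11, n11 + n10, n11 + n01),
        (n10, n11 + n10, n10 + n00),
        (n01, n01 + n00, n11 + n01),
        (n00, n01 + n00, n10 + n00),
    ]
    mi = 0.0
    for joint, term_total, class_total in cells:
        if joint > 0:  # an empty cell contributes nothing (0 * log 0 := 0)
            mi += (joint / n) * math.log2(joint * n / (term_total * class_total))
    return mi


def rank_terms(counts):
    """Return terms sorted by descending MI; counts maps term -> (n11, n10, n01, n00)."""
    return sorted(counts, key=lambda t: mutual_information(*counts[t]), reverse=True)


# Hypothetical imbalanced corpus: 10 minority-class documents and 990 others.
toy_counts = {
    "merger": (8, 5, 2, 985),  # concentrated in the minority class
    "said": (9, 900, 1, 90),   # frequent in both classes
    "the": (10, 990, 0, 0),    # present in every document
}
ranking = rank_terms(toy_counts)
```

With these made-up counts, "merger" ranks first because it is concentrated in the minority class, while "the" scores zero because it occurs in every document and so carries no information about the class.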