
TIIS (Korean Society for Internet Information)


Title: A Novel Feature Selection Method in the Categorization of Imbalanced Textual Data
Authors: Jafar Pouramini, Behrouze Minaei-Bidgoli, Mahdi Esmaeili
Citation: Vol. 12, No. 8, pp. 3725-3748 (August 2018)
Abstract:
Text data is often imbalanced across classes. Class imbalance is one of the main challenges in text classification because it degrades classifier performance. Many studies have addressed this problem, and the proposed solutions fall into several general categories, including sampling-based and algorithm-based methods. Recent studies have also considered feature selection as a remedy for the imbalance problem. This paper presents a novel one-sided feature selection method, probabilistic feature selection (PFS), for imbalanced text classification. PFS is a probabilistic measure computed from the distribution of features. Compared to similar methods, PFS has more parameters. To evaluate the proposed method, the feature selection methods Gini, MI, FAST and DFS were implemented for comparison, and the C4.5 decision tree and Naive Bayes classifiers were used. Tests on the Reuters-21578 and WebKB datasets, measured by F-measure, suggest that the proposed feature selection method significantly improves classifier performance.
Keywords: Feature selection, Imbalanced class, High dimensionality, Text classification
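
The abstract does not give the PFS formula, so the sketch below does not attempt to reproduce it. It illustrates only two ingredients the abstract names: a count-based feature score (here the common pointwise form of MI, one of the baselines) and the F-measure used for evaluation. The function names, argument names, and toy document counts are assumptions made for illustration and are not taken from the paper.

```python
import math


def mi_score(a, b, c, n):
    """Pointwise mutual information between a term and a class.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that do not contain the term
    n: total number of documents
    (Hypothetical argument names; this is the MI baseline, not PFS.)
    """
    if a == 0:
        # The term never occurs in the class, so the log is undefined.
        return float("-inf")
    return math.log((a * n) / ((a + c) * (a + b)))


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall for one class."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy corpus (made-up counts): 1000 documents, 50 in the minority class.
topical = mi_score(a=40, b=10, c=10, n=1000)    # term concentrated in the class
generic = mi_score(a=25, b=475, c=25, n=1000)   # term spread evenly everywhere

# A classifier that never predicts the minority class is 95% accurate here,
# yet its F-measure on that class is zero.
always_majority = f_measure(tp=0, fp=0, fn=50)
```

A one-sided selector in this style would keep the terms with the highest positive scores for the minority class. The last line shows why the evaluation relies on F-measure: accuracy alone would hide a classifier that ignores the minority class entirely.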