JIPS (Korea Information Processing Society)

Title: Case-Related News Filtering via Topic-Enhanced Positive-Unlabeled Learning
Authors: Guanwen Wang, Zhengtao Yu, Yantuan Xian, Yu Zhang
Citation: Vol. 17, No. 6, pp. 1057-1070 (December 2021)
Abstract:
Case-related news filtering is crucial in legal text mining; it divides news into case-related and case-unrelated categories. Because case-related news originates from various fields and is written in different styles, it is difficult to establish complete filtering rules or keywords for data collection. In addition, the labeled corpus for case-related news is sparse; therefore, training a high-performance classification model requires annotating the corpus. To address this challenge, we propose topic-enhanced positive-unlabeled learning, which selects positive and negative samples guided by topics. Specifically, a topic model based on a variational autoencoder (VAE) is trained to extract topics from unlabeled samples. Using these topics in the iterative process of positive-unlabeled (PU) learning improves the accuracy of identifying case-related news. The experimental results show that the F1 value of our method on the test set is 1.8% higher than that of the PU learning baseline model. In addition, our method is more robust with few initial samples and many iterations, and compared with advanced PU learning baselines such as nnPU and I-PU, it obtains a 1.1% higher F1 value, which indicates that our method can effectively identify case-related news.
Keywords: Case-Related News, Iterative Training, Positive-Unlabeled Learning, Topic
Attachment: PDF download
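
The abstract outlines an iterative PU learning procedure in which topics extracted by a VAE-based topic model guide the selection of positive and negative samples. The Python sketch below is only a rough, hypothetical illustration of that idea, not the authors' implementation: it substitutes scikit-learn's LatentDirichletAllocation for the VAE topic model, and the function name topic_guided_pu, the cosine-similarity selection rule, and the equal-weight score mixing are all assumptions.

import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

def topic_guided_pu(positive_docs, unlabeled_docs, n_topics=10, n_iters=5):
    """Hypothetical sketch of topic-guided iterative PU learning (not the paper's code)."""
    # Bag-of-words features and a topic model fitted on all available text;
    # the paper uses a VAE topic model, approximated here with LDA.
    vec = CountVectorizer(max_features=5000, stop_words="english")
    X_all = vec.fit_transform(list(positive_docs) + list(unlabeled_docs))
    topics = LatentDirichletAllocation(n_components=n_topics,
                                       random_state=0).fit_transform(X_all)

    n_pos = len(positive_docs)
    X_pos, X_unl = X_all[:n_pos], X_all[n_pos:]
    t_pos, t_unl = topics[:n_pos], topics[n_pos:]

    # Topic similarity of each unlabeled document to the mean positive topic profile.
    sim = cosine_similarity(t_unl, t_pos.mean(axis=0, keepdims=True)).ravel()

    # Initial "reliable negatives": unlabeled documents least similar to positive topics.
    k = max(1, len(unlabeled_docs) // 10)
    neg_idx = np.argsort(sim)[:k]

    clf = LogisticRegression(max_iter=1000)
    for _ in range(n_iters):
        # Train on labeled positives plus the current reliable negatives.
        X_train = np.vstack([X_pos.toarray(), X_unl[neg_idx].toarray()])
        y_train = np.concatenate([np.ones(n_pos), np.zeros(len(neg_idx))])
        clf.fit(X_train, y_train)

        # Re-score unlabeled documents, letting topic evidence keep guiding the
        # selection of reliable negatives in each iteration (equal weights assumed).
        scores = 0.5 * clf.predict_proba(X_unl)[:, 1] + 0.5 * sim
        neg_idx = np.argsort(scores)[:k]

    return clf, vec

In such a sketch, the returned pair (clf, vec) could score new articles via clf.predict_proba(vec.transform(new_docs)); whether the score weights or the negative-selection ratio correspond to the paper would have to be checked against the original text.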