• Àüü
  • ÀüÀÚ/Àü±â
  • Åë½Å
  • ÄÄÇ»ÅÍ
´Ý±â

»çÀÌÆ®¸Ê

Loading..

Please wait....

¿µ¹® ³í¹®Áö

Ȩ Ȩ > ¿¬±¸¹®Çå > ¿µ¹® ³í¹®Áö > TIIS (Çѱ¹ÀÎÅͳÝÁ¤º¸ÇÐȸ)

TIIS (Çѱ¹ÀÎÅͳÝÁ¤º¸ÇÐȸ)

Current Result Document :

ÇѱÛÁ¦¸ñ(Korean Title) The Sequence Labeling Approach for Text Alignment of Plagiarism Detection
¿µ¹®Á¦¸ñ(English Title) The Sequence Labeling Approach for Text Alignment of Plagiarism Detection
ÀúÀÚ(Author) Leilei Kong   Zhongyuan Han   Haoliang Qi  
¿ø¹®¼ö·Ïó(Citation) VOL 13 NO. 09 PP. 4814 ~ 4832 (2019. 09)
Çѱ۳»¿ë
(Korean Abstract)
¿µ¹®³»¿ë
(English Abstract)
Plagiarism detection is increasingly exploiting text alignment. Text alignment involves extracting the plagiarism passages in a pair of the suspicious document and its source document. The heuristics have achieved excellent performance in text alignment. However, the further improvements of the heuristic methods mainly depends more on the experiences of experts, which makes the heuristics lack of the abilities for continuous improvements. To address this problem, machine learning maybe a proper way. Considering the position relations and the context of text segments pairs, we formalize the text alignment task as a problem of sequence labeling, improving the current methods at the model level. Especially, this paper proposes to use the probabilistic graphical model to tag the observed sequence of pairs of text segments. Hence we present the sequence labeling approach for text alignment in plagiarism detection based on Conditional Random Fields. The proposed approach is evaluated on the PAN@CLEF 2012 artificial high obfuscation plagiarism corpus and the simulated paraphrase plagiarism corpus, and compared with the methods achieved the best performance in PAN@CLEF 2012, 2013 and 2014. Experimental results demonstrate that the proposed approach significantly outperforms the state of the art methods.
Å°¿öµå(Keyword) Plagiarism Detection   Text Alignment   Sequence Labeling   Probabilistic Graphical Model   Conditional Random Fields  
ÆÄÀÏ÷ºÎ PDF ´Ù¿î·Îµå