Towards Document Plagiarism Detection Based on the Relevance and Fragmentation of the Reused Text

Abstract

Traditionally, external plagiarism detection has been carried out by identifying and measuring the similar sections between a given pair of documents, known as the source and suspicious documents. One of the main difficulties of this task lies in the fact that not all similar text sections are examples of plagiarism, since thematic coincidences also tend to produce portions of common text. To address this problem, we propose to represent the common (possibly reused) text by means of a set of features that denote its relevance and fragmentation. This new representation, used in conjunction with supervised learning algorithms, provides more evidence for the automatic detection of document plagiarism; in particular, our experimental results show that it clearly outperforms traditional n-gram based approaches in accuracy.
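
The abstract does not list the actual features, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that "relevance" can be approximated by the share of the suspicious document covered by common text, and "fragmentation" by the number and length of the common fragments. The function names, the `min_len` threshold, and the use of Python's `difflib` for finding common word sequences are all illustrative assumptions.

```python
from difflib import SequenceMatcher


def common_fragments(suspicious, source, min_len=2):
    """Return word sequences shared by both texts (at least min_len words).

    Assumption: difflib's non-overlapping matching blocks stand in for
    whatever common-text extraction the paper actually uses.
    """
    a = suspicious.lower().split()
    b = source.lower().split()
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        a[m.a:m.a + m.size]
        for m in matcher.get_matching_blocks()
        if m.size >= min_len
    ]


def reuse_features(suspicious, source, min_len=2):
    """Hypothetical relevance and fragmentation features for a document pair."""
    frags = common_fragments(suspicious, source, min_len)
    n_susp = len(suspicious.split())
    shared = sum(len(f) for f in frags)
    return {
        # Relevance proxy: fraction of the suspicious text that is shared.
        "relevance": shared / n_susp if n_susp else 0.0,
        # Fragmentation proxies: how many pieces, and how long they are.
        "num_fragments": len(frags),
        "mean_fragment_len": shared / len(frags) if frags else 0.0,
        "max_fragment_len": max((len(f) for f in frags), default=0),
    }


if __name__ == "__main__":
    susp = "the quick brown fox jumps over the lazy dog"
    src = "a quick brown fox sat near the lazy dog"
    print(reuse_features(susp, src))
```

A feature dictionary of this kind, computed for each source and suspicious pair, could then be passed to any supervised classifier trained on labeled plagiarized and non-plagiarized pairs, which is the role the abstract assigns to the learning algorithm.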


Bibliographic record

Source
Mexican International Conference on Artificial Intelligence | 2010 | 8 pages
Authors
Fernando Sanchez-Vega; Luis Villasenor-Pineda; Manuel Montes-y-Gomez; Paolo Rosso
Original format: PDF
Chinese Library Classification: TP18-53
Keywords
Plagiarism detection; Text reuse; Supervised classification

Similar literature


1. Plagiarism Detection System for Indonesia Text Based Document by Fingerprint Method and Natural Language Processing Approach [J] . Advanced Science Letters . 2016, No. 10

2. Hybrid Segmentation Prototype for Arabic Text-Based Documents: Towards Plagiarism Detection [J] . Sonia Alouane-Ksouri, Minyar Sassi Hidri, International Journal of Service Science, Management, Engineering, and Technology . 2015, No. 1

3. Determining and characterizing the reused text for plagiarism detection [J] . Fernando Sanchez-Vega, Esau Villatoro-Tello, Manuel Montes-y-Gomez, Expert Systems with Applications . 2013, No. 5

4. Towards Document Plagiarism Detection Based on the Relevance and Fragmentation of the Reused Text [C] . Fernando Sanchez-Vega, Luis Villasenor-Pineda, Manuel Montes-y-Gomez, MICAI 2010; Mexican International Conference on Artificial Intelligence . 2010

5. Mono- and Cross-Lingual Paraphrased Text Reuse and Extrinsic Plagiarism Detection [D] . Sharjeel, Muhammad. 2020

6. Intelligent Bar Chart Plagiarism Detection in Documents [O] . Mohammed Mumtaz Al-Dabbagh, Naomie Salim, Amjad Rehman

7. Towards Document Plagiarism Detection Based on the Relevance and Fragmentation of the Reused Text [O] . O Sánchez-Vega, Luis Villaseñor-Pineda, Manuel Montes-y-Gómez, 2010

