Evaluation of State-of-the-Art Paraphrase Identification and Its Application to Automatic Plagiarism Detection

Altheneyan Alaa; Menai Mohamed El Bachir

Abstract

Paraphrase identification is a natural language processing (NLP) problem that involves the determination of whether two text segments have the same meaning. Various NLP applications rely on a solution to this problem, including automatic plagiarism detection, text summarization, machine translation (MT), and question answering. The methods for identifying paraphrases found in the literature fall into two main classes: similarity-based methods and classification methods. This paper presents a critical study and an evaluation of existing methods for paraphrase identification and its application to automatic plagiarism detection. It presents the classes of paraphrase phenomena, the main methods, and the sets of features used by each particular method. All the methods and features used are discussed and enumerated in a table for easy comparison. Their performances on benchmark corpora are also discussed and compared via tables. Automatic plagiarism detection is presented as an application of paraphrase identification. The performances on benchmark corpora of existing plagiarism detection systems able to detect paraphrases are compared and discussed. The main outcome of this study is the identification of word overlap, structural representations, and MT measures as feature subsets that lead to the best performance results for support vector machines in both paraphrase identification and plagiarism detection on corpora. The performance results achieved by deep learning techniques highlight that these techniques are the most promising research direction in this field.
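
As an illustration of the similarity-based class of methods and of the word-overlap features mentioned in the abstract, the following is a minimal sketch, not the authors' method. It assumes a Jaccard overlap over lowercase word tokens and a fixed decision threshold; the function names (`tokens`, `word_overlap`, `is_paraphrase`) and the threshold value of 0.5 are hypothetical choices made only for this example.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_overlap(a, b):
    """Jaccard overlap between the token sets of two text segments."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        # Two empty segments are treated as identical (an assumption).
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_paraphrase(a, b, threshold=0.5):
    """Similarity-based decision: paraphrase if overlap reaches the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return word_overlap(a, b) >= threshold
```

A classification method would instead feed scores like `word_overlap` (alongside other features, such as the structural and MT measures the abstract names) to a trained classifier, for example a support vector machine, rather than comparing a single score against a fixed threshold.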

Bibliographic details

Source
International Journal of Pattern Recognition and Artificial Intelligence, 2020, No. 4, pp. 2053004.1-2053004.31 (31 pages)
Authors
Altheneyan Alaa; Menai Mohamed El Bachir
Author affiliation
King Saud Univ, Dept Comp Sci, Coll Comp & Informat Sci, POB 89638, Riyadh, Saudi Arabia (both authors)
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Paraphrase identification; similarity-based methods; classification methods; deep learning; automatic plagiarism detection;

Similar documents

1. Plagiarism Meets Paraphrasing: Insights for the Next Generation in Automatic Plagiarism Detection [J]. Alberto Barrón-Cedeño, Marta Vila, M. Antònia Martí. Computational Linguistics, 2013, No. 4.
2. Evaluating the state-of-the-art in automatic de-identification [J]. Uzuner O, Luo Y, Szolovits P. Journal of the American Medical Informatics Association, 2007, No. 5.
3. Automatic player detection and identification for sports entertainment applications [J]. Mahmood Zahid, Ali Tauseef, Khattak Shahid. Pattern Analysis and Applications, 2015, No. 4.
4. A Graph Based Automatic Plagiarism Detection Technique to Handle Artificial Word Reordering and Paraphrasing [C]. Niraj Kumar. International Conference on Intelligent Text Processing and Computational Linguistics, 2014.
5. Automatic speech code identification with application to tampering detection of speech recordings [D]. Zhou, Jingting. 2011.
6. Evaluating the State-of-the-Art in Automatic De-identification [O]. Özlem Uzuner, Yuan Luo, Peter Szolovits. 2007.
7. Plagiarism meets paraphrasing: insights for the new generation in automatic plagiarism detection [O]. Barrón-Cedeño Alberto, Vila Rigat Marta, Martí Antonin M. Antònia. 2014.
8. Assessment of the Application of Automatic Vehicle Identification Technology to Traffic Management. Appendix B: Evaluation of Potential Applications of Automatic Vehicle Identification to Traffic Management [R]. Ferlis, R. A.; Aaron, R. 1978.
