A Mixed Fuzzy Similarity Approach to Detect Plagiarism in Persian Texts

Abstract

A variety of methods and metrics have been proposed to measure the degree of similarity among documents and for use in plagiarism detection systems. However, most of them do not account for the ambiguity inherent in natural language. This paper therefore proposes a new method that takes lexical and semantic features and similarity measures into consideration. In the first step, after preprocessing and stop-word removal, the words of a text are divided into two groups: general words and domain-specific knowledge words. A mixed lexical and semantic fuzzy inference system is then designed to assess text similarity. The proposed method was evaluated on Persian paper abstracts from the International Conference on e-Learning and e-Teaching (ICELET) corpus, using an IT domain knowledge ontology. The results indicate that the proposed method achieves a precision of 79% and detects 83% of the plagiarism cases.
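
The pipeline described in the abstract (preprocessing, stop-word removal, splitting words into general and domain-specific groups, then a fuzzy combination of similarity scores) can be illustrated with a minimal Python sketch. The abstract does not give the membership functions, the rule base, or the semantic measure, so everything below is an assumption made for illustration: the stop-word list and domain vocabulary are hypothetical English stand-ins for the Persian resources and the IT ontology, plain Jaccard overlap replaces the ontology-based semantic similarity, and the four rules in `fuzzy_similarity` are invented rather than taken from the paper.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical English stand-ins; the paper works on Persian text with an
# IT domain ontology, neither of which is reproduced here.
STOP_WORDS: Set[str] = {"the", "a", "an", "of", "in", "and", "is", "for", "to"}
DOMAIN_TERMS: Set[str] = {"ontology", "database", "network", "server", "e-learning"}


def preprocess(text: str) -> List[str]:
    """Lowercase, strip surrounding punctuation, and drop stop words."""
    tokens = (t.strip(".,;:!?()\"'") for t in text.lower().split())
    return [t for t in tokens if t and t not in STOP_WORDS]


def split_words(tokens: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Split tokens into (general words, domain-specific words)."""
    unique = set(tokens)
    domain = unique & DOMAIN_TERMS
    return unique - domain, domain


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Set-overlap similarity in [0, 1]; defined as 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fuzzy_similarity(general_sim: float, domain_sim: float) -> float:
    """Combine two scores in [0, 1] with four assumed fuzzy rules.

    Memberships: high(x) = x, low(x) = 1 - x. AND is min; the output is the
    firing-strength-weighted average of the rule consequents.
    """
    g_high, g_low = general_sim, 1.0 - general_sim
    d_high, d_low = domain_sim, 1.0 - domain_sim
    rules = [
        (min(g_high, d_high), 1.0),  # both high    -> similar
        (min(g_low, d_low), 0.0),    # both low     -> dissimilar
        (min(g_high, d_low), 0.5),   # mixed signal -> partially similar
        (min(g_low, d_high), 0.5),   # mixed signal -> partially similar
    ]
    # For inputs in [0, 1] at least one rule fires with strength >= 0.5,
    # so the denominator is never zero.
    total = sum(weight for weight, _ in rules)
    return sum(weight * value for weight, value in rules) / total


def text_similarity(doc_a: str, doc_b: str) -> float:
    """Score two texts by comparing their general and domain word groups."""
    gen_a, dom_a = split_words(preprocess(doc_a))
    gen_b, dom_b = split_words(preprocess(doc_b))
    return fuzzy_similarity(jaccard(gen_a, gen_b), jaccard(dom_a, dom_b))
```

In such a sketch, a pair of texts would be flagged as a plagiarism candidate when `text_similarity` exceeds a threshold tuned on labeled data; the abstract does not state how the reported precision and detection rate were thresholded.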

Bibliographic record

Source
International Work-Conference on Artificial Neural Networks | 2015 | pp. 525-534 | 10 pages
Authors
Hamid Ahangarbahan; Gholam Ali Montazer
Original format: PDF
Keywords
Plagiarism; Similarity metric; Fuzzy sets; Semantic similarity; Lexical similarity

Similar literature

1. Detecting Plagiarism in Arabic E-Learning Using Text Mining [J]. Farahat F. Farahat, Aziza S. Asem, Mahmoud A. Zaher. British Journal of Mathematics & Computer Science. 2015, No. 4
2. Text mining applied to plagiarism detection: The use of words for detecting deviations in the writing style [J]. Gabriel Oberreuter, Juan D. Velasquez. Expert Systems with Applications. 2013, No. 9
3. The Use of Hartigan Index for Initializing K-Means++ in Detecting Similar Texts of Clustered Documents as a Plagiarism Indicator [J]. Diana Purwitasari, I. Wayan Surya Priantara, Putu Yuwono Kusmawan. Asian Journal of Information Technology. 2011, No. 8
4. A Mixed Fuzzy Similarity Approach to Detect Plagiarism in Persian Texts [C]. Hamid Ahangarbahan, Gholam Ali Montazer. International Work-Conference on Artificial Neural Networks. 2015
5. An Automatic Similarity Detection Engine Between Sacred Texts Using Text Mining and Similarity Measures [D]. Qahl, Salha Hassan Muhammed. 2014
6. A text mining approach to detect mentions of protein glycosylation in biomedical text [O]. Daksha Shukla, Valadi K Jayaraman. 2012
7. A Fuzzy Similarity Approach in Text Classification Task [O]. Dwi H. Widyantoro, John Yen. 2008
