Conference: International Work-Conference on Artificial Neural Networks

A Mixed Fuzzy Similarity Approach to Detect Plagiarism in Persian Texts

Abstract

A variety of methods and metrics have been proposed to measure the similarity between documents in plagiarism detection systems. However, most of them do not take the ambiguity inherent in natural language into account. This paper therefore proposes a new method that considers both lexical and semantic features and similarity measures. In the first step, after preprocessing and stop-word removal, the words of a text are divided into two groups: general words and domain-specific knowledge words. A mixed lexical and semantic fuzzy inference system is then designed to assess text similarity. The proposed method was evaluated on the Persian paper abstracts of the International Conference on e-Learning and e-Teaching (ICELET) corpus, using an IT domain knowledge ontology. The results indicate that the proposed method achieves a precision of 79% and detects 83% of the plagiarism cases.
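
The pipeline outlined in the abstract (stop-word removal, a split into general and domain-specific words, then a fuzzy combination of lexical and semantic similarity) might look roughly like the Python sketch below. This is an illustration under stated assumptions, not the authors' system: the whitespace tokenizer, the Jaccard overlap used for both scores, the linear membership functions, the four rules, and every function name are hypothetical. A plain set of domain terms stands in for the IT domain ontology, and real Persian text would need proper normalization and tokenization.

```python
def preprocess(text, stop_words):
    """Lowercase, split on whitespace, and drop stop words (assumed tokenizer)."""
    return [t for t in text.lower().split() if t not in stop_words]


def split_by_domain(tokens, domain_terms):
    """Partition tokens into (general, domain-specific) word sets.

    `domain_terms` is a hypothetical stand-in for an ontology lookup.
    """
    domain = {t for t in tokens if t in domain_terms}
    general = set(tokens) - domain
    return general, domain


def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def low(x):
    """Membership in the fuzzy set 'low' for a score in [0, 1] (assumed linear)."""
    return 1.0 - x


def high(x):
    """Membership in the fuzzy set 'high' for a score in [0, 1] (assumed linear)."""
    return x


def fuzzy_similarity(lexical, semantic):
    """Zero-order Sugeno inference: min for AND, weighted average of rule outputs."""
    rules = [
        (min(high(lexical), high(semantic)), 1.0),  # both high -> similar
        (min(high(lexical), low(semantic)), 0.5),   # mixed evidence
        (min(low(lexical), high(semantic)), 0.5),   # mixed evidence
        (min(low(lexical), low(semantic)), 0.0),    # both low -> dissimilar
    ]
    total = sum(weight for weight, _ in rules)
    return sum(weight * out for weight, out in rules) / total if total else 0.0


def text_similarity(a, b, stop_words, domain_terms):
    """Score two texts in [0, 1] from general-word and domain-word overlap."""
    gen_a, dom_a = split_by_domain(preprocess(a, stop_words), domain_terms)
    gen_b, dom_b = split_by_domain(preprocess(b, stop_words), domain_terms)
    lexical = jaccard(gen_a, gen_b)    # overlap of general words
    semantic = jaccard(dom_a, dom_b)   # overlap of domain words (ontology stand-in)
    return fuzzy_similarity(lexical, semantic)
```

A pair of texts would then be flagged when `text_similarity` exceeds some threshold; the abstract does not state one, so any value chosen here would be a further assumption.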