Journal: Expert Systems with Applications

Detection of idea plagiarism using syntax-semantic concept extractions with genetic algorithm


Abstract

Plagiarism is increasingly becoming a major issue in the academic and educational domains. Automated and effective plagiarism detection systems are urgently needed to curtail this information breach, especially in tackling idea plagiarism. The proposed approach aims to detect plagiarism cases in which a third party's idea is adopted and presented so skillfully that the plagiarism cannot be unmasked at the surface level. The reported work explores syntax-semantic concept extraction with a genetic algorithm for detecting cases of idea plagiarism. It focuses mainly on idea plagiarism in which the source ideas are plagiarized and represented in a summarized form. Plagiarism detection is performed at both the document and passage levels by exploiting the document concepts at various structural levels. Initially, the idea embedded within the given source document is captured using sentence-level concept extraction with a genetic algorithm. Document-level detection is facilitated by word-level concepts, where syntactic information is extracted and the non-plagiarized documents are pruned. A combined similarity metric that utilizes semantic-level concept extraction is then employed for passage-level detection. The proposed approach is tested on the PAN13-14 plagiarism corpus for summary obfuscation data, which represents a challenging case of idea plagiarism. The performance of the current approach and its variations is evaluated at both the document and passage levels, using information retrieval and PAN plagiarism measures, respectively. The results are also compared against six top-ranked plagiarism detection systems submitted as part of the PAN13-14 competition. The results exhibit significant improvement over the compared systems and hence reflect the potency of the proposed syntax-semantic concept extractions in detecting idea plagiarism. © 2016 Elsevier Ltd. All rights reserved.
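
The abstract gives no implementation details, so the following is only an illustrative sketch of the first two stages it names: picking idea-bearing sentences with a genetic algorithm, then pruning unrelated documents by word-level overlap. The term-frequency coverage fitness, the one-point crossover, the Jaccard measure, the threshold, and every function name here are assumptions made for illustration, not the method reported in the paper.

```python
import random
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for real concept extraction."""
    return re.findall(r"[a-z]+", text.lower())


def fitness(mask, sentences, term_weights, max_sentences):
    """Total weight of distinct terms covered by the chosen sentences.

    Masks that choose no sentence or exceed the budget score 0.
    """
    chosen = sum(mask)
    if chosen == 0 or chosen > max_sentences:
        return 0.0
    covered = set()
    for bit, sentence in zip(mask, sentences):
        if bit:
            covered.update(tokenize(sentence))
    return float(sum(term_weights[t] for t in covered))


def select_sentences(sentences, max_sentences=2, pop_size=20,
                     generations=30, seed=0):
    """Toy genetic algorithm over binary masks (1 = keep the sentence)."""
    n = len(sentences)
    if n <= max_sentences:
        return list(sentences)
    rng = random.Random(seed)
    term_weights = Counter(t for s in sentences for t in tokenize(s))

    def score(mask):
        return fitness(mask, sentences, term_weights, max_sentences)

    # Seed with one-hot masks so at least one feasible individual exists.
    population = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
    while len(population) < pop_size:
        population.append([rng.randint(0, 1) for _ in range(n)])

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]      # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n - 1)            # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n)                   # single-bit mutation
            child[i] = 1 - child[i]
            children.append(child)
        population = parents + children

    best = max(population, key=score)
    return [s for bit, s in zip(best, sentences) if bit]


def jaccard(a, b):
    """Jaccard overlap of two token collections."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prune_candidates(source_concepts, suspicious_docs, threshold=0.1):
    """Keep names of documents whose word overlap with the source concepts
    reaches the (assumed) threshold; the rest are pruned."""
    src = tokenize(" ".join(source_concepts))
    return [name for name, text in suspicious_docs.items()
            if jaccard(src, tokenize(text)) >= threshold]
```

A caller would pass the sentences of a source document to `select_sentences`, then hand the result and a name-to-text mapping of suspicious documents to `prune_candidates`. The passage-level stage with a combined semantic similarity metric is not sketched, since the abstract does not say how that metric is built.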
