Conference: International Conference on Future Data and Security Engineering

An Approach for Plagiarism Detection in Learning Resources

Abstract

Plagiarism detection is a concern for both individuals and organizations. It can be used to detect copied documents such as publications, books, and theses. Many approaches have been proposed, and they work well for English. Other languages pose additional challenges, however: language-specific processing (e.g., handling diacritics such as the acute and circumflex accents), as well as the semantics and order of words, remain difficult. This work proposes an approach for plagiarism detection aimed at Vietnamese documents in learning and research resources. The input documents are pre-processed, and their features are extracted, vectorized, and represented as TF-IDF weights. The cosine similarity and word-order similarity of the documents are then computed, and finally the two similarities are combined into an ensemble score. Experimental results on a Vietnamese journal dataset show that the proposed approach is feasible.
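
The pipeline described in the abstract (TF-IDF vectors, cosine similarity, word-order similarity, and a combination of the two) can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything specific here is an assumption: the whitespace tokenizer, the smoothed IDF, the position-vector form of word-order similarity, the weighted-sum ensemble, and the weight `alpha`. All function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization. Real Vietnamese text
    # would need a proper word segmenter, which is not shown here.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption) so terms found in every document
    # still get a nonzero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return vectors


def cosine_similarity(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def word_order_similarity(a, b):
    # Assumed form: compare the 1-based first positions of each word of the
    # joint vocabulary in both token lists (0 when the word is absent).
    vocab = list(dict.fromkeys(a + b))
    r1 = [a.index(w) + 1 if w in a else 0 for w in vocab]
    r2 = [b.index(w) + 1 if w in b else 0 for w in vocab]
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(r1, r2)))
    total = math.sqrt(sum((x + y) ** 2 for x, y in zip(r1, r2)))
    return 1.0 - diff / total if total else 0.0


def combined_similarity(a, b, corpus=(), alpha=0.8):
    # Assumption: the ensemble is a weighted sum; alpha is illustrative only.
    docs = [tokenize(t) for t in [a, b, *corpus]]
    vecs = tfidf_vectors(docs)
    cos = cosine_similarity(vecs[0], vecs[1])
    order = word_order_similarity(docs[0], docs[1])
    return alpha * cos + (1 - alpha) * order
```

In this sketch, two texts containing the same words in a different order keep a cosine similarity close to 1 but receive a lower word-order score, which is the kind of difference a combined score is meant to capture.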