Conference: International Conference on Digital Information Management

Algorithm of the longest commonly consecutive word for Plagiarism detection in text based document

Abstract

Plagiarism is a form of academic misconduct that has increased with easy access to information through electronic documents and the Internet. The problem of detecting plagiarism in a full-text document can be viewed as the problem of finding the longest common parts of strings. Moreover, the detection system has to be capable of determining and visualizing not only the common parts but also their locations in both the source and the observed document. Unlike previous research, this paper proposes a numerical comparison algorithm that is comparable in computation time without losing the word order of the common parts. In the experiments, the proposed algorithm outperforms the suffix tree when the observed paragraph is shorter than one hundred words.
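
To make the problem statement concrete, the sketch below finds the longest run of consecutive words shared by a source text and an observed text, and reports where that run starts in each. It is not the numerical algorithm proposed in the paper, whose details the abstract does not give; it is a standard word-level dynamic-programming approach, and the function name `longest_common_word_run`, the lowercasing, and the whitespace tokenization are all assumptions made for illustration.

```python
from typing import Tuple


def longest_common_word_run(source: str, observed: str) -> Tuple[int, int, int]:
    """Return (length, source_start, observed_start) for the longest run of
    consecutive words common to both texts. Positions are 0-based word indices.

    Assumption: words are compared case-insensitively after splitting on
    whitespace; a real detector would likely normalize punctuation as well.
    """
    a = source.lower().split()
    b = observed.lower().split()

    best_len, best_a, best_b = 0, 0, 0
    # prev[j] holds the length of the common run ending at a[i-2], b[j-1].
    prev = [0] * (len(b) + 1)

    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous word in both texts.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_a = i - best_len
                    best_b = j - best_len
        prev = curr

    return best_len, best_a, best_b


if __name__ == "__main__":
    src = "the quick brown fox jumps over the lazy dog"
    obs = "a quick brown fox sat on the lazy dog"
    length, src_start, obs_start = longest_common_word_run(src, obs)
    print(length, src_start, obs_start)
    print(" ".join(src.split()[src_start:src_start + length]))
```

Because the comparison works on whole words and only extends a run when the preceding words also match, the word order of the common part is kept, and the two start indices give the location needed to highlight it in both documents. The running time of this sketch is proportional to the product of the two word counts, so it says nothing about the timings reported in the paper.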