International Conference on Advances in Digital Libraries

Copy detection systems for digital documents

Abstract

Partial or total duplication of document content is common to large digital libraries. In this paper, we present a copy detection system to automate the detection of duplication in digital documents. The system we present is sentence-based and makes three contributions: it proposes an intuitive definition of similarity between documents; it produces the distribution of overlap that exists between overlapping documents; it is resistant to inaccuracy due to large variations in document size. We report the results of several experiments that illustrate the behavior and functionality of the system.
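
As a rough illustration of what a sentence-based comparison might look like, the sketch below splits two documents into normalized sentences and reports the shared sentences, the fraction of each document they cover, and where they fall in the first document. This is not the system from the paper: the abstract gives no algorithmic details, so the splitting rule, the normalization, the two per-document ratios, and all function names (`normalize`, `split_sentences`, `sentence_overlap`, `overlap_positions`) are assumptions made for illustration only.

```python
import re


def normalize(sentence):
    """Lowercase and keep only alphanumeric tokens (assumed normalization)."""
    return " ".join(re.findall(r"[a-z0-9]+", sentence.lower()))


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    normalized = (normalize(p) for p in parts)
    return [s for s in normalized if s]


def sentence_overlap(doc_a, doc_b):
    """Return (shared sentences, fraction of A covered, fraction of B covered).

    Reporting one ratio per document is an illustrative choice: a short
    document fully contained in a long one still scores 1.0 on its own side.
    """
    a = set(split_sentences(doc_a))
    b = set(split_sentences(doc_b))
    shared = a & b
    frac_a = len(shared) / len(a) if a else 0.0
    frac_b = len(shared) / len(b) if b else 0.0
    return shared, frac_a, frac_b


def overlap_positions(doc_a, doc_b):
    """Indices of sentences in doc_a that also occur in doc_b."""
    b = set(split_sentences(doc_b))
    return [i for i, s in enumerate(split_sentences(doc_a)) if s in b]
```

For example, comparing "The cat sat. Dogs bark loudly! Birds fly." against "Dogs bark loudly. Birds fly? Fish swim. Cows moo." with these helpers finds two shared sentences, located at positions 1 and 2 of the first document. Exact matching after normalization is the simplest possible choice here; a real system would likely need something more tolerant of paraphrase.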