Comparison of text-based methods for detecting duplication in document image databases


Abstract

This paper presents an experimental evaluation of several text-based methods for detecting duplication in document image databases using uncorrected OCR output. This task is challenging because of both the wide range of degradations printed documents can suffer and conflicting interpretations of what it means to be a 'duplicate.' We report results for five sets of experiments exploring various aspects of the problem space. While the techniques studied are generally robust in the face of most types of OCR errors, there are nonetheless important differences, which we identify and discuss in detail.

Bibliographic details

Source: Document Recognition and Retrieval VII, 2000, pp. 210-221 (12 pages)
Conference location: San Jose, CA (US)
Author: Daniel P. Lopresti, Lucent Technologies/Bell Labs, Murray Hill, NJ, USA
Original format: PDF
Language: English
Chinese Library Classification: TN9

