Content-based indexing and retrieval method of Chinese document images


Abstract

In Chinese information retrieval, indexing a text document is straightforward: the text only needs to be segmented into phrases. When the document is a Chinese document image (a non-ASCII file), the usual approach is to convert the image into a text file with Chinese optical character recognition (OCR) and then index the result with an information retrieval algorithm. However, OCR is time-consuming, which reduces retrieval efficiency. This paper proposes an indexing method based on stroke density codes. The document image is first segmented into individual Chinese character images; the stroke density of each character image is then calculated; finally, the stroke density code of the character image is obtained. The indexing method is fast and robust to noise. The paper also presents a retrieval method for Chinese document images built on this index, and discusses the use of the index and retrieval method for duplicate detection. The validity of the indexing method is demonstrated through its application to keyword spotting and duplicate detection.
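The pipeline described in the abstract (segment, measure stroke density, derive a code, index, retrieve) can be illustrated with a minimal Python sketch. The abstract does not define the stroke density code, so everything below is an assumption made for illustration: stroke density is taken to be the mean number of stroke crossings per scan line within a band of rows or columns, the code is a tuple of coarsely quantized densities, and the names `count_runs`, `stroke_density_code`, `build_index`, and `spot_keyword` are hypothetical. Character segmentation is assumed to have been done already, with each character given as a list of rows of 0/1 pixels.

```python
from collections import defaultdict


def count_runs(line):
    """Count runs of consecutive stroke pixels (1s) along one scan line."""
    runs, prev = 0, 0
    for px in line:
        if px and not prev:
            runs += 1
        prev = px
    return runs


def stroke_density_code(char_img, bands=2, levels=4):
    """Return an assumed stroke density code for a binary character image.

    Rows and columns are each split into `bands` bands. For every band the
    mean number of stroke crossings per scan line is truncated to an integer
    and clipped to `levels - 1`. This definition is illustrative only and is
    not taken from the paper.
    """
    rows = [list(r) for r in char_img]
    cols = [list(c) for c in zip(*char_img)]
    code = []
    for lines in (rows, cols):
        n = len(lines)
        for b in range(bands):
            band = lines[b * n // bands:(b + 1) * n // bands]
            density = sum(count_runs(l) for l in band) / max(len(band), 1)
            code.append(min(int(density), levels - 1))
    return tuple(code)


def build_index(char_images):
    """Map each code to the positions of the characters that produced it."""
    index = defaultdict(list)
    for pos, img in enumerate(char_images):
        index[stroke_density_code(img)].append(pos)
    return index


def spot_keyword(doc_codes, query_codes):
    """Return start positions where the query code sequence occurs exactly."""
    m = len(query_codes)
    if m == 0:
        return []
    return [i for i in range(len(doc_codes) - m + 1)
            if list(doc_codes[i:i + m]) == list(query_codes)]
```

Under these assumptions, keyword spotting reduces to matching the code sequence of a query word image against the code sequence of a document, and duplicate detection could compare the code sequences of two documents. Coarse quantization is one plausible reason such a code would tolerate noise, but distinct characters can share a code, so matches are candidates rather than confirmed hits.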


Bibliographic details

Source: International Conference on Document Analysis and Recognition, 1999, 4 pages
Authors: Yaodong He; Zao Jiang; Institute of Electric and Electronic Engineer
Original format: PDF
Chinese Library Classification: Automation technology, computer technology

Similar documents


1. An efficient indexing method for content-based image retrieval [J]. Deying Feng, Jie Yang, Congxin Liu. Neurocomputing, 2013, no. apra15
2. An efficient high-dimensional indexing method for content-based retrieval in large image databases [J]. I. Daoudi, K. Idrissi, S. E. Ouatik. Signal Processing: Image Communication (a publication of the European Association for Signal Processing), 2009, no. 10
3. Matching Word Images For Content-based Retrieval From Printed Document Images [J]. Million Meshesha, C. V. Jawahar. International Journal on Document Analysis and Recognition, 2008, no. 1
4. Content-based indexing and retrieval method of Chinese document images [C]. Yaodong He, Zao Jiang. 1999
5. Content-based handwritten document indexing and retrieval [D]. Huang, Chen. 2008
6. Determining similarity in histological images using graph-theoretic description and matching methods for content-based image retrieval in medical diagnostics [O]. Harshita Sharma, Alexander Alekseychuk, Peter Leskovsky. 2012
7. An intelligent agent for content-based indexing and retrieval of documents [O]. Mimouni, NK, Marir, F, Meziane, F 100
8. System for Indexing Multi-Spectral Satellite Images for Efficient Content-Based Retrieval [R]. Barros, J., French, J., Martin, W. 2003
