IAPR International Conference on Document Analysis and Recognition

Page Retrieval System in Digitized Historical Books Based on Error-tolerant Subgraph Matching


Abstract

Developing smarter ways of interacting with scanners is an emerging need identified by many digitization professionals. To that end, the historical document image analysis community is particularly interested in reliable tools for computer-aided indexing and retrieval of historical document images. In this article we propose a method that retrieves, from a digitized historical book, the pages whose layout and/or content match a user-defined query. We focus on two kinds of query: transition pages (e.g. chapter title pages, end-of-chapter pages and end-of-act pages) and pages containing a particular content component or a group of patterns (e.g. ornaments, illustrations and drop caps). The method first uses low-level features (texture, shape and geometric descriptors) to represent each page as a graph-based signature. An error-tolerant subgraph isomorphism algorithm then estimates a set of costs that measure the similarity between the query, formulated as a pattern graph, and the different subgraphs of each page signature; these costs are used to find the book pages similar to the query. To illustrate the effectiveness of the proposed method, a thorough experimental study was conducted, with quantitative results obtained from a large number of queries with different contents and structures.
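The matching step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it is a brute-force stand-in that assumes each graph node carries a small feature vector, that the node substitution cost is the Euclidean distance between feature vectors, and that each query edge missing from the page incurs a fixed penalty. The names `match_cost`, `rank_pages` and `missing_edge_cost`, the dictionary layout of the graphs, and the toy data are all hypothetical.

```python
from itertools import permutations
from math import dist


def match_cost(query, page, missing_edge_cost=1.0):
    """Lowest cost of mapping every query node onto a distinct page node.

    Assumed cost model (illustrative only): Euclidean distance between
    node feature vectors, plus a fixed penalty for each query edge that
    has no counterpart in the page graph. Exhaustive search, so it is
    only practical for very small graphs.
    """
    q_nodes = list(query["nodes"])
    p_nodes = list(page["nodes"])
    if len(q_nodes) > len(p_nodes):
        return float("inf")  # the query cannot be embedded in this page

    p_edges = {frozenset(edge) for edge in page["edges"]}
    best = float("inf")
    for image in permutations(p_nodes, len(q_nodes)):
        mapping = dict(zip(q_nodes, image))
        # Node substitution costs.
        cost = sum(
            dist(query["nodes"][q], page["nodes"][mapping[q]]) for q in q_nodes
        )
        # Tolerated structural errors: query edges absent from the page.
        for u, v in query["edges"]:
            if frozenset((mapping[u], mapping[v])) not in p_edges:
                cost += missing_edge_cost
        best = min(best, cost)
    return best


def rank_pages(query, pages):
    """Page names ordered from lowest to highest matching cost."""
    return sorted(pages, key=lambda name: match_cost(query, pages[name]))


# Toy data: a two-node pattern graph and three page signatures.
query = {"nodes": {"a": (0.0, 0.0), "b": (1.0, 1.0)}, "edges": [("a", "b")]}
pages = {
    "p1": {
        "nodes": {"x": (0.0, 0.0), "y": (1.0, 1.0), "z": (5.0, 5.0)},
        "edges": [("x", "y"), ("y", "z")],
    },
    "p2": {"nodes": {"x": (0.0, 0.0), "y": (1.0, 1.0)}, "edges": []},
    "p3": {"nodes": {"x": (4.0, 4.0)}, "edges": []},
}

print(rank_pages(query, pages))
```

In this toy setting, `p1` contains an exact copy of the pattern, `p2` has matching nodes but lacks the edge and so pays one penalty, and `p3` is too small to host the pattern. A practical system would replace the exhaustive search with a pruned or approximate one, since the number of candidate mappings grows factorially with graph size.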