OCR-Free Table of Contents Detection in Urdu Books

Abstract

A Table of Contents (ToC) is an integral part of multi-page documents such as books and magazines. Most existing techniques rely on textual similarity to detect ToC pages automatically. However, such techniques cannot be applied where OCR technology is not available, which is the case for historical documents and for many modern Nabataean (Arabic) and Indic scripts. It is therefore necessary to develop tools for navigating such documents without OCR. This paper reports a preliminary effort to address this challenge. The proposed algorithm has been applied to find ToC pages in Urdu books, achieving an overall initial accuracy of 88%.

Bibliographic details

Source: IAPR International Workshop on Document Analysis Systems, 2012, 5 pages
Author: Ul-Hasan A.
Original format: PDF
Chinese Library Classification: TP391-53

Similar literature

1. Assessing books' depth and breadth via multi-level mining on tables of contents [J]. Zhang Chengzhi, Zhou Qingqing. Journal of Informetrics. 2020, No. 2
2. A method for automatic analysis Table of Contents in Chinese books [J]. Chen Jing, Lu Quan. Library Hi Tech. 2015, No. 3
3. Online tables of contents for books: effect on usage [J]. Morris RC. CIM Bulletin. 2001, No. 1
4. OCR-Free Table of Contents Detection in Urdu Books [C]. Ul-Hasan A. Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on. 2012
5. Online tables of contents for books: The user's perspective [D]. Morris, Ruth C. 2001
6. Online tables of contents for books: effect on usage [O]. Ruth C. Morris. 2001
7. OCR-Free Table of Contents Detection in Urdu Books [O]. 2015
8. Hydromechanics and Heat and Mass Exchange in Weightlessness (Russian Book): Table of Contents [R]. Avduyevskiy, V. S., Poleshayev, V. I. 1983
