Venue: IAPR International Workshop on Document Analysis Systems

Computerized Counting of Individuals in Ottoman Population Registers with Deep Learning


Abstract

The digitization of historical documents continues to gain pace, enabling further processing and the extraction of meaning from these documents. Page segmentation and layout analysis are crucial for historical document analysis systems, and errors in these steps create difficulties in downstream information retrieval. Degradation of documents, digitization errors, and varying layout styles complicate the segmentation of historical documents. The properties of Arabic script, such as connected letters, ligatures, diacritics, and different writing styles, make Arabic historical documents even more challenging to process. In this study, we developed an automatic system that uses a CNN-based architecture to count registered individuals and assign them to populated places. To evaluate the system, we created a labeled dataset of registers drawn from the first wave of Ottoman population registers, conducted between the 1840s and the 1860s. We achieved promising results in classifying the different object types, counting individuals, and assigning them to populated places.
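
The abstract does not spell out how detected objects are turned into per-place counts, so the following is only a minimal sketch of one plausible post-processing step. It assumes the CNN yields a list of labeled objects already sorted in reading order, and that each individual belongs to the most recently seen populated-place marker. The class labels `"place"` and `"individual"`, the `DetectedObject` type, and the `"unassigned"` bucket are hypothetical names introduced for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    label: str      # hypothetical class names: "place" or "individual"
    name: str = ""  # place name for "place" objects; unused for individuals


def count_individuals_by_place(objects):
    """Count individuals under the most recently seen place marker.

    Assumes `objects` is already sorted in reading order (an assumption,
    not something the abstract states).
    """
    counts = {}
    current_place = None
    for obj in objects:
        if obj.label == "place":
            current_place = obj.name
            counts.setdefault(current_place, 0)
        elif obj.label == "individual":
            # Individuals seen before any place marker go to a fallback bucket.
            key = current_place if current_place is not None else "unassigned"
            counts[key] = counts.get(key, 0) + 1
    return counts
```

Under these assumptions, a page whose detections are a place marker "A", two individuals, a place marker "B", and one individual would be tallied as two individuals for "A" and one for "B".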
