International Conference on Document Analysis and Recognition

A Binarization-Free Clustering Approach to Segment Curved Text Lines in Historical Manuscripts

Abstract

Text line segmentation is one of the main parts of document image analysis: it provides crucial information for automated reading, word spotting, alignment between image and transcription, or indexing of documents. Yet it remains an open problem for handwritten historical documents, because of complex layouts on the one hand, such as curved and touching text lines, and binarization problems on the other hand, caused by ornaments, wrinkles, stains, holes, etc. In this paper, we propose a binarization-free clustering method for text line segmentation that copes not only with touching text lines but also with complex baseline curvature. Avoiding the assumption of straight baselines, the method groups small interest point clusters into text lines based on their local orientation. Experiments conducted on artificially distorted images of the Saint Gall database show promising results.
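
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of grouping small point clusters by local orientation, not the authors' method. It assumes the interest points have already been extracted from the grayscale image and pre-grouped into small clusters. The orientation estimate (principal axis of each cluster), the union-find linking rule, the function names, and the thresholds `max_gap` and `max_angle` are all illustrative assumptions.

```python
import math


def local_orientation(points):
    """Return (centroid, principal-axis angle in radians) of a small cluster."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Angle of the dominant eigenvector of the 2x2 scatter matrix.
    return (mx, my), 0.5 * math.atan2(2.0 * sxy, sxx - syy)


def angle_diff(a, b):
    """Smallest difference between two undirected orientations (mod pi)."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def group_clusters(clusters, max_gap=45.0, max_angle=math.radians(30)):
    """Assumed linking rule: join two nearby clusters only if the vector
    between their centroids agrees with the local orientation of both."""
    stats = [local_orientation(c) for c in clusters]
    parent = list(range(len(stats)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(stats)):
        (xi, yi), ai = stats[i]
        for j in range(i + 1, len(stats)):
            (xj, yj), aj = stats[j]
            dx, dy = xj - xi, yj - yi
            if math.hypot(dx, dy) > max_gap:
                continue
            link = math.atan2(dy, dx)
            if angle_diff(link, ai) <= max_angle and angle_diff(link, aj) <= max_angle:
                parent[find(i)] = find(j)
    # One label per input cluster; equal labels mean the same text line.
    return [find(i) for i in range(len(stats))]


def make_curved_line(offset, n=11, step=30):
    """Synthetic test data: small clusters sampled along a sine-shaped baseline."""
    return [
        [(xc + dx, offset + 20.0 * math.sin((xc + dx) / 100.0))
         for dx in (-8, -4, 0, 4, 8)]
        for xc in range(0, n * step, step)
    ]


if __name__ == "__main__":
    # Two curved lines only 40 px apart: closer than max_gap, so distance
    # alone would merge them; the orientation check keeps them separate.
    clusters = make_curved_line(0.0) + make_curved_line(40.0)
    labels = group_clusters(clusters)
    print(len(set(labels)), "text lines found")
```

In this toy setup the vertical gap between the two lines is smaller than `max_gap`, so a purely distance-based grouping would join them. The link is rejected because the connecting vector is nearly perpendicular to the local orientation of both clusters, which illustrates how a local-orientation criterion can follow a curved baseline without assuming it is straight.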