Journal of Visual Communication & Image Representation

A novel method of text line segmentation for historical document image of the uchen Tibetan

Abstract

Text line segmentation is a key step in recognizing Tibetan historical documents. This paper proposes a baseline-based text line segmentation method for uchen Tibetan and releases a new dataset for evaluating text line segmentation on uchen Tibetan historical documents. The method has two steps: baseline detection, and text line segmentation using the detected baselines. In baseline detection, the upper edges of all characters in the document are extracted with a horizontal gradient operator; an edge connectivity definition is then introduced to divide the set of upper edges into disjoint subsets. Eligible subsets are selected, and their edges are joined in turn to form the baseline. In text line segmentation, the document image is cut at the baseline positions, and regions where adjacent lines adhere to each other are then segmented again. Each connected region in the image is assigned to its nearest baseline, and all connected regions assigned to the same baseline form a text line. Experiments on the proposed dataset show that the method copes with document distortion, achieves high text line segmentation accuracy, and can handle adhesion between text lines. (C) 2019 Published by Elsevier Inc.
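The two ideas in the abstract that are easiest to illustrate are marking upper edges and assigning each connected region to its nearest baseline. The sketch below is a minimal, hypothetical illustration and not the authors' implementation. It assumes a binary image stored as a list of rows (1 = ink), uses a simple "ink with background directly above" test as a stand-in for the gradient operator, takes the baselines as given (one row index per column, so a baseline may bend with the page), and measures distance from each region's centroid. The paper's edge connectivity definition, subset selection, and adhesion handling are not reproduced. All function names are invented for this example.

```python
from collections import deque


def upper_edges(img):
    """Return (row, col) of ink pixels whose upper neighbour is background."""
    edges = []
    for r, row in enumerate(img):
        for c, v in enumerate(row):
            if v and (r == 0 or not img[r - 1][c]):
                edges.append((r, c))
    return edges


def connected_components(img):
    """Return 4-connected ink regions, each as a list of (row, col) pixels."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    comps = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, comp = deque([(r, c)]), []
                while queue:
                    y, x = queue.popleft()
                    comp.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                comps.append(comp)
    return comps


def assign_to_baselines(comps, baselines):
    """Group regions by nearest baseline.

    baselines[i][x] is the row of baseline i at column x (assumed given).
    Distance is measured from the region centroid, an assumption of this sketch.
    """
    lines = [[] for _ in baselines]
    for comp in comps:
        cy = sum(y for y, _ in comp) / len(comp)
        cx = round(sum(x for _, x in comp) / len(comp))
        nearest = min(range(len(baselines)), key=lambda i: abs(baselines[i][cx] - cy))
        lines[nearest].append(comp)
    return lines


# Toy page: two text lines, each with two separate ink regions.
img = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 1, 0],
    [0, 1, 0, 0, 1, 0],
]
baselines = [[1] * 6, [4] * 6]  # hypothetical, already-detected baselines
edges = upper_edges(img)
text_lines = assign_to_baselines(connected_components(img), baselines)
```

On this toy page the upper edges fall on rows 1 and 4, which is where the uchen head line would sit, and each of the two baselines collects two regions. Representing a baseline per column rather than as a single row is what lets the nearest-baseline rule follow a warped page.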
