From Textline to Paragraph: A Promising Practice for Chinese Text Recognition

Abstract

Although handwritten Chinese text recognition (HCTR) has made tremendous progress over the past decades, the traditional document analysis system suffers from two main problems: (1) line-level annotation of positions and transcripts is costly to obtain; (2) the framework consists of several separately trained modules, which makes it difficult for the overall system to achieve satisfactory results. Handwritten paragraph recognition therefore attempts to integrate textline segmentation and recognition into a single network. However, the large character set and the severe shortage of training samples make handwritten Chinese paragraph recognition (HCPR) difficult. In this paper, a novel framework is proposed for HCPR. To make training faster and more stable, we put forward the Multi-Dimensional LSTM Convolutional Attention (MLCA) recognition framework. A new writing-style-aware image synthesis method is also used to overcome data insufficiency. We conduct several experiments on the ICDAR-2013 competition dataset and the corresponding corrupted dataset. The results suggest that moving from HCTR to HCPR is a promising trend for Chinese document analysis systems.

Bibliographic details

Source
Future Technologies Conference | 2021 | xiii, 971 pages | 16 pages in total
Authors
Yichao Wu; Xiaolin Hu
Original format: PDF
Chinese Library Classification: 73.9083
Keywords
Text recognition; Multi-Dimensional LSTM Convolutional Attention; Paragraph recognition; End-to-end

Similar literature

1. Exploring the Factors Influencing the Effectiveness of Employee Recognition: An Analysis of Recognition Practices in Chinese Businesses [J]. Zeng Xi. Contemporary Social Sciences (English edition). 2021, Issue 001
2. Deep neural network-based recognition of entities in Chinese online medical inquiry texts [J]. Xin Liu, Yanju Zhou, Zongrun Wang. Future generation computer systems. 2021, Issue Jana
3. Robust Recognition of Chinese Text from Cellphone-acquired Low-quality Identity Card Images Using Convolutional Recurrent Neural Network [J]. Jianmei Wang, Ruize Wu, Shaoming Zhang. Sensors and materials. 2021, Issue 4
4. From Textline to Paragraph: A Promising Practice for Chinese Text Recognition [C]. Yichao Wu, Xiaolin Hu. Future Technologies Conference. 2021
5. Land pawning practices in republican China: Theory and reality (Chinese text). [D]. Fang, Huirong. 2002
6. Entity recognition in Chinese clinical text using attention-based CNN-LSTM-CRF [O]. Buzhou Tang, Xiaolong Wang, Jun Yan. 2019
7. Research on Chinese Short Text Emotional Polarity Recognition Based on Specific Immune Recognition [O]. J. Ma, H. Qiao. 2015
