Conference: International Conference on Document Analysis and Recognition

A Character Attention Generative Adversarial Network for Degraded Historical Document Restoration


Abstract

Despite recent breakthroughs in the accuracy of single-character recognition using deeper convolutional neural networks, one remaining problem is that OCR systems largely fail to recognize character patterns that are severely degraded, especially those in historical documents. Another problem in recognizing characters in historical documents is the lack of sufficient training patterns, owing to the heavy cost of annotation. This paper proposes a character attention generative adversarial network, named CAGAN, for restoring heavily degraded character patterns in historical documents, so that OCR systems can improve their accuracy and even help archeologists decode them. The network is based on a U-Net-like architecture [1] with skip connections, and it is trained with the proposed loss function, which combines the common adversarial loss (global loss) and the hierarchical character attentive loss (local loss). We conducted an experiment on 118 categories of the most common Japanese Kanji characters, collected from severely damaged historical documents called Heijokyo mokkan, written during the Nara period in Japan. The experiment shows that our method restores the shapes of characters and improves the recognition rate significantly, which helps archeologists decode damaged character patterns.
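The abstract names a global adversarial loss and a local hierarchical character attentive loss but does not give their formulas. The sketch below only illustrates how two such terms might be combined for a generator. Everything specific is an assumption rather than the paper's definition: the non-saturating form of the adversarial term, the attention-weighted absolute error used as the local term, the summation over scales, the weight `lambda_local`, and all function names.

```python
import math


def adversarial_loss(d_fake):
    """Global term (assumed form): mean of -log D(G(x)) over a batch.

    d_fake holds discriminator probabilities for restored images.
    """
    eps = 1e-12  # guard against log(0)
    return -sum(math.log(max(p, eps)) for p in d_fake) / len(d_fake)


def attention_loss(restored, target, attention):
    """Local term at one scale (assumed form): attention-weighted mean absolute error.

    All three arguments are flattened pixel lists of equal length;
    attention holds non-negative weights that emphasise character strokes.
    """
    total_weight = sum(attention)
    if total_weight == 0:
        return 0.0
    weighted = sum(a * abs(r - t) for r, t, a in zip(restored, target, attention))
    return weighted / total_weight


def hierarchical_attention_loss(levels):
    """Sum the local term over several scales (hypothetical hierarchy).

    levels is a list of (restored, target, attention) triples, one per scale.
    """
    return sum(attention_loss(r, t, a) for r, t, a in levels)


def generator_loss(d_fake, levels, lambda_local=10.0):
    """Total generator objective: global loss plus weighted local loss.

    lambda_local is a hypothetical weight, not a value from the paper.
    """
    return adversarial_loss(d_fake) + lambda_local * hierarchical_attention_loss(levels)
```

In this sketch, setting an attention weight to zero removes that pixel from the local term, so background noise outside the character strokes would not be penalised. Whether CAGAN behaves this way is not stated in the abstract.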
