Conference: International Conference on Image Analysis and Processing

Enriching Character-Based Neural Machine Translation with Modern Documents for Achieving an Orthography Consistency in Historical Documents

Abstract

The nature of human language and the lack of a spelling convention make historical documents hard for natural language processing to handle. Spelling normalization tackles this problem by adapting their spelling to modern standards in order to achieve orthographic consistency. In this work, we compare several character-based machine translation approaches and propose a method that uses modern documents to enrich neural machine translation models. We tested our proposal on four different data sets and observed that the enriched models improved the normalization quality of the neural models. Statistical models, however, yielded better results.
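
To make the phrase "character-based machine translation" concrete, here is a minimal sketch of the kind of preprocessing such systems commonly rely on: each sentence is split into single characters so that a translation model can learn spelling changes character by character. This is an illustration under stated assumptions, not the authors' pipeline. The function names, the `@` space placeholder, and the example sentence pair are all hypothetical and do not come from the paper or its data sets.

```python
# Hypothetical placeholder for original spaces; assumes "@" never occurs in the text.
SPACE_SYMBOL = "@"


def to_chars(sentence: str, space_symbol: str = SPACE_SYMBOL) -> str:
    """Split a sentence into space-separated characters.

    Original spaces are replaced by `space_symbol` so that word
    boundaries can be restored after translation.
    """
    return " ".join(space_symbol if ch == " " else ch for ch in sentence)


def from_chars(tokens: str, space_symbol: str = SPACE_SYMBOL) -> str:
    """Invert `to_chars`: join characters and restore original spaces."""
    return "".join(" " if tok == space_symbol else tok for tok in tokens.split(" "))


# Invented example pair (not taken from the paper's data sets):
historical = "olde shoppe"   # source side: historical spelling
modern = "old shop"          # target side: modern spelling

source_line = to_chars(historical)
target_line = to_chars(modern)
```

Lines such as `source_line` and `target_line` would then form one parallel training pair for a character-level translation model, and `from_chars` would be applied to the model output to recover ordinary text.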
