Enriching Character-Based Neural Machine Translation with Modern Documents for Achieving an Orthography Consistency in Historical Documents

Abstract

The nature of human language and the lack of a spelling convention make historical documents hard to handle for natural language processing. Spelling normalization tackles this problem by adapting their spelling to modern standards to achieve orthographic consistency. In this work, we compare several character-based machine translation approaches and propose a method that uses modern documents to enrich neural machine translation models. We tested our proposal on four data sets and observed that the enriched models improved the normalization quality of the neural models. Statistical models, however, yielded better results.
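
Character-based machine translation treats each character, rather than each word, as a token. The sketch below shows the kind of preprocessing this usually implies: splitting sentences into space-separated characters and marking word boundaries so the original words can be restored afterwards. The boundary marker `@`, the helper names `to_chars` and `from_chars`, and the sample sentence are illustrative assumptions and are not taken from the paper, whose enrichment method the abstract does not describe.

```python
# Hypothetical word-boundary marker; assumed not to occur in the corpus itself.
BOUNDARY = "@"


def to_chars(sentence: str, boundary: str = BOUNDARY) -> str:
    """Split a sentence into space-separated characters, marking word boundaries."""
    words = sentence.split()
    return f" {boundary} ".join(" ".join(word) for word in words)


def from_chars(tokens: str, boundary: str = BOUNDARY) -> str:
    """Invert to_chars: rejoin space-separated characters into words."""
    words = tokens.split(f" {boundary} ")
    return " ".join(word.replace(" ", "") for word in words)


if __name__ == "__main__":
    # Illustrative historical-style spelling, not drawn from the paper's data sets.
    segmented = to_chars("vn dia")
    print(segmented)
    print(from_chars(segmented))
```

A translation model trained on such segmented pairs (historical spelling as source, modern spelling as target) would emit character sequences, which `from_chars` turns back into ordinary text.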

Bibliographic record

Source: International Conference on Image Analysis and Processing | 2019 | xiv 406 p. | 11 pages
Authors: Miguel Domingo; Francisco Casacuberta
Original format: PDF
Chinese Library Classification: Information processing
Date indexed: 2022-08-20 20:19:10

Similar documents

1. Document-Level Neural Machine Translation with Associated Memory Network [J]. Shu JIANG, Rui WANG, Zuchao LI. IEICE Transactions on Information and Systems. 2021, No. 10

2. Recognizing the orthography changes for identifying the temporal origin on the example of the Balkan historical documents [J]. Brodic Darko, Amelio Alessia. Neural Computing & Applications. 2019, No. 8

3. Computational Analysis of Medieval Manuscripts: A New Tool for Analysis and Mapping of Medieval Documents to Modern Orthography [J]. Mushtaq Ahmad, Stefan Gruner, Muhammad Tanvir Afzal. Journal of Universal Computer Science. 2012, No. 20

4. Enriching Character-Based Neural Machine Translation with Modern Documents for Achieving an Orthography Consistency in Historical Documents [C]. Miguel Domingo, Francisco Casacuberta. International Conference on Image Analysis and Processing; International Workshop on Recent Advances in Digital Security: Bio-metrics and Forensics; Workshop on Deep Understanding Shopper Behaviours and Interactions in Intelligent Retail Environments; International Workshop on Pattern Recognition for Cultural Heritage; Industrial Session; International Workshop on eHealth in the Big Data and Deep Learning Era. 2019

5. Speech Based Machine Aided Human Translation for a Document Translation Task [D]. Reddy, Aarthi. 2012

6. Care Consistency with Documented Care Preferences: Methodologic Considerations for Implementing the Measuring What Matters Quality Indicator [O]. Kathleen T. Unroe, Susan E. Hickman, Alexia M. Torke

7. Using Word Embeddings to Enforce Document-Level Lexical Consistency in Machine Translation [O]. Garcia Eva Martínez, Creus Carles, España-Bonet Cristina. 2017
