Venue: Workshop on Balto-Slavic natural language processing; Annual meeting of the Association for Computational Linguistics

Creating a Corpus for Russian Data-to-Text Generation Using Neural Machine Translation and Post-Editing

Abstract

In this paper, we propose an approach for semi-automatically creating a data-to-text (D2T) corpus for Russian that can be used to learn a D2T natural language generation model. An error analysis of the output of an English-to-Russian neural machine translation system shows that 80% of the automatically translated sentences contain an error and that 53% of all translation errors bear on named entities (NEs). We therefore focus on named entities and introduce two post-editing techniques for correcting wrongly translated NEs.