Same Domain, Different Discourse Style: A Case Study on Language Resources for Data-driven Machine Translation

Abstract

Data-driven machine translation (MT) approaches have become very popular in recent years, especially for language pairs for which it is difficult to find specialists to develop transfer rules. Statistical (SMT) or example-based (EBMT) systems can provide reasonable translation quality for assimilation purposes, as long as a large amount of training data is available. SMT systems in particular rely on aligned parallel corpora, which have to be statistically relevant for the given language pair. The construction of large domain-specific parallel corpora is time-consuming and costly; current practice relies on one or two such large corpora per language pair. Recently developed strategies ensure a certain portability to other domains through specialized lexicons or small domain-specific corpora. In this paper we discuss the influence of different discourse styles on statistical machine translation systems. We investigate how a pure SMT system performs when training and test data belong to the same domain but the discourse style varies.

Bibliographic details

Source
International Conference on Language Resources and Evaluation | 2012 | pp. 3441-3446 | 6 pages
Authors
Monica Gavrila; Walther v. Hahn; Cristina Vertan
Original format: PDF
Keywords
Statistical machine translation; Discourse style; Evaluation of machine translation; Linguistic analysis of training data; Moses

Similar literature

1. Finding Translation Examples for Under-Resourced Language Pairs or for Narrow Domains; the Case for Machine Translation [J]. Dan Tufis. Computer Science Journal of Moldova. 2012, No. 2
2. Improving Statistical Machine Translation for a Resource-Poor Language Using Related Resource-Rich Languages [J]. Nakov P., Ng H. T. The Journal of Artificial Intelligence Research. 2012, No. 4
3. Improving Statistical Machine Translation for a Resource-Poor Language Using Related Resource-Rich Languages [J]. Preslav Nakov, Hwee Tou Ng. The Journal of Artificial Intelligence Research. 2012
4. Same Domain Different Discourse Style A Case Study on Language Resources for Data-driven Machine Translation [C]. Monica Gavrila, Walther Hahn, Cristina Vertan. LREC-2012. 2012
5. Style and interpretation: A comparison of target languages concerning the verbal aspect with the aid of Bible translations. Can new perspectives be found for the science of translation by comparative studies of target languages? [D]. Wirt, Heinz Peter. 1998
6. Machine Translation-Supported Cross-Language Information Retrieval for a Consumer Health Resource [O]. Graciela Rosemblat, Darren Gemoets, Allen C. Browne. 2003
7. Adaptation in Statistical Machine Translation for Low-resource Domains in English-Vietnamese Language [O]. Nghia-Luan Pham, Van-Vinh Nguyen. 2020
8. Improving Domain-specific Machine Translation by Constraining the Language Model [R]. Micher, J. C. 2012
