Statistical Machine Translation based on LDA

Abstract

Current Statistical Machine Translation (SMT) systems translate one sentence at a time and ignore document-level information. Consequently, translation models are learned only at the sentence level, and document context is generally overlooked. In this paper, we introduce document topics to help an SMT system produce target sentences. First, the parallel training corpus, which has underlying document boundaries, is segmented into multiple documents, and a monolingual LDA model is used to determine which topics these documents belong to. Next, the background phrase table is enhanced with the probability distribution of a document over topics. Evaluation shows that the proposed approach significantly improves the BLEU score on Chinese-to-English machine translation.
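
The abstract does not say how the document-topic distribution is combined with the background phrase table. As a minimal sketch of one possible reading, the example below assumes a mixture: topic-specific translation probabilities p(e|f,z) are weighted by the document's topic distribution p(z|d), and the result is linearly interpolated with the background probability p(e|f). The function name `topic_adapted_prob`, the interpolation weight `lam`, the table layout, and all numbers are hypothetical and are not taken from the paper; p(z|d) is assumed to come from an LDA model trained separately.

```python
from typing import Dict

# Hypothetical layout: a phrase table maps a source phrase to
# {target phrase: probability}. None of these names come from the paper.
PhraseTable = Dict[str, Dict[str, float]]


def topic_adapted_prob(
    source: str,
    target: str,
    background: PhraseTable,
    per_topic: Dict[int, PhraseTable],
    doc_topics: Dict[int, float],
    lam: float = 0.5,
) -> float:
    """Return lam * p(e|f) + (1 - lam) * sum_z p(e|f,z) * p(z|d).

    This mixture is an assumption made for illustration only.
    """
    # Background (topic-independent) translation probability.
    p_bg = background.get(source, {}).get(target, 0.0)
    # Topic-specific probabilities weighted by the document's topic mix.
    p_topic = sum(
        weight * per_topic.get(z, {}).get(source, {}).get(target, 0.0)
        for z, weight in doc_topics.items()
    )
    return lam * p_bg + (1.0 - lam) * p_topic


# Toy data: one source phrase "f" with two candidate translations.
background = {"f": {"e1": 0.7, "e2": 0.3}}
per_topic = {
    0: {"f": {"e1": 0.95, "e2": 0.05}},  # assumed topic 0
    1: {"f": {"e1": 0.40, "e2": 0.60}},  # assumed topic 1
}
doc_topics = {0: 0.8, 1: 0.2}  # p(z|d), as an LDA model might infer it

score_e1 = topic_adapted_prob("f", "e1", background, per_topic, doc_topics)
score_e2 = topic_adapted_prob("f", "e2", background, per_topic, doc_topics)
print(round(score_e1, 3), round(score_e2, 3))
```

Because every input distribution sums to one, the adapted scores for `e1` and `e2` also sum to one. A document dominated by topic 0 shifts probability mass toward `e1` relative to the background table.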

Bibliographic information

Source
Proceedings of 2010 4th International Universal Communication Symposium | 2010 | pp. 286-290 | 5 pages
Original format: PDF
Chinese Library Classification: Communication systems (transmission systems)
Keywords
Adaptation; Document; LDA; SMT;

Similar literature

1. Integrating Rules and Dictionaries from Shallow-Transfer Machine Translation into Phrase-Based Statistical Machine Translation [J] . Pérez-Ortiz Juan Antonio, Sá, The Journal of Artificial Intelligence Research . 2016, No. 12

2. Integrating Rules and Dictionaries from Shallow-Transfer Machine Translation into Phrase-Based Statistical Machine Translation [J] . Sanchez-Cartagena Victor M., Antonio Perez-Ortiz Juan, Sanchez-Martinez Felipe. The Journal of Artificial Intelligence Research . 2016

3. Multiple Translation-Engine-based Hypotheses and Edit-Distance-based Rescoring for a Greedy Decoder for Statistical Machine Translation [J] . MICHAEL PAUL, EIICHIRO SUMITA, SEIICHI YAMAMOTO. 情報処理学会論文誌 . 2005, No. 11

4. Statistical Machine Translation based on LDA [C] . International Universal Communication Symposium . 2010

5. Statistical machine translation: Maximum entropy based translation models and search algorithms. [D] . Garcia Varea, Ismael. 2003

6. Statistical-based system combination approach to gain advantages over different machine translation systems [O] . Debajyoty Banik, Asif Ekbal, Pushpak Bhattacharyya, 2019

7. Machine translation: a critical look at the performance of rule-based and statistical machine translation [O] . Brita Banitz 2020

