Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models


Abstract

In this paper, we propose a novel translation model (TM) based cross-lingual data selection model for language model (LM) adaptation in statistical machine translation (SMT), progressing from word models to phrase models. Given a source sentence in the translation task, the model directly estimates the probability that a sentence in the target LM training corpus is similar to it. Compared with traditional approaches that rely on first-pass translation hypotheses, the cross-lingual data selection model avoids the problem of noise proliferation. Furthermore, the phrase TM based cross-lingual data selection model is more effective than traditional approaches based on bag-of-words models and word-based TMs, because it captures contextual information by modeling the selection of each phrase as a whole. Experiments conducted on large-scale data sets demonstrate that our approach significantly outperforms state-of-the-art approaches on both LM perplexity and SMT performance.
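To make the word-level idea in the abstract concrete, the following is a minimal sketch of cross-lingual data selection with a word-based TM. It is not the authors' model: the scoring formula (an IBM Model 1 style average of lexical translation probabilities, length-normalized in log space), the function names `score` and `select_top_k`, the `floor` smoothing constant, and the toy French-English table are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical lexical translation table: (source_word, target_word) -> P(target | source).
LexTable = Dict[Tuple[str, str], float]


def score(source: List[str], target: List[str], lex: LexTable, floor: float = 1e-6) -> float:
    """Length-normalized log-probability that `target` is generated from `source`.

    Assumed IBM Model 1 style scoring: each target word is explained by a
    uniform mixture over the source words. `floor` avoids log(0) for
    target words with no entry in the table.
    """
    if not source or not target:
        return float("-inf")
    log_p = 0.0
    for t in target:
        p = sum(lex.get((s, t), 0.0) for s in source) / len(source)
        log_p += math.log(max(p, floor))
    return log_p / len(target)


def select_top_k(source: List[str], corpus: List[List[str]], lex: LexTable, k: int) -> List[List[str]]:
    """Return the k target-side sentences that score highest for `source`."""
    ranked = sorted(corpus, key=lambda sent: score(source, sent, lex), reverse=True)
    return ranked[:k]


# Toy data (invented for this example).
toy_lex: LexTable = {
    ("maison", "house"): 0.8,
    ("maison", "home"): 0.2,
    ("bleue", "blue"): 0.7,
}
toy_source = ["maison", "bleue"]
toy_corpus = [["stock", "market"], ["the", "house"], ["blue", "house"]]

selected = select_top_k(toy_source, toy_corpus, toy_lex, k=2)
```

The selected sentences would then serve as training data for an adapted LM. A phrase-based variant, as the abstract describes, would score multi-word units as a whole instead of individual words, which this sketch does not attempt.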


Bibliographic details

Source: Conference on Empirical Methods in Natural Language Processing; Conference on Computational Natural Language Learning, 2012, pp. 512-522 (11 pages)
Authors: Shixiang Lu; Wei Wei; Xiaoyin Fu; Bo Xu
Original format: PDF

Similar literature


1. Improving Phrase-Based Statistical Machine Translation Models by Incorporating Syntax-Based Language Models [J] . CHEN Yi-dong, SHI Xiao-dong . Journal of Donghua University (English Edition) . 2010, No. 2

2. Domain Adaptation Based on Mixture of Latent Words Language Models for Automatic Speech Recognition [J] . Ryo MASUMURA, Taichi ASAMI, Takanobu OBA, IEICE transactions on information and systems . 2018, No. 6

3. Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models [C] . Shixiang Lu, Wei Wei, Xiaoyin Fu, MNLP 2012 . 2012

4. Multilingual model using cross-lingual word embeddings based on subword alignment and cross-task projection [D] . Sakuma Jin . 2019

5. Word-level language modeling for P300 spellers based on discriminative graphical models [O] . Jaime F Delgado Saa, Adriana de Pesters, Dennis McFarland,

6. Language model adaptation for ASR of spoken translations using phrase-based translation models and named entity models [O] . Pelemans Joris, Vanallemeersch Tom, Demuynck Kris, 2016

7. Incremental Syntactic Language Models for Phrase-Based Translation [R] . Schwartz, L., Callison-Burch, C., Schuler, W., 2011
