
Converting Continuous-Space Language Models into N-gram Language Models with Efficient Bilingual Pruning for Statistical Machine Translation


Abstract

The Language Model (LM) is an essential component of Statistical Machine Translation (SMT). In this article, we focus on developing efficient methods for LM construction. Our main contribution is a Natural N-grams based Converting (NNGC) method for transforming a Continuous-Space Language Model (CSLM) into a Back-off N-gram Language Model (BNLM). Furthermore, a Bilingual LM Pruning (BLMP) approach is developed to enhance LMs in SMT decoding and to speed up CSLM conversion. The proposed pruning and converting methods work jointly to convert a large LM efficiently: an LM can be effectively pruned before it is converted from the CSLM without sacrificing performance, and can be further improved if an additional corpus contains out-of-domain information. Across different SMT tasks, our experimental results indicate that the proposed NNGC and BLMP methods significantly outperform their existing counterparts in both BLEU and computational cost.
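
The record gives only the abstract, but the conversion idea it describes can be illustrated concretely. The sketch below, in Python, shows the general shape of converting a continuous-space LM into a back-off N-gram LM: enumerate the n-grams that actually occur in a corpus (the "natural" n-grams the NNGC name suggests), score each with the CSLM, and write the result in ARPA back-off format. Everything here is an illustrative assumption rather than the paper's implementation: cslm_prob is a toy stand-in for a trained neural LM, and the back-off weights are fixed stubs instead of properly renormalized values.

    import math
    from collections import defaultdict

    # Hypothetical stand-in for a trained continuous-space LM (e.g., a
    # feed-forward neural LM); a toy uniform distribution so the sketch runs.
    def cslm_prob(word, history, vocab_size=4):
        return 1.0 / vocab_size

    def convert_to_arpa(corpus, order=2, out_path="converted.arpa"):
        # Collect the "natural" n-grams, i.e., those observed in the data,
        # rather than enumerating the full V^n space.
        ngrams = defaultdict(set)
        for sent in corpus:
            tokens = ["<s>"] + sent.split() + ["</s>"]
            for n in range(1, order + 1):
                for i in range(len(tokens) - n + 1):
                    ngrams[n].add(tuple(tokens[i:i + n]))
        # Rescore each n-gram with the CSLM and emit an ARPA back-off model.
        with open(out_path, "w") as f:
            f.write("\\data\\\n")
            for n in range(1, order + 1):
                f.write(f"ngram {n}={len(ngrams[n])}\n")
            for n in range(1, order + 1):
                f.write(f"\n\\{n}-grams:\n")
                for gram in sorted(ngrams[n]):
                    logp = math.log10(cslm_prob(gram[-1], gram[:-1]))
                    # Stub back-off weight of 0.0 for non-highest orders;
                    # a real converter would renormalize these.
                    backoff = "\t0.0" if n < order else ""
                    f.write(f"{logp:.6f}\t{' '.join(gram)}{backoff}\n")
            f.write("\n\\end\\\n")

    convert_to_arpa(["the cat sat", "the dog sat"])

A real converter in the spirit of the abstract would also prune this n-gram set (for example, with a bilingual criterion) before scoring, which is what makes converting a large CSLM tractable.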

Bibliographic Details

  • Source: ACM Transactions on Asian Language Information Processing
  • Author Affiliations

    Center for Brain-like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China, 200240;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    NHK and National Institute of Information and Communications Technology, NHK Science & Technology Research Laboratories, 1-10-11 Kinuta, Setagaya-ku, Tokyo 157-8510, Japan;

    Multilingual Translation Laboratory, National Institute of Information and Communications Technology, 3-5 Hikaridai, Keihanna Science City, Kyoto 619-0289, Japan;

    Center for Brain-like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China, 200240;

    Center for Brain-like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, 800 Dongchuan Road, Shanghai, China, 200240;

  • Indexing Information
  • Original Format: PDF
  • Language: English
  • CLC Classification
  • Keywords

    Machine translation; continuous-space language model; neural network language model; language model pruning

  • Date Added: 2022-08-17 13:41:10
