Conference: IEEE International Conference on Acoustics, Speech, and Signal Processing

USE OF STATISTICAL N-GRAM MODELS IN NATURAL LANGUAGE GENERATION FOR MACHINE TRANSLATION

Abstract

This paper describes several language modeling issues in a speech-to-speech translation system. First, the language models for the speech recognizer need to be adapted to the specific domain to improve recognition performance on in-domain utterances while keeping domain coverage as broad as possible. Second, when a maximum-entropy-based statistical natural language generation model is used to generate the target-language sentence as the translation output, serious inflection and synonym issues arise, because a compromise solution is used in the semantic representation to avoid the data sparseness problem. We use N-gram models as a postprocessing step to enhance generation performance. When an interpolated language model is applied to a Chinese-to-English translation task, translation performance, measured by the objective BLEU metric, improves substantially from 0.318 to 0.514 when the correct transcription is used as input. Similarly, the BLEU score improves from 0.194 to 0.300 on the same task when the input is speech data.
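
To make the postprocessing idea concrete, the following is a minimal sketch of how an interpolated N-gram model could rescore candidate output sentences that differ only in inflection. It is not the authors' implementation: the abstract does not state the N-gram order, the smoothing method, the interpolation weight, or the training data, so the bigram order, the add-one smoothing, the weight `lam=0.7`, the toy corpora, and the names `BigramLM`, `interpolated_logprob`, and `rerank` are all assumptions made for illustration.

```python
import math
from collections import Counter


class BigramLM:
    """Add-one smoothed bigram model over whitespace tokens (illustrative only)."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often each token appears as a context
        self.bigram_counts = Counter()   # how often each (previous, current) pair appears
        self.vocab = set()
        for sentence in sentences:
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.context_counts[prev] += 1
                self.bigram_counts[(prev, cur)] += 1

    def prob(self, prev, cur, vocab_size):
        # Add-one (Laplace) smoothing over a shared vocabulary size.
        return (self.bigram_counts[(prev, cur)] + 1) / (self.context_counts[prev] + vocab_size)


def interpolated_logprob(sentence, in_domain, general, lam=0.7):
    """Log-probability under lam * P_in_domain + (1 - lam) * P_general."""
    # Shared vocabulary: union of both models plus one slot for unseen words.
    vocab_size = len(in_domain.vocab | general.vocab) + 1
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        p = (lam * in_domain.prob(prev, cur, vocab_size)
             + (1 - lam) * general.prob(prev, cur, vocab_size))
        total += math.log(p)
    return total


def rerank(candidates, in_domain, general, lam=0.7):
    """Return the candidate with the highest interpolated log-probability."""
    return max(candidates, key=lambda c: interpolated_logprob(c, in_domain, general, lam))


# Toy corpora (hypothetical data, not from the paper).
in_domain = BigramLM([
    "he goes to the hospital",
    "she goes to the clinic",
    "the doctor goes home",
])
general = BigramLM([
    "they go to the market",
    "he goes to work",
    "we go home",
])

# Two generation candidates that differ only in verb inflection.
candidates = ["he go to the hospital", "he goes to the hospital"]
best = rerank(candidates, in_domain, general)
print(best)  # he goes to the hospital
```

Both candidates have the same length here, so raw log-probabilities are directly comparable; candidates of different lengths would likely need some form of length normalization, which this sketch leaves out.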
