Cross-Lingual Lexical Triggers in Statistical Language Modeling

Abstract

We propose new methods to take advantage of text in resource-rich languages to sharpen statistical language models in resource-deficient languages. We achieve this through an extension of the method of lexical triggers to the cross-language problem, and by developing a likelihood-based adaptation scheme for combining a trigger model with an N-gram model. We describe the application of such language models for automatic speech recognition. By exploiting a side-corpus of contemporaneous English news articles for adapting a static Chinese language model to transcribe Mandarin news stories, we demonstrate significant reductions in both perplexity and recognition errors. We also compare our cross-lingual adaptation scheme to monolingual language model adaptation, and to an alternate method for exploiting cross-lingual cues, via cross-lingual information retrieval and machine translation, proposed elsewhere.
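The abstract does not spell out the model, so the following is only a minimal sketch of one plausible reading: a trigger-based unigram distribution is built from an aligned English document, linearly interpolated with a static N-gram probability, and the interpolation weight is picked by maximizing likelihood on some adaptation text. The function names, the translation table `trans_prob`, the grid search, and the toy tokens are all assumptions made for illustration, not the authors' implementation.

```python
import math
from collections import Counter


def trigger_unigram(english_doc, trans_prob):
    """Assumed form: P_trig(c | d_E) = sum_e P_T(c | e) * P(e | d_E).

    english_doc: list of English tokens from the side-corpus document.
    trans_prob:  hypothetical table {english_word: {chinese_word: P_T(c | e)}}.
    """
    counts = Counter(english_doc)
    total = sum(counts.values())
    p_trig = {}
    for e, n in counts.items():
        # English words with no table entry contribute nothing.
        for c, p_c_given_e in trans_prob.get(e, {}).items():
            p_trig[c] = p_trig.get(c, 0.0) + p_c_given_e * n / total
    return p_trig


def interpolate(p_trig, p_ngram, lam):
    """Linear interpolation of a trigger probability with an N-gram probability."""
    return lam * p_trig + (1.0 - lam) * p_ngram


def pick_lambda(pairs, grid):
    """Choose the weight in `grid` that maximizes log-likelihood.

    pairs: list of (p_trig, p_ngram) for each token of the adaptation text.
    grid:  candidate weights strictly between 0 and 1, so that the mixture
           stays positive whenever p_ngram > 0.
    """
    def loglik(lam):
        return sum(math.log(interpolate(pt, pn, lam)) for pt, pn in pairs)

    return max(grid, key=loglik)


# Toy data; the romanized tokens stand in for Chinese words.
trans_prob = {
    "election": {"xuanju": 0.8, "toupiao": 0.2},
    "vote": {"toupiao": 1.0},
}
doc = ["election", "vote", "election", "the"]
p_trig = trigger_unigram(doc, trans_prob)

grid = [i / 10 for i in range(1, 10)]
best = pick_lambda([(0.4, 0.01), (0.35, 0.02)], grid)
```

In this toy case the trigger model assigns far more mass to the observed tokens than the N-gram model does, so the likelihood criterion pushes the weight to the top of the grid. When the aligned English text is uninformative, the same criterion pulls the weight toward the static N-gram model instead.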
