Annual meeting of the Association for Computational Linguistics

Improving cross-domain n-gram language modelling with skipgrams

Abstract

In this paper we improve on the hierarchical Pitman-Yor process language model in a cross-domain setting by adding skipgrams as features. We find that adding skipgram features reduces perplexity. The reduction is substantial when models are trained on a generic corpus and tested on domain-specific corpora. We also find that within-domain testing and cross-domain testing require different backoff strategies. We observe a 30-40% reduction in perplexity in a cross-domain language modelling task, and a reduction of up to 6% in a within-domain experiment, for both English and Flemish-Dutch.
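The abstract does not define how skipgram features are built, so the following is only an illustrative sketch under a stated assumption: a skipgram here is an n-gram window whose first and last tokens are kept and whose interior tokens may be replaced by a placeholder. The function name `skipgrams` and the placeholder `{*}` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def skipgrams(tokens, n, skip_token="{*}"):
    """Enumerate skipgram patterns of length n over a token list.

    Assumption (not from the paper): the first and last token of each
    window are always kept, and every non-empty subset of the interior
    positions is replaced by `skip_token`.
    """
    interior = range(1, n - 1)  # positions that are allowed to be skipped
    patterns = []
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        # Try every non-empty combination of interior positions to skip.
        for k in range(1, len(interior) + 1):
            for skipped in combinations(interior, k):
                pattern = tuple(
                    skip_token if i in skipped else tok
                    for i, tok in enumerate(window)
                )
                patterns.append(pattern)
    return patterns


if __name__ == "__main__":
    for pattern in skipgrams("the cat sat on".split(), 3):
        print(" ".join(pattern))
```

Under this assumption, windows shorter than three tokens yield no skipgrams, because they have no interior position to skip. How such patterns are then weighted or backed off to is a separate modelling choice that this sketch does not cover.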