Conference: International Joint Conference on Natural Language Processing; March 22-24, 2004; Sanya (CN)

Capturing Long Distance Dependency in Language Modeling: An Empirical Study

Abstract

This paper presents an extensive empirical study on two language modeling techniques, linguistically-motivated word skipping and predictive clustering, both of which are used in capturing long distance word dependencies that are beyond the scope of a word trigram model. We compare the techniques to others that were proposed previously for the same purpose. We evaluate the resulting models on the task of Japanese Kana-Kanji conversion. We show that the two techniques, while simple, outperform existing methods studied in this paper, and lead to language models that perform significantly better than a word trigram model. We also investigate how factors such as training corpus size and genre affect the performance of the models.
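
The abstract names predictive clustering without defining it. As a rough illustration only, the sketch below assumes the commonly used formulation in which the probability of a word given its two-word history is split into two factors: the probability of the word's cluster given the history, times the probability of the word given the history and that cluster. The toy corpus, the `cluster_of` map, and all function names are hypothetical and are not taken from the paper; the estimates are plain relative frequencies with no smoothing.

```python
from collections import Counter

# Toy corpus and a hand-made word-to-cluster map (both hypothetical).
corpus = "the cat sat on the mat the dog sat on the rug".split()
cluster_of = {
    "the": "DET", "on": "PREP", "sat": "VERB",
    "cat": "NOUN", "dog": "NOUN", "mat": "NOUN", "rug": "NOUN",
}

history_counts = Counter()  # count(h), h = the two preceding words
word_counts = Counter()     # count(h, w)
cluster_counts = Counter()  # count(h, cluster of w)

for i in range(2, len(corpus)):
    h = (corpus[i - 2], corpus[i - 1])
    w = corpus[i]
    history_counts[h] += 1
    word_counts[h, w] += 1
    cluster_counts[h, cluster_of[w]] += 1


def p_trigram(w, h):
    """Relative-frequency trigram estimate P(w | h)."""
    return word_counts[h, w] / history_counts[h]


def p_predictive(w, h):
    """Clustered estimate P(c(w) | h) * P(w | h, c(w)).

    Unsmoothed, so it raises ZeroDivisionError for unseen histories
    or unseen (history, cluster) pairs.
    """
    c = cluster_of[w]
    p_cluster = cluster_counts[h, c] / history_counts[h]
    p_word = word_counts[h, w] / cluster_counts[h, c]
    return p_cluster * p_word


h = ("on", "the")
print(p_trigram("mat", h), p_predictive("mat", h))
```

With unsmoothed counts the two estimates coincide, because the cluster count cancels out of the product. Any gain from a decomposition of this kind would therefore have to come from smoothing each factor separately, which this sketch deliberately leaves out.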