Conference: Annual Meeting of the Association for Computational Linguistics

Can You Repeat That? Using Word Repetition to Improve Spoken Term Detection

Abstract

We aim to improve spoken term detection performance by incorporating contextual information beyond traditional N-gram language models. Rather than taking a broad view of topic context in spoken documents, the variability of word co-occurrence statistics across corpora leads us to focus on the phenomenon of word repetition within single documents. We show that, given the detection of one instance of a term, we are more likely to find additional instances of that term in the same document. We leverage this burstiness of keywords by taking the most confident keyword hypothesis in each document and interpolating with lower-scoring hits. We then develop a principled approach to select interpolation weights using only the ASR training data. Using this re-weighting approach, we demonstrate consistent improvement in term detection performance across all five languages in the BABEL program.
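
The re-weighting idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formula, so the linear interpolation below, the function name `rescore_hits`, the `(doc_id, term, score)` hit layout, and the fixed weight `alpha` are all assumptions made for illustration; in particular, the paper selects its interpolation weights from ASR training data rather than fixing them by hand.

```python
from collections import defaultdict


def rescore_hits(hits, alpha=0.3):
    """Boost lower-scoring hits using the most confident hit of the same
    term in the same document.

    hits  : list of (doc_id, term, score) tuples, score assumed in [0, 1]
    alpha : hypothetical interpolation weight in [0, 1]
    """
    # Find the most confident score for each (document, term) pair.
    best = defaultdict(float)
    for doc_id, term, score in hits:
        key = (doc_id, term)
        best[key] = max(best[key], score)

    # Linearly interpolate each hit with the best score of its pair.
    # The top hit itself is unchanged, as is a term seen only once.
    return [
        (doc_id, term, (1 - alpha) * score + alpha * best[(doc_id, term)])
        for doc_id, term, score in hits
    ]


hits = [
    ("doc1", "babel", 0.9),  # confident detection
    ("doc1", "babel", 0.2),  # weak repeat in the same document
    ("doc2", "babel", 0.2),  # weak, isolated detection
]
for hit in rescore_hits(hits):
    print(hit)
```

With the default weight, the weak hit in `doc1` rises to about 0.41 because a confident instance of the same term occurs in that document, while the isolated weak hit in `doc2` keeps its score of about 0.2.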
