Source: ACM Workshop on Searching Spontaneous Conversational Speech 2010

Direct Posterior Confidence for Out-of-Vocabulary Spoken Term Detection


Abstract

Spoken term detection (STD) is a fundamental task in spoken information retrieval. Compared to conventional speech transcription and keyword spotting, STD is an open-vocabulary task and must therefore address out-of-vocabulary (OOV) terms. Approaches based on subword units, e.g. phonemes, are widely used to solve the OOV issue; however, performance on OOV terms is still significantly inferior to that on in-vocabulary (INV) terms. This degradation can be attributed to a multitude of factors. The particular factor we address in this paper is that the acoustic and language models used for speech transcription are highly vulnerable to OOV terms, which leads to unreliable confidence measures and error-prone detections. A direct posterior confidence measure derived from discriminative models has been proposed for STD. In this paper, we use this technique to tackle the weakness of confidence estimation on OOV terms. Because neither acoustic models nor language models are included in its computation, the new confidence measure avoids the weak modeling problem with OOV terms. Our experiments, set up on multi-party meeting speech, which is highly spontaneous and conversational, demonstrate that the proposed technique significantly improves STD performance on OOV terms; when it is combined with conventional lattice-based confidence, a significant improvement is obtained on both INV and OOV terms. Furthermore, the new confidence measure can be combined with other advanced techniques for OOV treatment, such as stochastic pronunciation modeling and term-dependent confidence discrimination, which leads to an integrated solution for OOV STD with greatly improved performance.
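The abstract does not say how the direct posterior confidence is combined with the lattice-based confidence. As a minimal sketch only, the Python below assumes a simple linear interpolation of the two scores followed by a decision threshold. The `Detection` fields, the `weight` and `threshold` parameters, and the function names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One putative occurrence of a search term (hypothetical structure)."""
    term: str
    start_time: float    # seconds into the recording
    lattice_conf: float  # lattice-based posterior in [0, 1] (assumed given)
    direct_conf: float   # discriminative-model posterior in [0, 1] (assumed given)


def combine_confidence(lattice_conf: float, direct_conf: float,
                       weight: float = 0.5) -> float:
    """Linearly interpolate the two scores.

    This fusion rule is an assumption made for illustration; the paper
    may combine the confidences differently.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * direct_conf + (1.0 - weight) * lattice_conf


def accept_detections(detections: List[Detection], threshold: float = 0.5,
                      weight: float = 0.5) -> List[Detection]:
    """Keep detections whose combined confidence reaches the threshold."""
    return [
        d for d in detections
        if combine_confidence(d.lattice_conf, d.direct_conf, weight) >= threshold
    ]
```

With `weight=1.0` the decision relies on the direct posterior alone, and with `weight=0.0` on the lattice-based score alone. In practice the weight and threshold would be tuned on held-out data.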
