Conference: European Conference on Speech Communication and Technology - EUROSPEECH 2003 (INTERSPEECH 2003), vol. 2; September 1-4, 2003; Geneva (CH)

AN EFFICIENT KEYWORD SPOTTING TECHNIQUE USING A COMPLEMENTARY LANGUAGE FOR FILLER MODELS TRAINING

Abstract

The task of keyword spotting is to detect a set of keywords in continuous input speech. A keyword spotter must model not only the keywords but also the non-keyword intervals. For this purpose, filler (or garbage) models are used. To date, most keyword spotters have been based on hidden Markov models (HMMs). More specifically, a set of HMMs is used as garbage models. In this paper, a two-pass keyword spotting technique based on bilingual hidden Markov models is presented. In the first pass, our technique uses phonemic garbage models to represent the non-keyword intervals, and in the second pass the putative hits are verified using normalized scores. The main difference from similar approaches lies in the way the non-keyword intervals are modeled. In this work, the target language is Japanese, and English was chosen as the 'garbage' language for training the phonemic garbage models. Experimental results on both clean and noisy telephone speech data showed higher performance than using a common set of acoustic models. Moreover, parameter tuning (e.g. word insertion penalty tuning) does not have a serious effect on the performance. For a vocabulary of 100 keywords and using clean telephone speech test data, we achieved a 92.04% recognition rate with only a 7.96% false alarm rate, without word insertion penalty tuning. Using noisy telephone speech test data, we achieved an 87.29% recognition rate with only a 12.71% false alarm rate.
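The abstract does not state how the second-pass scores are normalized. As a minimal sketch only, the following assumes one common choice: the log-likelihood of a putative hit under the keyword model, minus the log-likelihood of the same segment under the filler models, divided by the segment length in frames. The names `Hit`, `normalized_score`, `verify_hits`, and the threshold value are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """A putative keyword hit produced by a first decoding pass (hypothetical structure)."""
    keyword: str
    start_frame: int       # inclusive
    end_frame: int         # exclusive
    keyword_loglik: float  # log-likelihood of the segment under the keyword model
    filler_loglik: float   # log-likelihood of the same segment under the filler models


def normalized_score(hit: Hit) -> float:
    """Frame-normalized log-likelihood ratio (assumed normalization, not the paper's)."""
    n_frames = hit.end_frame - hit.start_frame
    if n_frames <= 0:
        raise ValueError("hit must span at least one frame")
    return (hit.keyword_loglik - hit.filler_loglik) / n_frames


def verify_hits(hits: List[Hit], threshold: float) -> List[Hit]:
    """Second pass: keep only hits whose normalized score reaches the threshold."""
    return [h for h in hits if normalized_score(h) >= threshold]


if __name__ == "__main__":
    hits = [
        Hit("tokyo", 10, 60, keyword_loglik=-300.0, filler_loglik=-350.0),
        Hit("osaka", 80, 100, keyword_loglik=-200.0, filler_loglik=-190.0),
    ]
    accepted = verify_hits(hits, threshold=0.0)
    print([h.keyword for h in accepted])
```

Dividing by the segment length keeps long and short keywords comparable under a single threshold; in practice the threshold would be tuned on held-out data.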
