Annual Conference of the International Speech Communication Association (INTERSPEECH 2011)

A Template Based Voice Trigger System Using Bhattacharyya Edit Distance

Abstract

Dynamic Time Warping (DTW) is frequently used in isolated word recognition systems because of its simplicity and robustness to noise. However, the computational effort required by a DTW-based solution is proportional to the number of words registered in the system. Vector Quantization (VQ) is employed to alleviate this by converting the spoken input into a sequence of discrete symbols to be matched against the stored word templates. In this paper, we propose the use of the Bhattacharyya distance as the cost function for this pattern matching problem. Each template is a string of discrete symbols, each modeled by a Gaussian Mixture Model (GMM) representing a context-dependent sub-word unit. The system is tested on a 100-template matching task built from two registrations of 50 cable TV channel names, simulating a voice-triggered remote control. An average accuracy of 92% is obtained. A scheme is also proposed that enables guest users without registration data to use the system efficiently.
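
The idea of using the Bhattacharyya distance as the cost inside an edit distance can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formulation, so several simplifications are assumed here. Each symbol is modeled by a single diagonal-covariance Gaussian rather than a GMM (the Bhattacharyya distance between Gaussians has a closed form, whereas between GMMs it generally does not), insertions and deletions carry a fixed cost `indel_cost`, and the names `bhattacharyya_diag`, `weighted_edit_distance`, and `best_template` are hypothetical.

```python
import math


def bhattacharyya_diag(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    Assumption: one Gaussian per symbol stands in for the paper's GMMs.
    """
    dist = 0.0
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        v = 0.5 * (va + vb)  # averaged variance for this dimension
        dist += 0.125 * (ma - mb) ** 2 / v
        dist += 0.5 * math.log(v / math.sqrt(va * vb))
    return dist


def weighted_edit_distance(seq, template, models, indel_cost=1.0):
    """Edit distance whose substitution cost is the Bhattacharyya distance
    between the Gaussian models of the two symbols.

    `models` maps symbol -> (mean, variance); `indel_cost` is an assumed
    constant, not a value taken from the paper.
    """
    rows, cols = len(seq) + 1, len(template) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * indel_cost
    for j in range(1, cols):
        d[0][j] = j * indel_cost
    for i in range(1, rows):
        for j in range(1, cols):
            sub = bhattacharyya_diag(*models[seq[i - 1]],
                                     *models[template[j - 1]])
            d[i][j] = min(d[i - 1][j] + indel_cost,      # deletion
                          d[i][j - 1] + indel_cost,      # insertion
                          d[i - 1][j - 1] + sub)         # substitution
    return d[-1][-1]


def best_template(seq, templates, models, indel_cost=1.0):
    """Return the name of the stored template closest to the input sequence."""
    return min(templates,
               key=lambda name: weighted_edit_distance(
                   seq, templates[name], models, indel_cost))
```

With this cost, replacing a symbol by an acoustically similar one is cheap while replacing it by a dissimilar one is expensive, so the match degrades gracefully when the quantizer confuses neighboring sub-word units.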
