Conference: Annual Conference of the International Speech Communication Association; INTERSPEECH 2011
Zero-resource audio-only spoken term detection based on a combination of template matching techniques

Abstract

Spoken term detection is a well-known information retrieval task that seeks to extract contentful information from audio by locating occurrences of known query words of interest. This paper describes a zero-resource approach to this task based on pattern matching of spoken term queries at the acoustic level. The template matching module comprises a cascade of a segmental variant of dynamic time warping and a self-similarity matrix comparison to further improve robustness to speech variability. This solution notably differs from more traditional train-and-test methods that, while shown to be very accurate, rely upon the availability of large amounts of linguistic resources. We evaluate our framework on different parameterizations of the speech templates: raw MFCC features, Gaussian posteriorgrams, and French and English phonetic posteriorgrams output by two different state-of-the-art phoneme recognizers.
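The two-stage cascade named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The first stage uses plain subsequence DTW, a simpler relative of the segmental variant the abstract mentions, and the second stage compares self-similarity matrices after a nearest-frame resampling of the matched segment. The function names, the cosine distance, the resampling step, and the mean-absolute-difference score are all assumptions made for illustration. Feature frames are assumed to be non-zero vectors such as MFCCs or posteriorgrams.

```python
import numpy as np


def cosine_distance_matrix(a, b):
    """Pairwise cosine distance between rows of a (n, d) and b (m, d)."""
    a_n = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_n = b / np.linalg.norm(b, axis=1, keepdims=True)
    # Clip to remove tiny negative values caused by floating-point rounding.
    return np.clip(1.0 - a_n @ b_n.T, 0.0, 2.0)


def subsequence_dtw(query, utterance):
    """Find the best match of `query` anywhere inside `utterance`.

    Returns (length-normalised cost, start frame, end frame), end inclusive.
    """
    dist = cosine_distance_matrix(query, utterance)
    n, m = dist.shape
    acc = np.full((n, m), np.inf)        # accumulated path cost
    start = np.zeros((n, m), dtype=int)  # utterance frame where each path began
    acc[0, :] = dist[0, :]               # a match may start at any frame
    start[0, :] = np.arange(m)
    for i in range(1, n):
        acc[i, 0] = acc[i - 1, 0] + dist[i, 0]
        for j in range(1, m):
            steps = ((i - 1, j - 1), (i - 1, j), (i, j - 1))
            costs = [acc[s] for s in steps]
            k = int(np.argmin(costs))
            acc[i, j] = dist[i, j] + costs[k]
            start[i, j] = start[steps[k]]
    end = int(np.argmin(acc[-1, :]))     # a match may end at any frame
    return float(acc[-1, end] / n), int(start[-1, end]), end


def self_similarity(x):
    """Cosine self-similarity matrix of a feature sequence x (n, d)."""
    return 1.0 - cosine_distance_matrix(x, x)


def ssm_distance(query, segment):
    """Mean absolute difference between the two self-similarity matrices.

    The segment is resampled to the query length by nearest-frame picking
    (an assumption) so that both matrices have the same shape.
    """
    idx = np.linspace(0, len(segment) - 1, num=len(query)).round().astype(int)
    diff = self_similarity(query) - self_similarity(segment[idx])
    return float(np.mean(np.abs(diff)))


def detect(query, utterance):
    """Cascade: DTW proposes a segment, the SSM comparison rescoring it."""
    cost, s, e = subsequence_dtw(query, utterance)
    return cost, s, e, ssm_distance(query, utterance[s:e + 1])
```

In a cascade like this, the DTW stage proposes a candidate segment cheaply and the self-similarity score acts as a second check. A self-similarity matrix depends on the internal structure of a segment rather than on its absolute feature values, which is one plausible reason such a comparison could help with speaker and channel variability.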
