Source: ELMAR 2012 Proceedings

Performance comparison of several techniques to detect keywords in audio streams and audio scene


Abstract

This paper focuses on the task of detecting words of interest in an audio scene (a room, a lab or a workshop) or in a continually recorded stream of speech, music and other sounds. Solving this task is important in many applications, e.g. command control in houses for handicapped persons, automation of some manufacturing and logistical operations, or information retrieval from large audio archives. We investigate the use of three keyword spotting techniques and compare them with a classic large-vocabulary speech recognition system. To evaluate their performance, we specified and studied two model applications: 1) search in a large audio broadcast archive; 2) voice control of an interactive system. The investigated techniques were evaluated from several points of view, namely their speed (real-time factor), accuracy (equal error rate, figure of merit, receiver operating characteristics), their demands for training data, and the impact of different types of noise.
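Two of the metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code: the function names, the threshold sweep, and the toy scores are all assumptions made for illustration. The equal error rate here is computed over per-trial miss and false-alarm rates, which may differ from the exact definition used in the paper.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """RTF = processing time / audio duration; a value below 1.0 is faster than real time."""
    return processing_seconds / audio_seconds


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A detection is accepted when score >= threshold. The function returns the
    mean of the miss rate and the false-alarm rate at the threshold where the
    two rates are closest to each other.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # Miss: a true keyword occurrence scored below the threshold.
        miss = sum(s < threshold for s in target_scores) / len(target_scores)
        # False alarm: a non-keyword segment scored at or above the threshold.
        false_alarm = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (miss + false_alarm) / 2
    return best_eer


# Hypothetical detector scores, invented for this example only.
keyword_scores = [0.9, 0.8, 0.7, 0.3]
other_scores = [0.6, 0.4, 0.2, 0.1]
print(real_time_factor(30.0, 60.0))
print(equal_error_rate(keyword_scores, other_scores))
```

Sweeping only the observed scores keeps the sketch dependency-free. With larger score sets, an interpolated ROC-based estimate would normally give a smoother value.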
