We address the problem of retrieving spoken information from noisy and heterogeneous audio archives using system combination with a rich and diverse set of noise-robust modules. Audio search applications to date have focused on constrained domains or genres and on relatively clean, homogeneous acoustic and channel conditions. In this paper, our focus is to improve the accuracy of a keyword spotting system under highly degraded and diverse channel conditions by employing multiple recognition systems in parallel with different robust frontends and modeling choices, as well as different representations during audio indexing and search (words vs. subword units). After aligning keyword hits from the different systems, we perform system combination at the score level using a logistic-regression-based classifier. Side information, such as the output of an acoustic condition identification module, is used to guide the system combination classifier, which is trained on a held-out dataset. Lattice-based indexing and search is used in all keyword spotting systems. We present improvements in miss probability at a fixed false-alarm probability by applying our proposed rich system combination approach to DARPA Robust Automatic Transcription of Speech (RATS) Phase I evaluation data, which contains highly degraded channel recordings (signal-to-noise ratios as low as 0 dB) with diverse channel characteristics.
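The score-level combination described above can be illustrated with a minimal sketch: detection scores from several systems for an aligned keyword hit, concatenated with one-hot side information (e.g., an acoustic condition ID), are fed to a logistic-regression classifier that outputs a combined confidence. All data, feature dimensions, and variable names below are synthetic and illustrative, not taken from the paper's actual systems.

```python
# Hedged sketch of score-level system combination with logistic regression.
# Scores, labels, and the side-information scheme are synthetic assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 400  # number of aligned keyword hits (synthetic)

# Detection scores from three hypothetical KWS systems for each aligned hit.
scores = rng.random((n, 3))

# Side information: a one-hot acoustic condition ID (2 conditions assumed here).
condition = rng.integers(0, 2, size=n)
side = np.eye(2)[condition]

# Synthetic ground truth: 1 = true keyword hit, 0 = false alarm.
labels = (scores.mean(axis=1) + 0.1 * rng.standard_normal(n) > 0.5).astype(int)

# Fused feature vector per hit: system scores plus side information.
X = np.hstack([scores, side])

# In practice the classifier would be trained on a held-out dataset;
# here we fit and score on the same synthetic data for brevity.
clf = LogisticRegression().fit(X, labels)
combined = clf.predict_proba(X)[:, 1]  # combined detection score in [0, 1]
```

The combined score can then be thresholded to trade off misses against false alarms, with the side-information features letting the classifier reweight systems per acoustic condition.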