Keyword Spotting in Continuous Speech Using Spectral and Prosodic Information Fusion

Pandey Laxmi; Hegde Rajesh M.

Abstract

Keyword spotting in continuous speech is a challenging problem with relevance to applications such as audio indexing and music retrieval. In this work, the problem of keyword spotting is addressed by utilizing the complementary information present in the spectral and prosodic features of the speech signal. A thorough analysis of this complementary information is performed on a large Hindi language database developed for this purpose. Phonetic and prosodic distribution analysis is performed toward this end, using canonical correlation and the Student t-distance function. Motivated by these analyses, novel methods for spectral and prosodic information fusion that optimize a combined error function are proposed. The fusion methods are developed at both the feature level and the model level. Compared with conventional keyword spotting methods, these methods yield improved syllable sequence prediction and keyword spotting performance. Additionally, to enable comparison with state-of-the-art deep learning-based methods, a novel method for improved syllable sequence prediction using deep denoising autoencoders is proposed. The proposed methods are evaluated for keyword spotting using a syllable sliding protocol over a large Hindi database. The experimental results show reasonable performance improvements in syllable sequence prediction, keyword spotting, and audio retrieval.

Bibliographic details

Source
Circuits, Systems, and Signal Processing | 2019, Issue 6 | pp. 2767-2791 | 25 pages
Authors
Pandey Laxmi; Hegde Rajesh M.
Author affiliations

Indian Inst Technol Kanpur, Dept Elect Engn, Kanpur, Uttar Pradesh, India;

Indian Inst Technol Kanpur, Dept Elect Engn, Kanpur, Uttar Pradesh, India;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Deep denoising autoencoder; Keyword spotting; Hidden Markov models; Deep neural network; Speech recognition

Similar literature

1. Keyword Spotting in Continuous Speech Using Spectral and Prosodic Information Fusion [J]. Pandey Laxmi, Hegde Rajesh M. Circuits, Systems, and Signal Processing. 2019, Issue 6.
2. Improving the performance of keyword spotting system for children's speech through prosody modification [J]. Shahnawazuddin S., Maity Karabi, Pradhan Gayadhar. Digital Signal Processing. 2019.
3. A Russian Keyword Spotting System Based on Large Vocabulary Continuous Speech Recognition and Linguistic Knowledge [J]. Valentin Smirnov, Dmitry Ignatov, Michael Gusev. Journal of Electrical and Computer Engineering. 2016, Issue PTa2.
4. Multi-Keyword Spotting of Telephone Speech Using Orthogonal Transform-Based SBR and RNN Prosodic Model [C]. Wern-Jun Wang, Chun-Jen Lee, Eng-Fong Huang. European Conference on Speech Communication and Technology. 2001.
5. Keyword spotting using a fusion of spectral, cepstral and AM-FM modulation features [D]. Chu, Tao. 2010.
6. Children's Recognition of Emotional Prosody in Spectrally-Degraded Speech is Predicted by Their Age and Cognitive Status [O]. Anna R Tinnemore, Danielle J Zion, Aditya M Kulkarni.
7. Prototypical Metric Transfer Learning for Continuous Speech Keyword Spotting with Limited Training Data [O]. Harshita Seth, Pulkit Kumar, Muktabh Mayank Srivastava. 2019.
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015.
