Noise Robust Automatic Speech Recognition Based on Spectro-Temporal Techniques

Abstract

Speech technology today has a wide variety of existing and potential applications in many areas of our lives, from dictation systems to voice translation, and from digital assistants like Siri, Google Now, and Cortana to telephone dialogue systems. Many of these applications rely on an Automatic Speech Recognition (ASR) component. This component not only has to perform well, but also has to perform well in adverse environments. After all, a dictation system that requires us to insulate our office, or a digital assistant that cannot work in traffic or in a room full of chatting people, is not very helpful. For this reason, noise robust ASR has been a topic of intensive research. Yet human-equivalent performance has not been achieved. This has motivated many to search for ways to improve the robustness of automatic speech recognition based on human speech perception. One popular method, inspired by the examination of the receptive fields of auditory neurons, is spectro-temporal processing.

In spectro-temporal processing, the aim is to capture the spectral and temporal modulations of the signal simultaneously. One simple way to do so is to extract features from spectro-temporal patches and then use them in the same manner as traditional features like MFCCs. There is more than one way to bake a cake, however, and in this case that is true twice over. For one, there are various ways to extract features from the patches. For another, there are more sophisticated ways to incorporate the concept of spectro-temporal processing into a speech recognition system. In this study we examine many such methods, some simpler, some more sophisticated, but all stemming from the same basic idea. By the end of this study we will demonstrate that these methods can indeed lead to more robust speech recognition, to the point of providing results that are competitive with the state of the art.

Bibliographic details

Author
Kovács György
Author affiliation
Year 100
Total pages
Original format PDF
Language en
CLC classification

Similar literature

1. Novel Variations of Group Sparse Regularization Techniques With Applications to Noise Robust Automatic Speech Recognition [J]. Qun Feng Tan, Narayanan S.S. IEEE Transactions on Audio, Speech, and Language Processing. 2012, No. 4
2. Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition [J]. Shimada Kazuki, Bando Yoshiaki, Mimura Masato. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2019, No. 5
3. Modelling spectro-temporal dynamics in factorisation-based noise-robust automatic speech recognition [C]. Hurmalainen, Antti. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). 2012
4. Array-based Spectro-temporal Masking for Automatic Speech Recognition [D]. Moghimi, Amir R. 2014
5. Threshold-Based Noise Detection and Reduction for Automatic Speech Recognition System in Human-Robot Interactions [O]. Sheng-Chieh Lee, Jhing-Fa Wang, Miao-Hia Chen. 2018
6. Noise Robust Automatic Speech Recognition with Adaptive Quantile Based Noise Estimation and Speech Band Emphasizing Filter Bank [O]. Casper Stork Bonde, Carina Graversen, Andreas Gregers Gregersen. 2008
