Journal: Acoustical Science and Technology

Multistream sparse representation features for noise robust audio-visual speech recognition



Abstract

In this paper, we propose exemplar-based sparse representation features for noise-robust audio-visual speech recognition. First, we introduce the sparse representation technique and describe how it achieves noise robustness through noise reduction. We then propose feature fusion methods that combine audio and visual features through the sparse representation. Our work offers new insight into two crucial issues in automatic speech recognition: noise reduction and robust audio-visual features. For noise reduction, we describe a method in which the sparse representation maps speech and noise into different subspaces, so that the noise component can be reduced. The method applies to visual noise reduction as well as audio noise reduction, for several types of noise. For the second issue, we investigate two feature fusion methods (late feature fusion and the joint sparsity model) for computing audio-visual sparse representation features that improve the accuracy of audio-visual speech recognition. The proposed method thus also contributes to feature fusion in audio-visual speech recognition systems. Finally, we evaluate the new sparse representation features on a database for audio-visual speech recognition. The results show the effectiveness of the proposed noise reduction for both the audio and visual cases under several types of noise, and the effectiveness of audio-visual features determined by the joint sparsity model, in comparison with the late feature fusion method and traditional methods.
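The noise reduction idea in the abstract, mapping speech and noise into different subspaces through a sparse representation, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes non-negative feature vectors, a dictionary formed by stacking speech exemplars next to noise exemplars, and a standard multiplicative-update solver for non-negative sparse coding. The function names (`sparse_activations`, `denoise`, `make_demo`), the synthetic random dictionaries, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def sparse_activations(y, D, lam=1e-3, n_iter=2000, eps=1e-12):
    """Approximately solve min 0.5*||y - D x||^2 + lam*sum(x) subject to x >= 0.

    Uses multiplicative updates, which keep x non-negative as long as
    y and D are non-negative and x starts positive.
    """
    x = np.ones(D.shape[1])
    Dty = D.T @ y
    DtD = D.T @ D
    for _ in range(n_iter):
        x *= Dty / (DtD @ x + lam + eps)
    return x


def denoise(y, D_speech, D_noise, **solver_kwargs):
    """Estimate the speech part of y from a stacked [speech | noise] dictionary."""
    D = np.hstack([D_speech, D_noise])
    x = sparse_activations(y, D, **solver_kwargs)
    k = D_speech.shape[1]
    s_hat = D_speech @ x[:k]   # part explained by speech exemplars
    n_hat = D_noise @ x[k:]    # part explained by noise exemplars
    # Soft mask: keep the fraction of each feature attributed to speech.
    return y * s_hat / (s_hat + n_hat + 1e-12)


def make_demo(seed=0, n_feat=40, n_atoms=10):
    """Build synthetic (hypothetical) dictionaries and one noisy observation."""
    rng = np.random.default_rng(seed)

    def atoms():
        A = rng.random((n_feat, n_atoms)) ** 4   # peaky non-negative exemplars
        return A / np.linalg.norm(A, axis=0)     # unit-norm columns

    D_speech, D_noise = atoms(), atoms()
    a = np.zeros(n_atoms)
    a[[1, 6]] = [1.0, 0.7]                       # sparse speech activations
    b = np.zeros(n_atoms)
    b[[3, 8]] = [0.8, 0.6]                       # sparse noise activations
    clean = D_speech @ a
    noisy = clean + D_noise @ b
    return D_speech, D_noise, clean, noisy


D_speech, D_noise, clean, noisy = make_demo()
enhanced = denoise(noisy, D_speech, D_noise)
print("error before:", np.linalg.norm(noisy - clean))
print("error after: ", np.linalg.norm(enhanced - clean))
```

In a real system the dictionary columns would be exemplars taken from training speech and from noise recordings (or, for the visual stream, from clean and corrupted image features) rather than random vectors, and the activation vector itself, rather than the reconstructed signal, could serve as the recognition feature.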
