Conference: International Conference on Intelligent Robotics and Systems

Missing-Feature-Theory-based Robust Simultaneous Speech Recognition System with Non-clean Speech Acoustic Model

Abstract

In the real world, a humanoid robot must recognize a target speech signal while people around it are chatting. To do so, the robot has to separate the target speech signal from the other speech signals and then recognize the separated signal. Because the separated signal contains distortion, automatic speech recognition (ASR) performance degrades. To avoid this degradation, we train an acoustic model on non-clean speech signals so that it adapts to the acoustic features of the distorted signal, and we add white noise to the separated speech signal before extracting acoustic features. The issues are (1) determining the optimal noise level to add to the training speech signals, and (2) determining the optimal noise level to add to the separated signal. In this paper, we investigate how much noise should be added to clean speech data for training and how speech recognition performance improves for different positions of three talkers with soft masking. Experimental results show that the best performance is obtained by adding white noise of 30 dB. ASR with this acoustic model outperforms ASR with the clean acoustic model by 4 points.
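The noise-addition step described in the abstract can be illustrated with a minimal sketch. It assumes that the noise level is expressed as a signal-to-noise ratio (SNR) in dB, which the abstract does not state explicitly, and it uses a synthetic tone as a stand-in for a separated speech signal. The function names `add_white_noise` and `measured_snr_db` are hypothetical and are not taken from the paper.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=None):
    """Return a copy of `signal` with white Gaussian noise added.

    Assumption: `snr_db` is the desired signal-to-noise ratio in dB.
    """
    if not signal:
        return []
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    # From SNR_dB = 10 * log10(P_signal / P_noise), solve for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise_std = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def measured_snr_db(clean, noisy):
    """Estimate the SNR in dB of `noisy` relative to `clean`."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


# Toy stand-in for a separated speech signal: 1 s of a 440 Hz tone at 16 kHz.
sample_rate = 16000
separated = [
    0.5 * math.sin(2.0 * math.pi * 440.0 * t / sample_rate)
    for t in range(sample_rate)
]

# Add white noise before any feature extraction would take place.
noisy = add_white_noise(separated, snr_db=30.0, seed=0)
print(round(measured_snr_db(separated, noisy), 1))
```

Under this reading, the same routine would be applied both to the clean training data and to the separated signal at recognition time, with the level chosen separately for each, matching issues (1) and (2) above.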
