Journal: Circuits, Systems, and Signal Processing

Binaural Classification-Based Speech Segregation and Robust Speaker Recognition System


Abstract

The paper presents an auditory scene analyser that comprises two joint, simultaneous modules: binaural speech segregation and speaker recognition. Binaural speech segregation is realized by feeding interaural time and level differences, interaural phase difference, and interaural coherence, along with the direct-to-reverberant ratio, into a deep recurrent neural network. The performance of the deep recurrent network-based speech segregation is validated in terms of source-to-interference ratio, source-to-distortion ratio, and source-to-artifacts ratio, and is compared with existing architectures, including a deep neural network. It is observed that the performance of a conventional deep recurrent neural network can be improved further by using discriminative training objectives together with soft time-frequency masking as a layer in the network structure. The system also proposes a spectro-temporal feature extractor, referred to as Gabor-Hilbert envelope coefficients (GHEC). The proposed monaural feature is responsible for extracting discriminative acoustic information from the segregated speech sources. The performance of GHEC is validated under various noisy and reverberant environments, and the results are compared with existing monaural features. The binaural speech segregation results show a signal-to-noise ratio that is better than that of other baseline algorithms by 0.7 dB on average, even at a high reverberation time of 0.89 s.
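The abstract mentions soft time-frequency masking as a layer in the network but does not give its formula. As a rough illustration only, the sketch below uses the common ratio-mask form, in which each time-frequency bin of the mixture is shared between two sources in proportion to the network's magnitude estimates. The function name `soft_mask`, the two-source setup, and the `eps` stabiliser are assumptions made for this example and are not taken from the paper.

```python
def soft_mask(mag_a, mag_b, mixture, eps=1e-8):
    """Split mixture magnitudes between two sources with a ratio (soft) mask.

    mag_a, mag_b: hypothetical network magnitude estimates for the two
    sources; mixture: mixture magnitudes. Each is a list of frames, and
    each frame is a list of per-frequency-bin values of equal length.
    """
    est_a, est_b = [], []
    for row_a, row_b, row_m in zip(mag_a, mag_b, mixture):
        frame_a, frame_b = [], []
        for a, b, m in zip(row_a, row_b, row_m):
            # Weight in [0, 1]; eps avoids division by zero in silent bins.
            w = abs(a) / (abs(a) + abs(b) + eps)
            frame_a.append(w * m)
            frame_b.append((1.0 - w) * m)
        est_a.append(frame_a)
        est_b.append(frame_b)
    return est_a, est_b


# Toy input: 2 frames x 2 frequency bins.
mag_a = [[3.0, 0.0], [1.0, 2.0]]
mag_b = [[1.0, 2.0], [1.0, 0.0]]
mixture = [[4.0, 2.0], [2.0, 2.0]]
est_a, est_b = soft_mask(mag_a, mag_b, mixture)
```

Because the two weights in each bin sum to one, the two masked estimates always add back up to the mixture. That constraint is one reason a masking layer of this kind is often preferred over letting a network predict each source independently.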
