Asia-Pacific Signal and Information Processing Association Annual Summit and Conference

Single channel dereverberation method in log-melspectral domain using limited stereo data for distant speaker identification



Abstract

In this paper, we present a feature enhancement method that uses neural networks (NNs) to map a reverberant feature in the log-melspectral domain to its corresponding anechoic feature. The mapping is performed by cascade NNs trained using the Cascade 2 algorithm with an implementation of segment-based normalization. We assumed that the feature dimensions were independent of each other and experimented with several assumptions about the room transfer function for each dimension. A speaker identification system was used to evaluate the method. Using limited stereo data, we improved the identification rate on both simulated and real datasets. On the simulated dataset, the proposed method was effective in both noiseless and noisy reverberant environments, across various noise and reverberation characteristics. On the real dataset, using a configuration of 6 independent NNs for the 24-dimensional feature and only 1 pair of utterances, we obtained a 35% average error reduction relative to the baseline, which employed cepstral mean normalization (CMN).
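The abstract names cepstral mean normalization (CMN) as the baseline and reports its result as an error reduction relative to that baseline, without spelling out either computation. The following is a minimal sketch, assuming the common definition of CMN as per-utterance mean subtraction and the usual definition of relative error reduction. The function names and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def cepstral_mean_normalization(frames):
    """Subtract the per-utterance mean from each cepstral coefficient.

    frames: list of frames, each a list of cepstral coefficients.
    Assumption: CMN is applied over the whole utterance.
    """
    if not frames:
        return []
    n_coeffs = len(frames[0])
    # Mean of each coefficient across all frames of the utterance.
    means = [fmean(frame[k] for frame in frames) for k in range(n_coeffs)]
    return [[frame[k] - means[k] for k in range(n_coeffs)] for frame in frames]


def relative_error_reduction(baseline_error, enhanced_error):
    """Fraction of the baseline error rate removed by the enhanced system."""
    return (baseline_error - enhanced_error) / baseline_error


# Hypothetical toy utterance: 3 frames, 2 cepstral coefficients each.
utterance = [[1.0, 4.0], [2.0, 6.0], [3.0, 8.0]]
normalized = cepstral_mean_normalization(utterance)

# Hypothetical error rates chosen only to illustrate the arithmetic.
reduction = relative_error_reduction(0.20, 0.13)
```

Under this definition, a 35% relative reduction means the enhanced system's error rate is 0.65 times the baseline's. It does not mean the error rate dropped by 35 percentage points.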
