Single channel dereverberation method in log-melspectral domain using limited stereo data for distant speaker identification



Abstract

In this paper, we present a feature enhancement method that uses neural networks (NNs) to map reverberant features in the log-mel-spectral domain to their corresponding anechoic features. The mapping is done by cascade NNs trained using the Cascade 2 algorithm with an implementation of segment-based normalization. We assumed that the feature dimensions were independent of each other and experimented with several assumptions about the room transfer function for each dimension. A speaker identification system was used to evaluate the method. Using limited stereo data, we improved the identification rate on both simulated and real datasets. On the simulated dataset, the proposed method was effective in both noiseless and noisy reverberant environments with various noise and reverberation characteristics. On the real dataset, a configuration of 6 independent NNs for the 24-dimensional feature, using only 1 pair of utterances, gave a 35% average error reduction relative to the baseline, which employed cepstral mean normalization (CMN).
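The pipeline described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the cascade NNs trained with Cascade 2 are replaced here by a plain least-squares linear map per group, the split of the 24 dimensions into 6 contiguous groups of 4 is an assumption, and the segment-based normalization is assumed to be per-segment mean subtraction with a hypothetical segment length. All function names and the synthetic data are illustrative only.

```python
import numpy as np

def segment_normalize(feats, seg_len=100):
    """Subtract the mean of each block of seg_len frames (assumed form of
    segment-based normalization; seg_len is a hypothetical value)."""
    out = np.empty_like(feats, dtype=float)
    for start in range(0, len(feats), seg_len):
        seg = feats[start:start + seg_len]
        out[start:start + seg_len] = seg - seg.mean(axis=0)
    return out

def fit_group_maps(reverb, clean, n_groups=6):
    """Fit one independent map per group of feature dimensions.
    A least-squares linear map stands in for each cascade NN."""
    groups = np.array_split(np.arange(reverb.shape[1]), n_groups)
    maps = []
    for idx in groups:
        # Append a constant column so the map includes a bias term.
        X = np.hstack([reverb[:, idx], np.ones((len(reverb), 1))])
        W, *_ = np.linalg.lstsq(X, clean[:, idx], rcond=None)
        maps.append((idx, W))
    return maps

def apply_group_maps(reverb, maps):
    """Enhance reverberant features group by group."""
    out = np.empty_like(reverb, dtype=float)
    for idx, W in maps:
        X = np.hstack([reverb[:, idx], np.ones((len(reverb), 1))])
        out[:, idx] = X @ W
    return out

def relative_error_reduction(baseline_err, new_err):
    """Error reduction relative to a baseline error rate."""
    return (baseline_err - new_err) / baseline_err

# Synthetic "stereo" pair: clean frames and a crudely smeared copy.
rng = np.random.default_rng(0)
clean = rng.normal(size=(500, 24))
reverb = 0.5 * clean + 0.3 * np.roll(clean, 1, axis=0)

clean_n = segment_normalize(clean)
reverb_n = segment_normalize(reverb)

maps = fit_group_maps(reverb_n, clean_n, n_groups=6)
enhanced = apply_group_maps(reverb_n, maps)

mse_before = np.mean((reverb_n - clean_n) ** 2)
mse_after = np.mean((enhanced - clean_n) ** 2)
print(f"MSE before: {mse_before:.3f}, after: {mse_after:.3f}")
```

The relative error reduction quoted in the abstract is of the form computed by relative_error_reduction: for example, hypothetical error rates of 20% for the baseline and 13% for the enhanced system would correspond to a 35% relative reduction. These two rates are made up for illustration and do not come from the paper.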


Bibliographic details

Source
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013, pp. 1-4 (4 pages)
Authors
Nugraha Aditya Arie; Yamamoto Kazumasa; Nakagawa Seiichi
Original format
PDF

Similar literature


1. Single Channel Dereverberation by Feature Mapping Using Limited Stereo Data [J]. Aditya Arie Nugraha, Kazumasa Yamamoto, Seiichi Nakagawa. 電子情報通信学会技術研究報告. 音声. Speech, 2013, no. 161.

2. Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification [J]. Zhaofeng Zhang, Longbiao Wang, Atsuhiko Kai. EURASIP Journal on Audio, Speech, and Music Processing, 2015, no. 1.

3. Single-channel Dereverberation for Distant-Talking Speech Recognition by Combining Denoising Autoencoder and Temporal Structure Normalization [J]. Ueda Yuma, Wang Longbiao, Kai Atsuhiko. Journal of Signal Processing Systems for Signal, Image, and Video Technology, 2016, no. 2.

4. Single channel dereverberation method in log-melspectral domain using limited stereo data for distant speaker identification [C]. Nugraha Aditya Arie, Yamamoto Kazumasa, Nakagawa Seiichi. Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013.

5. Data-Driven Single-/Multi-Domain Spectral Methods for Stochastic Fractional PDEs [D]. Kharazmi, Ehsan. 2018.

6. A novel method for approximating equilibrium single-channel Ca2+ domains [O]. Victor Matveev. 2015.

7. Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition [O]. Aditya Arie Nugraha, Kazumasa Yamamoto, Seiichi Nakagawa. 2014.
