A novel pitch extraction based on jointly trained deep BLSTM Recurrent Neural Networks with bottleneck features

Abstract

Pitch is an important characteristic of speech and is useful for many applications. However, estimating pitch in strong noise remains challenging. In this paper, we propose a joint training approach to determine pitch. First, a Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM-RNN) is trained to map noisy speech features to clean speech features. Second, the pitch estimator is also a BLSTM-RNN model. The feature mapping network serves as a noise normalization module that explicitly generates clean features, from which the subsequent network can estimate pitch more easily. The BLSTM-RNN is trained on sequential frame-level features and is capable of learning temporal dynamics. We also propose to take bottleneck features into account for pitch estimation. The experimental results show that the proposed method obtains accurate pitch estimates and generalizes well to new speakers and noisy conditions. The proposed approach also significantly outperforms other state-of-the-art pitch estimation algorithms.
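The cascade described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: plain tanh recurrences stand in for LSTM cells, each frame is a single scalar feature, the bottleneck features are left out, and the weighted-sum objective with weight `alpha` is a hypothetical choice (the abstract does not say how the joint loss is formed). All function names and parameter values are invented for the example.

```python
import math

def bidirectional_pass(frames, w_in, w_rec):
    """Toy bidirectional tanh recurrence (a stand-in for a BLSTM layer).

    Returns one (forward_state, backward_state) pair per frame, so each
    frame is described using both past and future context.
    """
    fwd, h = [], 0.0
    for x in frames:                    # left-to-right pass
        h = math.tanh(w_in * x + w_rec * h)
        fwd.append(h)
    bwd, h = [], 0.0
    for x in reversed(frames):          # right-to-left pass
        h = math.tanh(w_in * x + w_rec * h)
        bwd.append(h)
    bwd.reverse()                       # realign with frame order
    return list(zip(fwd, bwd))

def run_stage(frames, params):
    """One network stage: bidirectional recurrence plus a linear output."""
    states = bidirectional_pass(frames, params["w_in"], params["w_rec"])
    return [params["w_f"] * f + params["w_b"] * b for f, b in states]

def mse(pred, target):
    """Mean squared error over frames."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def joint_loss(noisy, clean, pitch, map_params, pitch_params, alpha=0.5):
    """Pitch loss plus alpha times the feature-mapping loss.

    The weighted sum and the value of alpha are assumptions made for
    illustration only.
    """
    clean_hat = run_stage(noisy, map_params)        # stage 1: noisy -> clean
    pitch_hat = run_stage(clean_hat, pitch_params)  # stage 2: clean -> pitch
    return mse(pitch_hat, pitch) + alpha * mse(clean_hat, clean)

# Hypothetical toy data: four frames of scalar features and pitch targets.
noisy = [0.2, 0.9, 0.4, 0.7]
clean = [0.1, 0.8, 0.5, 0.6]
pitch = [0.0, 1.0, 0.5, 0.75]
map_params = {"w_in": 1.0, "w_rec": 0.5, "w_f": 0.6, "w_b": 0.4}
pitch_params = {"w_in": 0.8, "w_rec": 0.3, "w_f": 0.7, "w_b": 0.3}

clean_hat = run_stage(noisy, map_params)
loss = joint_loss(noisy, clean, pitch, map_params, pitch_params)
print(loss)
```

Because the pitch stage consumes the output of the mapping stage, the gradient of the pitch loss would flow back through both stages in a real training loop, which is the sense in which the two networks are trained jointly in this sketch.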

Bibliographic details

Source
IEEE International Conference on Acoustics, Speech and Signal Processing | 2017 | 650p | 5 pages
Authors
Bin Liu; Jianhua Tao; Dawei Zhang; Yibin Zheng
Original format: PDF
Chinese Library Classification: TN912-53
Keywords
Pitch estimation; BLSTM-RNN; feature mapping; joint training; bottleneck features

Similar literature

1. Gender classification based on isolated facial features and foggy faces using jointly trained deep convolutional neural network [J]. Aslam Aasma, Hussain Babar, Cetin Ahmet Enis. Journal of Electronic Imaging, 2018, issue PTa2.
2. Network traffic classification using deep convolutional recurrent autoencoder neural networks for spatial-temporal features extraction [J]. D'Angelo Gianni, Palmieri Francesco. Journal of Network and Computer Applications, 2021, issue Jana.
3. Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification [J]. Zhaofeng Zhang, Longbiao Wang, Atsuhiko Kai. EURASIP Journal on Audio, Speech, and Music Processing, 2015, issue 1.
4. A novel pitch extraction based on jointly trained deep BLSTM Recurrent Neural Networks with bottleneck features [C]. Bin Liu, Jianhua Tao, Dawei Zhang. IEEE International Conference on Acoustics, Speech and Signal Processing, 2017.
5. Deep Neural Language Model for Text Classification Based on Convolutional and Recurrent Neural Networks [D]. Hassan, Abdalraouf. 2018.
6. An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition [O]. Alicia Lozano-Diez, Ruben Zazo, Doroteo T. Toledano.
7. Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification [O]. Zhaofeng Zhang, Longbiao Wang, Atsuhiko Kai. 2015.