Conference: International Conference on Culture-oriented Science and Technology

LSTM Based End-to-End Text-Independent Speaker Verification Using Raw Waveform

Abstract

Speakers can be discriminated at either the voice source level or the vocal tract system level. Conventionally, Mel-Frequency Cepstral Coefficients (MFCCs) or Mel filterbank (Fbank) energies are employed as the input acoustic features in neural network based speaker verification (SV) systems. In this paper, we investigate LSTM based speaker verification using the raw waveform as the input feature. A basic LSTM based SV model and a model with an attention layer are trained and optimized on two datasets, using the raw waveform feature and the Fbank feature, respectively. Experimental results show that, compared with the models trained on the Fbank feature, the models trained on the raw waveform achieve promising performance, indicating that the raw waveform is a competitive acoustic feature for LSTM based speaker verification.
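The abstract does not say how the raw waveform is presented to the LSTM. As a minimal sketch of one common option, the example below cuts a waveform into overlapping fixed-length frames, so that each frame could serve as one time step of a recurrent model. The function name `frame_waveform`, the 16 kHz sample rate, and the 25 ms / 10 ms frame and hop sizes are assumptions made for illustration and are not taken from the paper.

```python
import math


def frame_waveform(samples, frame_len=400, hop=160):
    """Split a 1-D sequence of samples into overlapping frames.

    frame_len=400 and hop=160 correspond to 25 ms / 10 ms at 16 kHz.
    These values are assumptions, not settings reported in the paper.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    frames = []
    # Keep only full frames; trailing samples that cannot fill a frame are dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(list(samples[start:start + frame_len]))
    return frames


# Synthetic 1-second, 440 Hz tone at 16 kHz standing in for real speech.
sample_rate = 16000
waveform = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
frames = frame_waveform(waveform)
```

Under these assumptions, `frames` is a list of time steps, each a list of `frame_len` raw samples, which is the same sequence layout an Fbank front end would produce with a smaller per-step dimension.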
