IEEE International Conference on Acoustics, Speech and Signal Processing

Physiologically-motivated feature extraction for speaker identification

Abstract

This paper introduces three physiologically-motivated features for speaker identification: Residual Phase Cepstrum Coefficients (RPCC), Glottal Flow Cepstrum Coefficients (GLFCC), and Teager Phase Cepstrum Coefficients (TPCC). These features capture speaker-discriminative characteristics from different aspects of glottal source excitation patterns. The proposed physiologically-driven features give better results with lower model complexity, and also provide complementary information that can improve overall system performance even for larger amounts of data. Speaker identification results on the YOHO corpus demonstrate that these physiologically-driven features are both more accurate than and complementary to traditional mel-frequency cepstral coefficients (MFCC). In particular, incorporating the proposed glottal source features offers significant overall improvement in the robustness and accuracy of speaker identification.
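
The abstract does not describe how RPCC, GLFCC, or TPCC are computed, so the following is only a generic illustration of what "cepstrum coefficients" usually means: the inverse Fourier transform of a frame's log-magnitude spectrum. It is not the paper's pipeline. The function name `real_cepstrum`, the parameter `n_coeffs`, the Hann window, and the small constant added before the logarithm are all assumptions made for this sketch.

```python
import numpy as np

def real_cepstrum(frame, n_coeffs=13):
    """Return the first n_coeffs real-cepstrum coefficients of one frame.

    Generic sketch only: the windowing, the epsilon, and the number of
    coefficients are illustrative choices, not taken from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    # Taper the frame to reduce spectral leakage (assumed Hann window).
    windowed = frame * np.hanning(len(frame))
    # Log-magnitude spectrum; the epsilon guards against log(0).
    log_mag = np.log(np.abs(np.fft.rfft(windowed)) + 1e-10)
    # Inverse transform of the log spectrum gives the real cepstrum.
    cepstrum = np.fft.irfft(log_mag, n=len(frame))
    return cepstrum[:n_coeffs]

# Usage: one 256-sample frame of synthetic noise standing in for speech.
rng = np.random.default_rng(0)
coeffs = real_cepstrum(rng.standard_normal(256))
```

In a sketch like this, scaling the input frame by a constant shifts only the zeroth coefficient, which is one reason cepstral features are often treated as separating overall gain from spectral shape.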