Journal: Pattern Recognition Letters

Robust speaker modeling using perceptually motivated feature

Abstract

This paper introduces a novel method for extracting robust features for text-independent speaker identification from short utterances. The method is perceptually motivated and inspired by the perceptual linear prediction (PLP) technique. The new feature is called the perceptual log area ratio (PLAR). It is perceptual in the sense that it draws on notions from psychoacoustics, through which robustness can be assured. The log area ratio is also an effective feature for recognizing speakers because it embodies the geometry and dynamics of the vocal tract, which are highly person-dependent. The research thus focuses on providing a reliable vocal biometric that can be used effectively with full-band and telephone-band speech in noisy environments. An intensive performance analysis benchmarks the proposed method against commonly used features on different databases and in different noisy environments. In almost all usable cases, PLAR outperformed commonly used features such as MFCC and LPCC.
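
For readers unfamiliar with the log area ratio mentioned in the abstract, the following is a minimal sketch of how plain (non-perceptual) log area ratios can be computed from one speech frame. It is not the paper's PLAR pipeline: the abstract gives no implementation details, and the PLP-style perceptual steps are omitted here. The function names, the Hamming window, the model order of 10, and the sign convention `log((1 + k) / (1 - k))` are all assumptions made for illustration; some texts use the opposite sign.

```python
import numpy as np


def levinson_durbin(r, order):
    """Reflection coefficients k[0..order-1] from autocorrelation r[0..order]."""
    a = np.zeros(order + 1)   # prediction polynomial, a[0] = 1
    a[0] = 1.0
    err = r[0]                # prediction error energy
    k = np.zeros(order)
    for i in range(1, order + 1):
        # r[i] + sum_{j=1}^{i-1} a[j] * r[i-j]
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        ki = -acc / err
        k[i - 1] = ki
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + ki * a_prev[i - j]
        a[i] = ki
        err *= 1.0 - ki * ki
    return k


def reflection_to_lar(k):
    """Log area ratios under the (assumed) convention log((1 + k) / (1 - k))."""
    k = np.clip(k, -0.999999, 0.999999)  # keep the logarithm finite
    return np.log((1.0 + k) / (1.0 - k))


def log_area_ratios(frame, order=10):
    """Hypothetical helper: window a frame, autocorrelate, convert to LARs."""
    w = frame * np.hamming(len(frame))
    full = np.correlate(w, w, mode="full")
    r = full[len(w) - 1:len(w) + order].copy()  # lags 0..order
    r[0] += 1e-12                               # guard against an all-zero frame
    return reflection_to_lar(levinson_durbin(r, order))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frame = rng.standard_normal(400)  # stand-in for a 25 ms frame at 16 kHz
    print(log_area_ratios(frame, order=10))
```

The clipping step reflects a practical concern rather than anything stated in the abstract: a reflection coefficient close to ±1 would otherwise send the logarithm to infinity.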