Journal: IEEE Transactions on Speech and Audio Processing

Estimation of handset nonlinearity with application to speaker recognition


Abstract

A method is described for estimating telephone handset nonlinearity by matching the spectral magnitude of the distorted signal to the output of a nonlinear channel model, driven by an undistorted reference. This "magnitude only" representation allows the model to directly match unwanted speech formants that arise over nonlinear channels and that are a potential source of degradation in speaker and speech recognition algorithms. As such, the method is particularly suited to algorithms that use only spectral magnitude information. The distortion model consists of a memoryless nonlinearity sandwiched between two finite-length linear filters. Nonlinearities considered include arbitrary finite-order polynomials and parametric sigmoidal functionals derived from a carbon-button handset model. Minimization of a mean-squared spectral magnitude distance with respect to model parameters relies on iterative estimation via a gradient descent technique. Initial work has demonstrated the importance of addressing handset nonlinearity, in addition to linear distortion, in speaker recognition over telephone channels. A nonlinear handset "mapping," applied to training or testing data to reduce mismatch between different types of handset microphone outputs, improves speaker verification performance relative to linear compensation only. Finally, a method is proposed to merge the mapper strategy with a method of likelihood score normalization (hnorm) for further mismatch reduction and speaker verification performance improvement.
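To make the model structure in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It builds a filter / memoryless polynomial / filter cascade and fits its parameters by gradient descent on a mean-squared spectral magnitude distance. Several choices are assumptions made for illustration only: the function names, the frame length and Hann window, the filter and polynomial orders, the finite-difference gradient (the paper presumably derives its gradient analytically), and the backtracking step-size rule.

```python
import numpy as np

def channel_model(x, g, poly, h):
    """Pre-filter g -> memoryless polynomial -> post-filter h."""
    v = np.convolve(x, g)[: len(x)]                # first finite-length linear filter
    w = np.polynomial.polynomial.polyval(v, poly)  # sum over k of poly[k] * v**k
    return np.convolve(w, h)[: len(x)]             # second finite-length linear filter

def spectral_magnitude(y, frame=64):
    """Magnitude spectra of non-overlapping windowed frames (phase discarded)."""
    n = len(y) // frame
    frames = y[: n * frame].reshape(n, frame) * np.hanning(frame)
    return np.abs(np.fft.rfft(frames, axis=1))

def loss(params, x, target_mag, sizes):
    """Mean-squared distance between model and target spectral magnitudes."""
    n_g, n_c, n_h = sizes
    g = params[:n_g]
    poly = params[n_g : n_g + n_c]
    h = params[n_g + n_c :]
    model_mag = spectral_magnitude(channel_model(x, g, poly, h))
    return np.mean((model_mag - target_mag) ** 2)

def numerical_gradient(f, p, eps=1e-6):
    """Central-difference gradient (an assumption; stands in for an analytic one)."""
    grad = np.zeros_like(p)
    for i in range(len(p)):
        step = np.zeros_like(p)
        step[i] = eps
        grad[i] = (f(p + step) - f(p - step)) / (2 * eps)
    return grad

def fit(x, y, sizes=(3, 4, 3), iters=30, step=1.0):
    """Gradient descent with step halving; starts from an identity channel."""
    n_g, n_c, n_h = sizes
    target = spectral_magnitude(y)
    p = np.zeros(n_g + n_c + n_h)
    p[0] = 1.0          # g starts as a unit impulse
    p[n_g + 1] = 1.0    # polynomial starts as the identity map
    p[n_g + n_c] = 1.0  # h starts as a unit impulse
    f = lambda q: loss(q, x, target, sizes)
    history = [f(p)]
    for _ in range(iters):
        grad = numerical_gradient(f, p)
        t, cand = step, f(p - step * grad)
        # Halve the step until the loss strictly decreases (also rejects NaN/inf).
        while t > 1e-12 and not (cand < history[-1]):
            t *= 0.5
            cand = f(p - t * grad)
        if not (cand < history[-1]):
            break
        p = p - t * grad
        history.append(cand)
    return p, history
```

Here `x` plays the role of the undistorted reference and `y` the distorted handset output. Because only magnitudes are matched, the recovered filters are not unique in phase, which is consistent with the abstract's point that the method suits algorithms that use spectral magnitude information only.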