Speaker Recognition using a Non-parametric Speaker Model Representation and Earth Mover's Distance

Abstract

In this paper, we propose a distributed speaker recognition method that uses a non-parametric speaker model and the Earth Mover's Distance (EMD). In distributed speaker recognition, quantized feature vectors are sent to a server. The Gaussian mixture model (GMM), the traditional method for speaker recognition, is trained with the maximum likelihood approach, but it is difficult to fit continuous density functions to quantized data. To overcome this problem, the proposed method represents each speaker model as a speaker-dependent VQ code histogram built from the registered feature vectors, and directly calculates the distance between the speaker-model histograms and the quantized test feature vectors. To measure the distance between each speaker model and the test data, we use EMD, which can calculate the distance between histograms with different bins. We conducted text-independent speaker identification experiments using the proposed method. Compared with the traditional GMM, the proposed method yielded a relative error reduction of 32% on quantized data.
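To make the histogram comparison concrete, the following is a minimal sketch of an EMD between two code histograms whose bins (codewords) differ, followed by a nearest-model decision. It is not the authors' implementation: the function names (`code_histogram`, `emd`, `identify`), the Euclidean ground distance, the normalization of each histogram to unit mass, and the toy two-dimensional data are all assumptions made for illustration. The transport problem is solved with a small successive-shortest-path routine, which is only suitable for small histograms.

```python
import math

def code_histogram(features, codebook):
    """Count how often each codeword is the nearest one (hypothetical helper)."""
    counts = [0] * len(codebook)
    for x in features:
        k = min(range(len(codebook)), key=lambda c: math.dist(x, codebook[c]))
        counts[k] += 1
    # Keep only used codewords; each bin is (codeword, count).
    return [(codebook[k], c) for k, c in enumerate(counts) if c > 0]

def emd(sig_a, sig_b, eps=1e-12):
    """EMD between two histograms given as lists of (codeword, weight).

    The two lists may use different codewords and have different lengths.
    Weights are normalized to sum to 1 (an assumption of this sketch).
    """
    m, n = len(sig_a), len(sig_b)
    total_a = sum(w for _, w in sig_a)
    total_b = sum(w for _, w in sig_b)
    supply = [w / total_a for _, w in sig_a]
    demand = [w / total_b for _, w in sig_b]
    cost = [[math.dist(p, q) for q, _ in sig_b] for p, _ in sig_a]
    flow = [[0.0] * n for _ in range(m)]
    while True:
        # Bellman-Ford on the residual graph: nodes 0..m-1 are bins of
        # sig_a, nodes m..m+n-1 are bins of sig_b.
        dist = [0.0 if s > eps else math.inf for s in supply] + [math.inf] * n
        pred = [None] * (m + n)
        for _ in range(m + n):
            changed = False
            for i in range(m):
                for j in range(n):
                    # Forward edge i -> j: ship more mass from i to j.
                    if dist[i] + cost[i][j] < dist[m + j] - eps:
                        dist[m + j] = dist[i] + cost[i][j]
                        pred[m + j] = i
                        changed = True
                    # Backward edge j -> i: undo mass already shipped.
                    if flow[i][j] > eps and dist[m + j] - cost[i][j] < dist[i] - eps:
                        dist[i] = dist[m + j] - cost[i][j]
                        pred[i] = m + j
                        changed = True
            if not changed:
                break
        targets = [j for j in range(n) if demand[j] > eps and dist[m + j] < math.inf]
        if not targets:
            break  # all mass has been transported
        t = min(targets, key=lambda j: dist[m + j])
        # Walk the predecessor chain back to a bin that still has supply.
        path, node = [], m + t
        while pred[node] is not None:
            path.append((pred[node], node))
            node = pred[node]
        amount = min(supply[node], demand[t])
        for u, v in path:
            if u >= m:  # backward edge limits the augmentation
                amount = min(amount, flow[v][u - m])
        for u, v in path:
            if u < m:
                flow[u][v - m] += amount
            else:
                flow[v][u - m] -= amount
        supply[node] -= amount
        demand[t] -= amount
    return sum(flow[i][j] * cost[i][j] for i in range(m) for j in range(n))

def identify(models, test_sig):
    """Return the name of the speaker model with the smallest EMD to the test data."""
    return min(models, key=lambda name: emd(models[name], test_sig))

# Toy data (hypothetical): two enrolled speakers with different codebooks.
models = {
    "alice": code_histogram(
        [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.9, 0.0)],
        [(0.0, 0.0), (1.0, 0.0)],
    ),
    "bob": [((4.0, 0.0), 1), ((5.0, 0.0), 1), ((6.0, 0.0), 2)],
}
test_sig = [((0.0, 0.0), 1), ((1.0, 0.0), 1)]
print(identify(models, test_sig))
```

On this toy data the test histogram lies much closer to the first model than to the second, so the first speaker is selected. Because the cost is defined between codewords rather than between bin indices, the two histograms never need to share a codebook, which is the property the abstract relies on.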
