Speaker Recognition using a Non-parametric Speaker Model Representation and Earth Mover's Distance

Yoshiyuki UMEDA; Satoru TSUGE; Fuji REN; Shingo KUROIWA

Abstract

In this paper, we propose a distributed speaker recognition method that uses a non-parametric speaker model and the Earth Mover's Distance (EMD). In distributed speaker recognition, quantized feature vectors are sent to a server. The Gaussian mixture model (GMM), the traditional model for speaker recognition, is trained with the maximum likelihood approach; however, it is difficult to fit continuous density functions to quantized data. To overcome this problem, the proposed method represents each speaker model as a speaker-dependent VQ code histogram built from the registered feature vectors, and directly calculates the distance between the histograms of the speaker models and the quantized test feature vectors. To measure the distance between each speaker model and the test data, we use EMD, which can calculate the distance between histograms with different bins. We conducted text-independent speaker identification experiments with the proposed method. Compared with the traditional GMM, the proposed method yielded a relative error reduction of 32% on quantized data.
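
The idea in the abstract (a VQ code histogram per speaker, compared with the test histogram by EMD) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a Euclidean ground distance between code vectors and histogram weights normalized to relative frequencies, neither of which the abstract specifies. The names vq_histogram, emd, and identify are hypothetical, and the transportation problem is solved with a simple successive-shortest-path routine chosen only to keep the example self-contained.

```python
import math
from collections import Counter
from fractions import Fraction

EPS = 1e-12  # tolerance for floating-point path-length comparisons


def vq_histogram(frames, codebook):
    """Quantize frames to the nearest centroid; return (centroids, weights).

    Only occupied bins are kept; weights are exact relative frequencies.
    """
    counts = Counter(
        min(range(len(codebook)), key=lambda k: math.dist(f, codebook[k]))
        for f in frames
    )
    total = sum(counts.values())
    bins = sorted(counts)
    return [codebook[k] for k in bins], [Fraction(counts[k], total) for k in bins]


def emd(sig_a, sig_b):
    """Earth Mover's Distance between two non-empty (points, weights) signatures.

    Solves the transportation problem by successive shortest paths
    (Bellman-Ford on the residual graph); meant for small examples only.
    """
    (pts_a, w_a), (pts_b, w_b) = sig_a, sig_b
    m, n = len(pts_a), len(pts_b)
    cost = [[math.dist(a, b) for b in pts_b] for a in pts_a]  # assumed Euclidean
    supply = [Fraction(w) for w in w_a]
    demand = [Fraction(w) for w in w_b]
    flow = [[Fraction(0)] * n for _ in range(m)]
    while any(s > 0 for s in supply) and any(d > 0 for d in demand):
        # Cheapest residual path from any bin with supply to every bin of B.
        dist_a = [0.0 if s > 0 else math.inf for s in supply]
        dist_b = [math.inf] * n
        prev_a, prev_b = [None] * m, [None] * n
        for _ in range(m + n):
            changed = False
            for i in range(m):
                for j in range(n):
                    # Forward edge: ship more mass from bin i to bin j.
                    if dist_a[i] + cost[i][j] < dist_b[j] - EPS:
                        dist_b[j], prev_b[j] = dist_a[i] + cost[i][j], i
                        changed = True
                    # Backward edge: take back mass already shipped i -> j.
                    if flow[i][j] > 0 and dist_b[j] - cost[i][j] < dist_a[i] - EPS:
                        dist_a[i], prev_a[i] = dist_b[j] - cost[i][j], j
                        changed = True
            if not changed:
                break
        end = min((j for j in range(n) if demand[j] > 0), key=dist_b.__getitem__)
        # Walk the path backwards, collecting edges and the bottleneck amount.
        path, amount, j = [], demand[end], end
        while True:
            i = prev_b[j]
            path.append((i, j, 1))
            if prev_a[i] is None:
                break
            j = prev_a[i]
            path.append((i, j, -1))
            amount = min(amount, flow[i][j])
        amount = min(amount, supply[i])
        for a, b, sign in path:
            flow[a][b] += sign * amount
        supply[i] -= amount
        demand[end] -= amount
    moved = sum(map(sum, flow))
    work = sum(float(flow[i][j]) * cost[i][j] for i in range(m) for j in range(n))
    return work / float(moved)


def identify(models, test_sig):
    """Return the name of the speaker model with the smallest EMD to the test data."""
    return min(models, key=lambda name: emd(models[name], test_sig))
```

Here models would map each speaker name to the signature returned by vq_histogram for that speaker's enrollment frames and speaker-dependent codebook, while test_sig would be the histogram of the quantized test vectors. Because EMD works on (centroid, weight) pairs, the two histograms do not need to share bins. For realistic codebook sizes, a dedicated linear-programming or optimal-transport solver would be a more practical choice than this routine.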

Bibliographic record

Source: 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication, 2004, No. 538, 6 pages
Authors: Yoshiyuki UMEDA; Satoru TSUGE; Fuji REN; Shingo KUROIWA
Original format: PDF
Language: English
Chinese Library Classification: Communications
Keywords: Distributed Speaker Recognition; Speaker identification; Non-parametric; Earth Mover's Distance

Similar literature

1. Speaker Recognition using a Non-parametric Speaker Model Representation and Earth Mover's Distance [J]. Yoshiyuki UMEDA, Satoru TSUGE, Fuji REN, 電子情報通信学会技術研究報告. 言語理解とコミュニケーション. Natural Language Understanding and Models of Communication. 2004, No. 538
2. Nonparametric Speaker Recognition Method Using Earth Mover's Distance [J]. Shingo KUROIWA, Yoshiyuki UMEDA, Satoru TSUGE, IEICE Transactions on Information and Systems. 2006, No. 3
3. Use of Neumann series decomposition to fit the Weighted Euclidean distance and Inner product scoring models in automatic speaker recognition [J]. Djellab Mourad, Mehallegue Noureddine, Achi Amar, Pattern recognition letters. 2019, No. JULa
4. Distributed Speaker Recognition using Earth Mover's Distance [C]. Yoshiyuki UMEDA, Satoru TSUGE, Fuji REN, International Conference on Spoken Language Processing; 20041004-08; Jeju(KR). 2004
5. Neural Network Based Representation Learning and Modeling for Speech and Speaker Recognition [D]. Guo, Jinxi. 2019
6. Modelling saliency attention to predict eye direction by topological structure and earth mover's distance [O]. Longsheng Wei, Jian Peng, Wei Liu, 2011
7. Fast and Robust Speaker Clustering Using the Earth Mover's Distance and MixMax Models [O]. Thilo Stadelmann 2008
