Joint Factor Analysis for Speaker Recognition reinterpreted as Signal Coding using Overcomplete Dictionaries

Abstract

This paper presents a reinterpretation of Joint Factor Analysis (JFA) as a signal approximation methodology, based on ridge regression, using an overcomplete dictionary learned from data. A non-probabilistic perspective of the three fundamental steps in the JFA paradigm based on point estimates is provided. That is, the model training, hyperparameter estimation and scoring stages are equated to signal coding, dictionary learning and similarity computation, respectively. Establishing a connection between these two well-researched areas opens the door to cross-pollination between both fields. As an example of this, we propose two novel ideas that arise naturally from the non-probabilistic perspective and result in faster hyperparameter estimation and improved scoring. Specifically, the proposed technique for hyperparameter estimation avoids the need to use explicit matrix inversions in the M-step of the ML estimation. This allows the use of faster techniques such as Gauss-Seidel or Cholesky factorizations for the computation of the posterior means of the factors x, y and z during the E-step. Regarding the scoring, a similarity measure based on a normalized inner product is proposed and shown to outperform the state-of-the-art linear scoring approach commonly used in JFA. Experimental validation of these two novel techniques is presented using closed-set identification and speaker verification experiments over the Switchboard database.
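The two ideas named in the abstract, coding a signal by ridge regression without an explicit matrix inverse and scoring with a normalized inner product, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`gauss_seidel`, `ridge_code`, `normalized_inner_product`), the regularization weight `lam`, and the toy dictionary values are all assumptions made for illustration, and the sketch compares two codes directly rather than reproducing the paper's actual scoring formula.

```python
import math


def gauss_seidel(A, b, sweeps=200, tol=1e-12):
    """Solve A x = b by Gauss-Seidel sweeps (A assumed symmetric positive definite)."""
    n = len(b)
    x = [0.0] * n
    for _ in range(sweeps):
        max_change = 0.0
        for i in range(n):
            # Use the most recent values of all other coordinates.
            off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new_xi = (b[i] - off_diag) / A[i][i]
            max_change = max(max_change, abs(new_xi - x[i]))
            x[i] = new_xi
        if max_change < tol:
            break
    return x


def ridge_code(D, s, lam=1.0):
    """Ridge-regression code of signal s over dictionary D.

    D is a list of rows (signal dimensions) whose columns are atoms.
    Minimizes ||s - D x||^2 + lam * ||x||^2 by solving the normal
    equations (D^T D + lam I) x = D^T s iteratively, so no explicit
    matrix inverse is formed.
    """
    m, k = len(D), len(D[0])
    A = [[sum(D[r][i] * D[r][j] for r in range(m)) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(D[r][i] * s[r] for r in range(m)) for i in range(k)]
    return gauss_seidel(A, b)


def normalized_inner_product(u, v):
    """Inner product of u and v divided by the product of their norms."""
    dot = sum(a * c for a, c in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy overcomplete dictionary: 2 signal dimensions, 3 atoms (hypothetical values).
D = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
enrol_code = ridge_code(D, [1.0, 2.0])
test_code = ridge_code(D, [0.9, 2.1])
print(normalized_inner_product(enrol_code, test_code))
```

Because `lam > 0` makes the system matrix symmetric positive definite, the Gauss-Seidel iteration converges; a Cholesky factorization, the other option the abstract mentions, would be a direct alternative for the same system.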

Bibliographic details

Source
Odyssey 2010: The Speaker and Language Recognition Workshop | 2010 | pp. 125-132 | 8 pages
Conference location: Brno (CS)
Authors
Daniel Garcia-Romero; Carol Y. Espy-Wilson
Author affiliation

Department of Electrical and Computer Engineering, University of Maryland, College Park, MD (both authors)

Original format: PDF
Language: English
Chinese Library Classification: Speech signal processing
Date added to database: 2022-08-26 14:12:03

Similar literature

1. Joint speaker separation and recognition using non-negative matrix deconvolution with adaptive dictionary [J]. Szymon Drgas, Tuomas Virtanen. Computer Speech and Language. 2021, no. Nova
2. Robust Speaker Verification With Joint Sparse Coding Over Learned Dictionaries [J]. Haris B.C., Sinha R. IEEE Transactions on Information Forensics and Security. 2015, no. 10
3. Joint Factor Analysis Versus Eigenchannels in Speaker Recognition [J]. Kenny P., Boulianne G., Ouellet P. IEEE Transactions on Audio, Speech and Language Processing. 2007, no. 4
4. A joint factor analysis model for handling mismatched recording conditions in forensic automatic speaker recognition [C]. Moreno Victor Alonso, Drygajlo Andrzej. 2012 5th IAPR International Conference on Biometrics (ICB). 2012
5. Reducing Covariate Factors of Gait Recognition Using Feature Selection, Dictionary-Based Sparse Coding, and Deep Learning [D]. Alotaibi, Munif. 2017
6. Learning Dictionaries of Sparse Codes of 3D Movements of Body Joints for Real-Time Human Activity Understanding [O]. Jin Qi, Zhiyong Yang
7. Comparison of Scoring Methods Used in Speaker Recognition with Joint Factor Analysis [O]. Najim Dehak, Patrick Kenny. 2015
8. Separation of Undersampled Composite Signals Using the Dantzig Selector with Overcomplete Dictionaries [R]. Prater, A., Shen, L. 2014
