A fast and scalable hybrid FA/PPCA-based framework for speaker recognition

Srikanth R. Madikeri


Abstract

A text-independent speaker recognition system using a hybrid of Probabilistic Principal Component Analysis (PPCA) and the conventional i-vector modeling technique is proposed. In this framework, the total variability space (TVS) is estimated using PPCA, while the i-vectors of target speakers and test utterances are extracted using the conventional method. This leads to an appreciable decrease in development time, while the time required for training and testing remains unchanged. In this paper, an algorithmic optimization of the PPCA EM algorithm is developed, which is observed to provide a speed-up of 3.7×. To simplify the testing procedure, two approximation procedures are proposed for use in this framework. The first approximation assumes a covariance matrix computed based on the PPCA framework. The second approximation proposes an optimization that avoids inverting the precision matrix of the i-vector. A comparison of the time taken by these approximations with the baseline i-vector extraction procedure shows speed gains with some deterioration in performance in terms of the Equal Error Rate (EER). Among the proposed techniques, the best-case trade-off is a speed-up of 81.2× with a deterioration in performance of 0.7% in absolute terms. Speaker recognition performance is studied on the telephone conditions of the benchmark NIST SRE 2010 dataset, with systems built on the Mel Frequency Cepstral Coefficient (MFCC) feature. A trade-off in performance is observed when the proposed approximations are used. The scalability of these trade-offs is tested on the Mel Filterbank Slope (MFS) feature. The trade-offs observed with the approximations are reduced when the two systems are fused.
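The abstract does not give the optimized EM updates, so the following is only a minimal sketch of the generic textbook EM algorithm for PPCA (Tipping and Bishop), not the paper's accelerated variant. The function name `ppca_em`, its parameters, and the idea of feeding it one fixed-length vector per utterance are assumptions made for illustration; in the framework described above, the loading matrix `W` would play the role of the total variability space.

```python
import numpy as np


def ppca_em(X, latent_dim, n_iter=100, seed=0):
    """Fit standard PPCA by EM (generic textbook form, not the paper's method).

    X          : (n_samples, n_features) array, one vector per utterance (assumed).
    latent_dim : dimension of the latent space.
    Returns (mu, W, sigma2): data mean, loading matrix, isotropic noise variance.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu  # centered data

    W = rng.standard_normal((d, latent_dim))  # random initial loading matrix
    sigma2 = 1.0                              # initial noise variance

    for _ in range(n_iter):
        # E-step: posterior moments of the latent vectors.
        M = W.T @ W + sigma2 * np.eye(latent_dim)
        M_inv = np.linalg.inv(M)
        Ez = Xc @ W @ M_inv                    # row i is E[z_i]
        Ezz = n * sigma2 * M_inv + Ez.T @ Ez   # sum over i of E[z_i z_i^T]

        # M-step: update the loading matrix and the noise variance.
        W_new = (Xc.T @ Ez) @ np.linalg.inv(Ezz)
        sigma2 = (
            np.sum(Xc ** 2)
            - 2.0 * np.sum(Ez * (Xc @ W_new))
            + np.trace(Ezz @ W_new.T @ W_new)
        ) / (n * d)
        W = W_new

    return mu, W, sigma2
```

Each iteration inverts only `latent_dim`-sized matrices, which is one reason a PPCA-style estimate can be cheaper than per-utterance factor-analysis statistics; how the paper obtains its reported 3.7× speed-up is not specified in the abstract.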


Bibliographic details

Source: Digital Signal Processing, 2014, 9 pages
Author: Srikanth R. Madikeri
Original format: PDF
Language: English
Classification (Chinese Library Classification): Digital signal processing
Keywords: Speaker recognition; i-vectors; PPCA; Fast i-vector extraction

