Mahalanobis based emission model for speaker diarization of telephone conversations


Abstract

The primary objective of any speaker diarization system is to assign each speech segment to one of the K speakers in a conversation. This work focuses on telephone conversations, where the number of speakers is known and equal to 2. We use a hidden-distortion-model (HDM)-based system. An HDM allows different emission models to be used as speaker models. Choosing emission models that properly represent the characteristics of the data is important for the system's performance. We investigate the effect of several codebook (CB)-based emission models using Euclidean and Mahalanobis distances. The Mahalanobis distance was chosen for its potential to better represent the spatial layout of the data, while constraints were imposed to keep the model from diverging. The influence of the different methods is evaluated on 108 telephone conversations taken from the LDC CallHome corpus. All experiments produced poorer results than the original SOM-based system (DER = 12.70%).
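The difference between the two distances named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy codebook, and the `ridge` diagonal loading (a simple stand-in for the unspecified constraints that keep the model from diverging) are all assumptions made for illustration. Each function returns the distortion of a feature vector with respect to a codebook, taken here as the smallest squared distance to any codeword.

```python
import numpy as np


def euclidean_distortion(x, codebook):
    """Smallest squared Euclidean distance from x to any codeword."""
    diffs = codebook - x  # shape (M, D): one row per codeword
    return float(np.min(np.sum(diffs * diffs, axis=1)))


def mahalanobis_distortion(x, codebook, covariances, ridge=1e-3):
    """Smallest squared Mahalanobis distance from x to any codeword.

    `ridge` is a hypothetical regularizer: it is added to the diagonal of
    each covariance so that the matrix stays invertible. The paper's actual
    constraints are not described in the abstract.
    """
    dim = x.shape[0]
    best = np.inf
    for mean, cov in zip(codebook, covariances):
        reg_cov = cov + ridge * np.eye(dim)
        diff = x - mean
        # Solve reg_cov * z = diff instead of forming an explicit inverse.
        d2 = float(diff @ np.linalg.solve(reg_cov, diff))
        best = min(best, d2)
    return best


# Toy two-codeword codebook in 2-D (values are illustrative only).
codebook = np.array([[0.0, 0.0], [4.0, 0.0]])
# Each codeword's cluster is wide along axis 0 and narrow along axis 1.
covariances = np.array([np.diag([4.0, 0.25]), np.diag([4.0, 0.25])])

x = np.array([2.0, 0.0])  # displaced along the high-variance axis
y = np.array([0.0, 1.0])  # displaced along the low-variance axis

for name, v in (("x", x), ("y", y)):
    print(name,
          euclidean_distortion(v, codebook),
          mahalanobis_distortion(v, codebook, covariances))
```

With these toy values, the Euclidean distortion ranks `y` as closer to the codebook than `x`, while the Mahalanobis distortion ranks `x` as closer, because a displacement along a high-variance direction is penalized less. This is the sense in which the Mahalanobis distance can reflect the spatial layout of the data.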


Bibliographic details

Source
IEEE Convention of Electrical and Electronics Engineers in Israel | 2014 | 5 pages
Authors
Furmanov Tal; Aminov Lidiya; Moyal Ami; Lapidot Itshak
Original format: PDF
Chinese Library Classification: Fundamental theory of automation
Keywords
mobile radio; principal component analysis; speaker recognition; Euclidean distances; HDM-based system; K speakers; LDC CallHome corpus; Mahalanobis-based emission model; Mahalanobis distance; data characteristics; hidden-distortion model (HDM); several codebooks; spatial layout; speaker diarization; speaker models; speech segment designation; telephone conversations; covariance matrices; density estimation robust algorithm; hidden Markov models; speech; standards; training; vectors; K-means; self-organizing maps (SOM)


