2014 IEEE 28th Convention of Electrical & Electronics Engineers in Israel

Mahalanobis based emission model for speaker diarization of telephone conversations


Abstract

The primary objective of any speaker diarization system is to assign each speech segment to one of K speakers in the conversation. In this work we focus on telephone conversations, where the number of speakers is known and equal to 2. We use a hidden-distortion-model (HDM)-based system. HDM allows different emission models to be used as speaker models. The choice of adequate emission models that properly represent the data characteristics is important for the system's performance. We investigate the effect of several codebook (CB)-based emission models, with Euclidean and Mahalanobis distances. The Mahalanobis distance was chosen due to its potential to produce a better representation of the data's spatial layout, while constraints were imposed to keep the model from diverging. The influence of the different methods is evaluated using 108 telephone conversations taken from the LDC CallHome corpus. All the experiments achieved results poorer than the original SOM-based system (diarization error rate, DER = 12.70%).
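
As a rough illustration of how the two distances differ when scoring a frame against a codebook, here is a minimal sketch. It is not the authors' implementation: the function names, the restriction to 2-D vectors, the single covariance shared by all codewords, and the diagonal `floor` term (one possible way to keep the covariance invertible) are all assumptions made for this example.

```python
def euclidean_sq(x, c):
    """Squared Euclidean distance between a feature vector x and a codeword c."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def mahalanobis_sq(x, c, cov, floor=1e-3):
    """Squared Mahalanobis distance for 2-D vectors.

    cov is a symmetric 2x2 covariance [[a, b], [b, d]]. The hypothetical
    `floor` is added to the diagonal so the matrix stays invertible.
    """
    a = cov[0][0] + floor
    b = cov[0][1]
    d = cov[1][1] + floor
    det = a * d - b * b
    dx, dy = x[0] - c[0], x[1] - c[1]
    # Quadratic form (dx, dy) * inverse(cov) * (dx, dy)^T, using the
    # closed-form inverse of a 2x2 matrix: 1/det * [[d, -b], [-b, a]].
    return (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det


def nearest_codeword(x, codebook, dist):
    """Return (index, distortion) of the closest codeword under dist."""
    scores = [dist(x, c) for c in codebook]
    best = min(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical speaker codebook and covariance (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0)]
cov = [[1.0, 0.9], [0.9, 1.0]]

# Two frames at the same Euclidean distance from the codeword (1, 1):
along = (2.0, 2.0)   # displaced along the direction of high correlation
across = (2.0, 0.0)  # displaced against it

d_along = mahalanobis_sq(along, codebook[1], cov)
d_across = mahalanobis_sq(across, codebook[1], cov)
```

Both frames lie at squared Euclidean distance 2 from the codeword (1, 1), but the Mahalanobis distance penalizes the `across` frame far more heavily because it lies off the main axis of the covariance. This is the sense in which the distance can reflect the spatial layout of the data.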
