Conference: 2017 IEEE Automatic Speech Recognition and Understanding Workshop

Multi-view (Joint) probability linear discrimination analysis for J-vector based text dependent speaker verification



Abstract

The j-vector has proven very effective for text-dependent speaker verification with short-duration speech. However, current back-end classifiers cannot make full use of such deep features. In this paper, we propose a method that models the multi-faceted information in the j-vector, such as speaker identity and text content, explicitly and jointly. In our approach, the j-vector is modeled as the output of a generative multi-view (joint) Probabilistic Linear Discriminant Analysis (PLDA) model that contains multiple kinds of latent variables. The standard PLDA model considers only a single label. In practice, however, when a multi-task-learned network is used as the feature extractor, the extracted features are always associated with several labels. Features of this type (e.g., the j-vector) are called multi-view deep features. With multi-view (joint) PLDA, we can explicitly build a model that combines multiple heterogeneous kinds of information from the j-vectors. In the verification step, we compute a likelihood describing whether two j-vectors have consistent labels, and this likelihood is used in the subsequent decision. Experiments were conducted on large-scale corpora in different languages. On the public RSR2015 corpus, our approach achieves 0.02% EER and 0.09% EER for the impostor-wrong and impostor-correct cases, respectively.
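
The abstract does not give the model equations, so the following is only a minimal sketch of how a two-latent-variable PLDA score could look. It assumes the linear-Gaussian form x = mu + S u + T v + eps, with a speaker factor u ~ N(0, I), a text factor v ~ N(0, I), and residual noise eps ~ N(0, Sigma). The names `joint_plda_llr`, `gaussian_logpdf`, `S`, `T`, and `Sigma` are hypothetical, the parameters are random rather than trained, and the alternative hypothesis is simplified to "both speaker and text differ", which may not match the hypotheses scored in the paper.

```python
import numpy as np


def gaussian_logpdf(z, cov):
    """Log density of a zero-mean Gaussian N(0, cov) evaluated at z."""
    d = z.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = z @ np.linalg.solve(cov, z)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


def joint_plda_llr(x1, x2, mu, S, T, Sigma):
    """Log-likelihood ratio: same speaker AND same text vs. both different.

    Assumed model (not taken from the source): x = mu + S u + T v + eps.
    """
    B = S @ S.T + T @ T.T          # covariance shared when u and v are shared
    W = B + Sigma                  # total covariance of a single vector
    z = np.concatenate([x1 - mu, x2 - mu])
    zeros = np.zeros_like(B)
    cov_same = np.block([[W, B], [B, W]])          # labels consistent
    cov_diff = np.block([[W, zeros], [zeros, W]])  # vectors independent
    return gaussian_logpdf(z, cov_same) - gaussian_logpdf(z, cov_diff)


# Toy parameters (random, purely illustrative; a real system would train them).
rng = np.random.default_rng(0)
dim, n_spk, n_txt = 6, 2, 2
mu = rng.normal(size=dim)
S = rng.normal(size=(dim, n_spk))
T = rng.normal(size=(dim, n_txt))
Sigma = 0.1 * np.eye(dim)


def sample(u, v):
    """Draw one synthetic vector for speaker factor u and text factor v."""
    eps = rng.multivariate_normal(np.zeros(dim), Sigma)
    return mu + S @ u + T @ v + eps


u, v = rng.normal(size=n_spk), rng.normal(size=n_txt)
u_other, v_other = rng.normal(size=n_spk), rng.normal(size=n_txt)
target_score = joint_plda_llr(sample(u, v), sample(u, v), mu, S, T, Sigma)
nontarget_score = joint_plda_llr(sample(u, v), sample(u_other, v_other), mu, S, T, Sigma)
print("target trial LLR:", target_score)
print("nontarget trial LLR:", nontarget_score)
```

In this sketch a trial would be accepted when the score exceeds a threshold tuned on development data. Covering the impostor-correct case (wrong speaker, correct text) separately would require additional alternative hypotheses in which only one of the two latent factors is shared.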
