Journal: Audio, Speech, and Language Processing, IEEE/ACM Transactions on

Improving Short Utterance Speaker Recognition by Modeling Speech Unit Classes

Abstract

Short utterance speaker recognition (SUSR) is highly challenging due to limited enrollment and/or test data. We argue that the difficulty can be largely attributed to the mismatch between the prior distribution of the speech data used to train the universal background model (UBM) and those of the enrollment and test data. This paper presents a novel solution that distributes speech signals into a multitude of acoustic subregions defined by speech units, and models speakers within the subregions. To avoid data sparsity, a data-driven approach is proposed to cluster speech units into speech unit classes, on which robust subregion models can be built. Furthermore, we propose a model synthesis approach based on maximum likelihood linear regression (MLLR) to deal with speech unit classes that have no data. The experiments were conducted on a publicly available database, SUD12. The results demonstrated that on a text-independent speaker recognition task where the test utterances are no longer than 2 seconds and mostly shorter than 0.5 seconds, the proposed subregion modeling offered a 21.51% relative reduction in equal error rate (EER) compared with the standard GMM-UBM baseline. In addition, with the model synthesis approach, the performance can be greatly improved in scenarios where no enrollment data are available for some speech unit classes.
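The abstract reports its main result as a relative reduction in EER. As a rough illustration of what that figure measures, the sketch below approximates an EER from verification scores and computes a relative reduction between two systems. The function names and the toy scores are assumptions made for this example; they are not taken from the paper, and the paper's own evaluation code may differ.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The value
    returned is the mean of the false-rejection and false-acceptance
    rates at the threshold where the two rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, best_eer = None, None
    for t in thresholds:
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        gap = abs(frr - far)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (frr + far) / 2
    return best_eer


def relative_reduction(baseline_eer, system_eer):
    """Relative EER reduction of a system over a baseline, in percent."""
    return 100.0 * (baseline_eer - system_eer) / baseline_eer


# Toy scores, invented purely for illustration (not results from the paper).
targets = [2.0, 1.5, 0.2, 1.0]
impostors = [0.1, 0.5, -1.0, 0.0]
eer = equal_error_rate(targets, impostors)
```

Under this definition, a 21.51% relative reduction means the proposed system's EER is about 0.7849 times the baseline EER, whatever the baseline's absolute value.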
