European Signal Processing Conference

Test token driven acoustic balancing for sparse enrollment data in cohort GMM speaker recognition


Abstract

In this study, we address the problem of sparse training and test data for in-set/out-of-set speaker recognition. Sparse enrollment data presents a unique challenge because it covers too little of the acoustic space. The proposed algorithm focuses on filling acoustic holes and fortifying the acoustic information using the histogram of the claimed speaker's test token. This scheme is made possible by using a GMM to classify the speaker's phone information at the feature level. Parallel GMM training with EM using the most frequently occurring (top) and least frequently occurring (bottom) acoustic features is called “Top-Down Bottom-Up (TDBU)”, and the method that applies the acoustic token histogram of the test token to TDBU is called “TDBU using Test Token Histogram (TTH)”. Because TTH provides histogram information about the test data, the most frequent (top) parts of the test data fortify the model's discriminating ability using the same acoustic tokens in the enrollment data. The least frequent (bottom) parts of the test data provide acoustic hole information, so that mismatched acoustic holes between the enrollment and test data have a chance of being filled. The TDBU-TTH method is evaluated on telephone conversation speech from the FISHER corpus with 5-second training sets. With 2-second test data, TDBU-TTH improves EER by an average of 2.17% absolute over TDBU and by an average of 4.03% absolute over the GMM-UBM baseline. The proposed algorithm is a noteworthy step toward compensating for both sparse enrollment data and limited test data.
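The abstract does not give implementation details, so the following is only a minimal sketch of what a test token histogram over GMM components might look like. It assumes a diagonal-covariance GMM and hard assignment of each frame to its best-scoring component. The function names (`token_histogram`, `top_and_bottom`) and the toy 1-D parameters are hypothetical and are not taken from the paper.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at frame x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def token_histogram(frames, weights, means, variances):
    """Count, per GMM component, the frames for which it scores highest.

    Hard assignment is an assumption of this sketch, not the paper's method.
    """
    hist = [0] * len(weights)
    for x in frames:
        scores = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        hist[scores.index(max(scores))] += 1
    return hist


def top_and_bottom(hist, n):
    """Return indices of the n most and the n least occupied components."""
    order = sorted(range(len(hist)), key=lambda k: (-hist[k], k))
    return order[:n], order[-n:][::-1]


# Toy 1-D example: three components centred at 0, 5 and 10 (made-up values).
weights = [1 / 3, 1 / 3, 1 / 3]
means = [[0.0], [5.0], [10.0]]
variances = [[1.0], [1.0], [1.0]]
test_frames = [[0.1], [-0.2], [0.3], [4.9], [5.2], [9.8]]

hist = token_histogram(test_frames, weights, means, variances)
top, bottom = top_and_bottom(hist, 1)
```

In this sketch, `top` would name the components that the test token occupies most often, and `bottom` the sparsely occupied ones that the abstract refers to as acoustic holes. How the paper then uses these sets in parallel EM training is not specified in the abstract and is not modelled here.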
