Conference: ACM international conference on Multimedia

Unfolding speaker clustering potential

Abstract

Speaker clustering is the task of grouping a set of speech utterances into speaker-specific classes. The basic techniques for solving this task are similar to those used for speaker verification and identification. The hypothesis of this paper is that the techniques originally developed for speaker verification and identification are not sufficiently discriminative for speaker clustering. However, the processing chain for speaker clustering is quite long, so there are many potential areas for improvement. The question is: at which stage should changes be made to improve the final result? To answer this question, this paper takes a biomimetic approach based on a study in which human participants take the place of an automatic speaker clustering system. Our findings are twofold: the modeling stage has the highest potential, and information about the temporal succession of frames is crucially missing. Experimental results on TIMIT data with our implementation of a speaker clustering system that incorporates these findings show the validity of our approach.