Speaker Diarization and Linking of Meeting Data

Abstract

Finding who spoke when in a collection of recordings, with speakers uniquely identified across the database, is a challenging task. In this scenario, reasonable computing times and acoustic variation across recordings remain two major concerns for state-of-the-art speaker diarization systems. This paper extends prior work on diarizing large speech datasets using algorithms that scale well with increasing amounts of data while compensating for across-recording variability. We follow a two-stage approach of speaker diarization followed by speaker linking, the former focusing on local within-recording speaker changes and the latter on global speaker changes across the database. In this study, we explore how these two modules interact with each other, and propose a diarization fusion approach that prevents diarization errors from propagating to the linking stage. We further explore diarization fusion for speaker linking using different linking strategies and speaker modeling variants. Evaluation on single distant microphone data from the Augmented Multi-party Interaction (AMI) corpus shows the effectiveness of the fusion approach after speaker linking and intersession variability modeling via joint factor analysis.
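
To make the second stage of the two-stage approach concrete, the following is a minimal sketch of speaker linking. It is not the paper's algorithm. It assumes that diarization has already produced one hypothetical fixed-length vector per local speaker cluster, and it links those clusters across recordings by greedy agglomerative clustering on cosine distance. The function name `link_speakers`, the `threshold` value, the centroid-based merging, and the rule that two clusters from the same recording are never merged are all illustrative assumptions.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / norm


def centroid(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def link_speakers(local_clusters, threshold=0.3):
    """Assign a global speaker id to every local diarization cluster.

    local_clusters: dict mapping (recording_id, local_label) -> vector.
    The vectors and the threshold are assumptions made for illustration.
    """
    # Start with one group per local cluster (stage-one diarization output).
    groups = [[key] for key in sorted(local_clusters)]
    while True:
        best = None
        for i, j in combinations(range(len(groups)), 2):
            recs_i = {rec for rec, _ in groups[i]}
            recs_j = {rec for rec, _ in groups[j]}
            # Assumed constraint: diarization already separated speakers
            # within a recording, so groups sharing a recording stay apart.
            if recs_i & recs_j:
                continue
            d = cosine_distance(
                centroid([local_clusters[k] for k in groups[i]]),
                centroid([local_clusters[k] for k in groups[j]]),
            )
            if d < threshold and (best is None or d < best[0]):
                best = (d, i, j)
        if best is None:
            break  # no remaining pair is close enough to merge
        _, i, j = best
        groups[i].extend(groups[j])
        del groups[j]
    return {key: gid for gid, group in enumerate(groups) for key in group}
```

Because any mistake in the local clusters is carried directly into the linking input, a sketch like this also illustrates why the abstract emphasizes keeping diarization errors from propagating to the linking stage.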