European Signal Processing Conference

A MULTIMODAL APPROACH TO INITIALISATION FOR TOP-DOWN SPEAKER DIARIZATION OF TELEVISION SHOWS



Abstract

This paper presents a new multimodal approach to speaker diarization of TV show data. We hypothesize that intra-speaker variation in visual information might be smaller than that in the corresponding acoustic information, and that visual information might therefore be better suited to the task of speaker model initialisation. Initialisation is an acknowledged weakness of the computationally efficient top-down approach to speaker diarization used here. Experimental results show that a recently proposed approach to purification and the new multimodal approach to initialisation together deliver relative improvements in diarization performance of 22% and 17% over the baseline system on independent development and evaluation datasets, respectively.
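
The relative improvements quoted above can be read as relative reductions in an error metric. The abstract does not name the metric; the sketch below assumes a lower-is-better score such as diarization error rate (DER). The function name and the numbers are hypothetical, chosen only to illustrate the arithmetic, and are not results from the paper.

```python
def relative_improvement(baseline_error: float, new_error: float) -> float:
    """Return the relative reduction in an error metric, as a percentage of the baseline.

    Assumes a lower-is-better metric (for example DER); this is an assumption,
    since the abstract only speaks of "diarization performance".
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Hypothetical error rates in percent, NOT taken from the paper.
baseline_der = 25.0
improved_der = 20.0
print(f"{relative_improvement(baseline_der, improved_der):.1f}% relative improvement")
```

Under this reading, a relative improvement depends on the baseline: the same absolute reduction counts for more when the baseline error is already low.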
