
A MULTIMODAL APPROACH TO INITIALISATION FOR TOP-DOWN SPEAKER DIARIZATION OF TELEVISION SHOWS


Abstract

This paper presents a new multimodal approach to speaker diarization of TV show data. We hypothesize that intra-speaker variation in visual information might be smaller than in the corresponding acoustic information, and that visual information might therefore be better suited to the task of speaker model initialisation. Initialisation is an acknowledged weakness of the computationally efficient top-down approach to speaker diarization that is used here. Experimental results show that a recently proposed approach to purification and the new multimodal approach to initialisation together deliver relative improvements in diarization performance of 22% and 17% over the baseline system on independent development and evaluation datasets, respectively.

Bibliographic details

Source: European Signal Processing Conference, 2010, 5 pages
Authors: Simon Bozonnet; Felicien Vallet; Nicholas Evans; Slim Essid; Gael Richard; Jean Carrive
Original format: PDF
Chinese Library Classification: TN911.7-53
