Speaker diarization of meetings based on speaker role n-gram models

Abstract

Speaker diarization of meeting recordings is generally based on acoustic information, ignoring the fact that meetings are instances of conversations. Several recent works have shown that the sequence of speakers in a conversation and their roles are related and statistically predictable. This paper proposes using a speaker-role n-gram model to capture the probability of conversation patterns and investigates its use as prior information in a state-of-the-art diarization system. Experiments are run on the AMI corpus, which is annotated with speaker roles. The proposed technique reduces the diarization speaker error by 19% when the roles are known and by 17% when they are estimated. The paper also investigates how the n-gram models generalize to other settings, such as those of the Rich Transcription campaigns. Experiments on 17 meetings show that the speaker error can be reduced by 12% in this case as well, indicating that the n-gram model can generalize across corpora.
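
As a rough illustration of the idea in the abstract, the sketch below estimates a bigram model over speaker roles and uses it as a transition prior in Viterbi decoding over speaker turns. It is not the authors' system: the function names, the add-alpha smoothing, the `weight` parameter, the fixed speaker-to-role mapping, and the toy role labels and scores are all assumptions made for this example.

```python
import math


def train_role_bigram(role_sequences, roles, alpha=1.0):
    """Estimate log P(role_t | role_{t-1}) with add-alpha smoothing (assumed)."""
    counts = {prev: {cur: alpha for cur in roles} for prev in roles}
    for seq in role_sequences:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    logp = {}
    for prev in roles:
        total = sum(counts[prev].values())
        logp[prev] = {cur: math.log(counts[prev][cur] / total) for cur in roles}
    return logp


def viterbi_turns(acoustic_loglik, speaker_role, role_logp, weight=1.0):
    """Pick the best speaker per turn from acoustic scores plus a role prior.

    acoustic_loglik: list (one entry per turn) of {speaker: log-likelihood}.
    speaker_role:    {speaker: role}, assumed known here.
    role_logp:       output of train_role_bigram.
    weight:          hypothetical scaling factor for the role prior.
    """
    speakers = list(speaker_role)
    best = {s: acoustic_loglik[0][s] for s in speakers}  # uniform start prior
    backpointers = []
    for scores in acoustic_loglik[1:]:
        new_best, pointers = {}, {}
        for cur in speakers:
            def total(prev):
                prior = role_logp[speaker_role[prev]][speaker_role[cur]]
                return best[prev] + weight * prior
            prev = max(speakers, key=total)
            new_best[cur] = total(prev) + scores[cur]
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final speaker.
    path = [max(speakers, key=best.get)]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy data: two roles that tend to alternate, and an acoustically
# ambiguous middle turn that the role prior resolves.
roles = ["PM", "ME"]
role_logp = train_role_bigram([["PM", "ME", "PM", "ME", "PM"]], roles)
speaker_role = {"A": "PM", "B": "ME"}
acoustic = [
    {"A": -1.0, "B": -5.0},
    {"A": -2.0, "B": -2.0},  # tie on acoustics alone
    {"A": -1.0, "B": -5.0},
]
print(viterbi_turns(acoustic, speaker_role, role_logp))  # ['A', 'B', 'A']
```

In this toy setting the prior alone decides the middle turn. Setting `weight=0.0` would remove the role model's influence and leave the decision to the acoustic scores.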


Bibliographic details

Source
2011 IEEE International Conference on Acoustics, Speech and Signal Processing | 2011 | pp. 4416-4419 | 4 pages
Authors
Valente Fabio; Vijayasenan Deepu; Motlicek Petr
Original format: PDF
Chinese Library Classification: Communication theory
Keywords
Speaker roles; speaker diarization; Viterbi decoding; meeting recordings; multi-party conversations


Similar literature

1. Generalized Viterbi-based models for time-series segmentation and clustering applied to speaker diarization [J]. Itshak Lapidot, Alon Shoa, Tal Furmanov. Computer Speech and Language. 2017, issue Sepa
2. Multimodal speaker diarization for meetings using volume-evaluated SRP-PHAT and video analysis [J]. Cabanas-Molero P., Lucena M., Fuertes J. M. Multimedia Tools and Applications. 2018, issue 20
3. Speaker diarization and linking of meeting data [J]. Marc Ferràs, Srikanth Madikeri, Hervé Bourlard. IEEE/ACM Transactions on Audio, Speech, and Language Processing. 2016, issue 11
4. Speaker diarization of meetings based on speaker role n-gram models [C]. Fabio Valente, Deepu Vijayasenan, Petr Motlicek. IEEE International Conference on Acoustics, Speech and Signal Processing. 2011
5. Use of speaker location features in meeting diarization [D]. Otterson, Scott. 2008
6. Multimodal speaker diarization using a pre-trained audio-visual synchronization model [O]. Rehan Ahmad, Syed Zubair, Hani Alquhayz. 2019
7. Speaker diarization of meetings based on speaker role n-gram models [O]. Fabio Valente, Deepu Vijayasenan, Petr Motlicek. 2015
8. Speaker adaptation of language models for automatic dialog act segmentation of meetings [R]. Kolar, J., Liu, Y., Shriberg, E. 2007
