Conference: 2011 IEEE International Conference on Acoustics, Speech and Signal Processing

Speaker diarization of meetings based on speaker role n-gram models



Abstract

Speaker diarization of meeting recordings is generally based on acoustic information, ignoring the fact that meetings are instances of conversations. Several recent works have shown that the sequence of speakers in a conversation and their roles are related and statistically predictable. This paper proposes using an n-gram model over speaker roles to capture the probability of conversation patterns, and investigates its use as prior information in a state-of-the-art diarization system. Experiments are run on the AMI corpus, which is annotated in terms of roles. The proposed technique reduces the diarization speaker error by 19% when the roles are known and by 17% when they are estimated. Furthermore, the paper investigates how the n-gram models generalize to different settings, such as those from the Rich Transcription campaigns. Experiments on 17 meetings reveal that the speaker error can be reduced by 12% in this case as well, indicating that the n-gram models can generalize across corpora.
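To make the idea of a role n-gram prior concrete, the following is a minimal sketch of a bigram (n = 2) model over speaker roles. It is not the paper's implementation. The role labels, the toy training sequences, the add-alpha smoothing, and the names `train_bigram`, `sequence_log_prior`, and `combined_score` are all assumptions made for illustration, and the way the prior is added to an acoustic score is one plausible choice, not the method reported in the abstract.

```python
from collections import Counter
from math import log

# Assumed role labels for illustration (four roles per meeting).
ROLES = ["PM", "ME", "UI", "ID"]


def train_bigram(sequences, roles, alpha=1.0):
    """Estimate P(role_t | role_{t-1}) from role sequences with add-alpha smoothing."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    vocab = len(roles)
    return {
        (prev, cur): (pair_counts[(prev, cur)] + alpha)
        / (context_counts[prev] + alpha * vocab)
        for prev in roles
        for cur in roles
    }


def sequence_log_prior(seq, probs):
    """Log-probability of a role sequence under the bigram model (first turn not scored)."""
    return sum(log(probs[(prev, cur)]) for prev, cur in zip(seq, seq[1:]))


def combined_score(acoustic_loglik, seq, probs, weight=1.0):
    """Hypothetical combination: acoustic log-likelihood plus a weighted role prior."""
    return acoustic_loglik + weight * sequence_log_prior(seq, probs)


# Made-up toy data: each list is the sequence of roles taking turns in one meeting.
toy_meetings = [
    ["PM", "ME", "PM", "UI", "PM", "ID"],
    ["PM", "UI", "PM", "ME", "PM"],
]
bigram = train_bigram(toy_meetings, ROLES)

# Compare two candidate turn sequences under the learned prior.
frequent = sequence_log_prior(["PM", "ME", "PM"], bigram)
rare = sequence_log_prior(["ME", "UI", "ID"], bigram)
print(frequent > rare)
```

In this sketch, a candidate speaker sequence that follows turn-taking patterns seen in training receives a higher prior than one that does not, which is the kind of information an acoustic-only system ignores. The `weight` parameter is a hypothetical scaling factor for balancing the two terms.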
