Incorporating Prior Knowledge into Speaker Diarization and Linking for Identifying Common Speaker



Abstract

Speaker diarization and linking discovers “who spoke when” across recordings without any speaker enrollment. Diarization is performed on each recording separately, and linking then merges clusters of the same speaker across recordings. This two-step approach suffers from errors in the diarization step propagating to the linking step. In a situation where a unique speaker appears across a given set of recordings, this paper aims to locate that common speaker using prior knowledge of his or her existence; no enrollment data is available for the common speaker. We propose the Pairwise Common Speaker Identification (PCSI) method, which, in contrast to the two-step approach, takes the existence of a common speaker into account. We further show that PCSI can reduce the errors introduced in the diarization step of the two-step approach. Our experiments are performed on a corpus synthesised from the AMI corpus and on an in-house conversational telephony corpus of Sichuanese mixed with Mandarin. We show relative improvements of up to 7.68% in time-weighted equal error rate over a state-of-the-art x-vector diarization and linking system.
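To make the idea of a "common speaker prior" concrete, the sketch below picks, for two recordings, the cross-recording pair of diarization clusters whose embeddings are most similar. It is only an illustration and not the paper's PCSI method: the function names, the use of cosine similarity, the toy 3-dimensional embeddings, and the assumption that exactly one speaker is shared between the two recordings are all hypothetical choices made for this example.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def find_common_speaker(
    rec_a: Dict[str, Vector], rec_b: Dict[str, Vector]
) -> Tuple[str, str, float]:
    """Return the cross-recording cluster pair with the highest similarity.

    Assumption (hypothetical, for illustration): exactly one speaker is
    shared by the two recordings, so the best-scoring pair is taken as
    that speaker instead of thresholding every pair independently.
    """
    best = None
    for name_a, emb_a in rec_a.items():
        for name_b, emb_b in rec_b.items():
            score = cosine(emb_a, emb_b)
            if best is None or score > best[2]:
                best = (name_a, name_b, score)
    if best is None:
        raise ValueError("both recordings need at least one cluster")
    return best


# Toy cluster embeddings (made-up values, one vector per diarized cluster).
rec_a = {"A0": [1.0, 0.0, 0.0], "A1": [0.0, 1.0, 0.0]}
rec_b = {"B0": [0.0, 0.0, 1.0], "B1": [0.1, 0.9, 0.0]}

pair = find_common_speaker(rec_a, rec_b)
print(pair[0], pair[1])  # cluster labels of the most similar pair
```

With these toy vectors the selected pair is `A1` and `B1`. A real system would use speaker embeddings such as x-vectors and a trained scoring back end in place of raw cosine similarity.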


Bibliographic details

Source
IEEE Automatic Speech Recognition and Understanding Workshop | 2019 | pp. 697-703 | 7 pages
Authors
Tsun-Yat Leung; Lahiru Samarakoon; Albert Y.S. Lam
Original format: PDF
Keywords
Principal component analysis; NIST; Merging; Scalability; Bayes methods; Telephony; Error analysis


