Unsupervised classification of speaker roles in multi-participant conversational speech

Yanxiong Li; Qin Wang; Xue Zhang; Wei Li; Xinchao Li; Jichen Yang; Xiaohui Feng; Qian Huang; Qianhua He


Abstract

This paper proposes an unsupervised method for analyzing speaker roles in multi-participant conversational speech. First, features that characterize the differences between roles are extracted from the outputs of speaker diarization. Then, a role clustering algorithm, based on the criterion of maximizing the inter-cluster distance and requiring no convergence threshold, is proposed to determine the number of roles and to merge the utterances belonging to the same role into one cluster. The contributions of individual feature subsets and of their combinations are compared on the outputs of speaker diarization; the combined feature subsets obtain higher F scores than the individual ones for clustering speaker roles. The impacts of speaker diarization errors and of feature dimensionality on the performance of the proposed method are also discussed. Experiments are conducted on both manual annotations and the outputs of automatic speaker diarization to compare the proposed method with a state-of-the-art clustering method and a supervised method. Evaluations show that the proposed method outperforms the previous clustering method and comes close to the conventional supervised method in terms of F scores under both experimental conditions.
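The abstract does not spell out the clustering criterion, so the following is only a rough sketch of what threshold-free role clustering could look like, not the authors' algorithm. It assumes each speaker is already summarized by one feature vector derived from diarization output, merges clusters bottom-up by centroid distance, and then keeps the partition that precedes the largest jump in merge distance (a common largest-gap heuristic, swapped in here as a stand-in for the paper's "maximizing the inter-cluster distance" criterion). The function name `cluster_roles` and the toy features are hypothetical.

```python
import numpy as np


def cluster_roles(features):
    """Group speakers into roles without a stopping threshold.

    `features` holds one row per speaker (hypothetical per-speaker features).
    Returns an integer role label for each row.
    """
    x = np.asarray(features, dtype=float)
    n = len(x)
    if n < 3:
        # Too few speakers to compare merge distances; keep them separate.
        return np.arange(n)

    clusters = [[i] for i in range(n)]          # one cluster per speaker
    partitions = [[list(c) for c in clusters]]  # partitions[k]: state after k merges
    merge_dists = []                            # merge_dists[k]: distance of merge k

    while len(clusters) > 1:
        centroids = np.array([x[c].mean(axis=0) for c in clusters])
        # Pairwise Euclidean distances between cluster centroids.
        d = np.linalg.norm(centroids[:, None] - centroids[None, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        i, j = np.unravel_index(np.argmin(d), d.shape)
        merge_dists.append(d[i, j])
        i, j = min(i, j), max(i, j)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        partitions.append([list(c) for c in clusters])

    # Stop right before the merge that causes the largest jump in distance,
    # so the number of roles falls out of the data rather than a threshold.
    jumps = np.diff(merge_dists)
    best = int(np.argmax(jumps)) + 1

    labels = np.empty(n, dtype=int)
    for label, members in enumerate(partitions[best]):
        labels[members] = label
    return labels


if __name__ == "__main__":
    # Toy data: six speakers forming three well-separated groups.
    toy = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0],
           [11.0, 0.0], [0.0, 10.0], [1.0, 10.0]]
    print(cluster_roles(toy))
```

On the toy data the three tight pairs are merged first, and the next merge would have to bridge a much larger distance, so the sketch stops at three roles. Real role features (and the paper's actual criterion) may behave differently.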


Bibliographic details

Source
Computer Speech and Language | 2017, Issue 3 | pp. 81-99 | 19 pages
Authors
Yanxiong Li; Qin Wang; Xue Zhang; Wei Li; Xinchao Li; Jichen Yang; Xiaohui Feng; Qian Huang; Qianhua He;
Author affiliations

School of Electronic and Information Engineering, South China University of Technology, Room 223, Shaw Science Building, 381 Wushan Road, Guangzhou, China;

School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou, China;

School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou, China;

School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou, China;

School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou, China;

School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou, China;

School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou, China;

School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou, China;

School of Electronic and Information Engineering, South China University of Technology, 381 Wushan Road, Guangzhou, China;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Speaker role; Speaker diarization; Role clustering; Multi-participant conversational speech;


