Speaker diarization aims at inferring who spoke when in an audio stream and involves two simultaneous unsupervised tasks: (1) estimating the number of speakers, and (2) associating speech segments with each speaker. Most recent efforts in the domain have addressed the problem with machine learning techniques or statistical methods (for a review see [11]), ignoring the fact that the data consists of instances of human conversations. When humans want to use language to communicate orally with each other, they are faced with a coordination problem. "Avoidance of collision is one obvious ground for this coordination of actions between the participants. In order to coordinate efficiently and successfully, they will therefore have to agree to follow certain rules of interaction" [8]. One such rule is that no single participant monopolizes the floor; instead, the participants take turns to speak. This concept is called turn-taking.

The computational linguistic literature is rich in analyses of human conversations; the seminal work of [9] shows that conversations obey predictable interaction patterns between participants: a speaker turn is related in predictable ways to the previous and next turns and follows a structure similar to a grammar. Among the social phenomena that regulate the turns in a conversation, much attention has been devoted to roles. People interact in different ways depending on the context of the environment, but "Their interactions involve behaviors associated with defined statuses and particular roles. These statuses and roles help to pattern our social interactions and provide predictability" [10].
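The two coupled unsupervised tasks named above can be sketched in a minimal, illustrative way (this is not the method of any cited work): given hypothetical per-segment embeddings, threshold-based agglomerative clustering simultaneously yields an estimated speaker count (the number of surviving clusters) and a segment-to-speaker assignment (the cluster labels). The embeddings, the distance threshold, and the helper names are assumptions for the sake of the example.

```python
# Toy diarization sketch: cluster segment embeddings; each final
# cluster is one inferred speaker. Both the speaker count and the
# segment-to-speaker assignment fall out of the same clustering.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def diarize(embeddings, threshold=1.0):
    """Agglomerative clustering with a stopping threshold (hypothetical setup)."""
    clusters = [[i] for i in range(len(embeddings))]  # one segment per cluster

    def centroid(cluster):
        dim = len(embeddings[0])
        return [sum(embeddings[i][d] for i in cluster) / len(cluster)
                for d in range(dim)]

    while True:
        best = None  # (distance, index_a, index_b) of the closest pair
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = euclidean(centroid(clusters[a]), centroid(clusters[b]))
                if d < threshold and (best is None or d < best[0]):
                    best = (d, a, b)
        if best is None:
            break  # no pair closer than the threshold: stop merging
        _, a, b = best
        clusters[a].extend(clusters.pop(b))

    labels = [0] * len(embeddings)
    for speaker, cluster in enumerate(clusters):
        for i in cluster:
            labels[i] = speaker
    # Number of speakers and segment assignment, jointly estimated.
    return len(clusters), labels

# Toy embeddings for five segments from two well-separated "speakers".
segments = [[0.0, 0.1], [0.1, 0.0], [5.0, 5.1], [5.1, 4.9], [0.05, 0.05]]
n_speakers, labels = diarize(segments, threshold=1.0)
```

On this toy input the clustering recovers two speakers, with segments 0, 1, and 4 grouped together and segments 2 and 3 forming the other group. Real diarization systems face the same coupling, but on noisy embeddings where the stopping criterion is far harder to set.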