Prosodic and other Long-Term Features for Speaker Diarization

Friedland G.; Vinyals O.; Yan Huang; Muller C.

Abstract

Speaker diarization is defined as the task of determining "who spoke when" given an audio track and no other prior knowledge of any kind. This article shows how a state-of-the-art speaker diarization system can be improved by combining traditional short-term features (MFCCs) with prosodic and other long-term features. First, we present a framework to study the speaker discriminability of 70 different long-term features. Then, we show how the top-ranked long-term features can be combined with short-term features to increase the accuracy of speaker diarization. The results were measured on standardized datasets (NIST RT) and show a consistent relative improvement of about 30% in diarization error rate compared to the best system presented at the NIST evaluation in 2007.

Bibliographic details

Source
Audio, Speech, and Language Processing, IEEE Transactions on | 2009, No. 5 | pp. 985-993 | 9 pages
Authors
Friedland G.; Vinyals O.; Yan Huang; Muller C.
Author affiliation
Int. Comput. Sci. Inst., Berkeley, CA
Original format: PDF
Text language: eng
Keywords
audio signal processing; cepstral analysis; MFCC; audio track; long-term features; mel-frequency cepstral coefficients; speaker diarization; speaker discriminability; prosody;
Added to database: 2022-08-18 01:26:13

Similar literature

1. Overlapping Speech Detection Using Long-Term Conversational Features for Speaker Diarization in Meeting Room Conversations [J]. Yella S.H., Bourlard H. Audio, Speech, and Language Processing, IEEE/ACM Transactions on. 2014, No. 12
2. Prosodic features-based speaker verification using speaker-specific-text for short utterances [J]. Jianwu Zhang, Jianchao He, Zhendong Wu. International Journal of Embedded Systems. 2017, No. 3
3. Speaker overlap detection with prosodic features for speaker diarisation [J]. Zelenak M., Hernando J. Signal Processing, IET. 2012, No. 8
4. Prosodic and Phonetic Features for Speaker Clustering in Speaker Diarization Systems [C]. Janez Zibert, France Mihelic. Annual conference of the International Speech Communication Association; INTERSPEECH 2011. 2011
5. Use of speaker location features in meeting diarization [D]. Otterson, Scott. 2008
6. Supervised Speaker Diarization Using Random Forests: A Tool for Psychotherapy Process Research [O]. Lukas Fürer, Nathalie Schenk, Volker Roth. 2020
7. Prosodic and other Long-Term Features for Speaker Diarization [O]. Gerald Friedland, Oriol Vinyals, Yan Huang. 2011
8. Robust Speech Processing & Recognition: Speaker ID, Language ID, Speech Recognition/Keyword Spotting, Diarization/Co-Channel/Environmental Characterization, Speaker State Assessment [R]. Hansen, J. H. 2015
