Multi-System Fusion of Extended Context Prosodic and Cepstral Features for Paralinguistic Speaker Trait Classification

Abstract

As automatic speech processing has matured, research attention has expanded to paralinguistic speech problems that aim to detect beyond-the-words information. This paper focuses on the identification of seven speaker trait categories from the Interspeech Speaker Trait Challenge: likeability, intelligibility, openness, conscientiousness, extraversion, agreeableness, and neuroticism. Our approach combines multiple features, including prosodic, cepstral, shifted-delta cepstral, and a reduced set of the OpenSMILE features. Our classification approaches included GMM-UBM, eigenchannel, support vector machines, and distance-based classifiers. Optimized feature reduction and logistic regression-based score calibration and fusion led to results that perform competitively against the challenge baseline in all categories.
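The abstract names logistic regression-based score calibration and fusion but gives no detail, so the following is only a minimal sketch of the general technique, not the authors' implementation. It assumes each subsystem (for example a GMM-UBM or an SVM) emits one raw score per trial, and it learns an affine map from those scores to a posterior probability. The function names, learning rate, iteration count, and the synthetic three-system data are all hypothetical and chosen purely for illustration.

```python
import numpy as np


def sigmoid(z):
    """Logistic function; the clip avoids overflow in exp for extreme inputs."""
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def log_loss(p, labels, eps=1e-12):
    """Mean binary cross-entropy between posteriors p and 0/1 labels."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p)))


def fit_fusion(scores, labels, lr=0.5, n_iter=500):
    """Fit fusion weights and a bias by batch gradient descent.

    scores: (n_trials, n_systems) raw subsystem scores.
    labels: (n_trials,) binary targets in {0, 1}.
    lr and n_iter are illustrative defaults (assumptions), not tuned values.
    """
    n_trials, n_systems = scores.shape
    w = np.zeros(n_systems)
    b = 0.0
    for _ in range(n_iter):
        err = sigmoid(scores @ w + b) - labels  # gradient of the loss w.r.t. the logit
        w -= lr * (scores.T @ err) / n_trials
        b -= lr * err.mean()
    return w, b


def fuse(scores, w, b):
    """Map raw subsystem scores to a single calibrated posterior per trial."""
    return sigmoid(scores @ w + b)


# Hypothetical data: three subsystems of differing quality scoring 400 trials.
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=400)
class_sign = 2 * labels - 1                    # -1 for class 0, +1 for class 1
separation = np.array([1.0, 0.5, 0.8])         # assumed per-system class separation
scores = class_sign[:, None] * separation + rng.normal(size=(400, 3))

w, b = fit_fusion(scores, labels)
posteriors = fuse(scores, w, b)
print("fusion weights:", w, "bias:", b)
print("training log-loss:", log_loss(posteriors, labels))
```

In this sketch a single affine map does both jobs: the learned weights combine the subsystems (fusion), and the sigmoid output can be read as a posterior probability (calibration). In practice the weights would be trained on held-out development scores rather than on the trials being evaluated.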

Bibliographic details

Source: Annual conference of the International Speech Communication Association, 2012, 4 pages
Authors: Michelle Hewlett Sanchez; Aaron Lawson; Dimitra Vergyri; Harry Bratt
Original format: PDF
Chinese Library Classification: Communications
Keywords: speaker traits; prosody; MFCCs; Gaussian mixture modeling