Discrimination Power of Vocal Source and Vocal Tract Related Features for Speaker Segmentation

Wai Nang Chan; Nengheng Zheng; Tan Lee

Abstract

This paper presents an analysis of the speaker discrimination power of vocal source related features, in comparison to the conventional vocal tract related features. The vocal source features, named wavelet octave coefficients of residues (WOCOR), are extracted by a pitch-synchronous wavelet transform of the linear predictive (LP) residual signals. Using a series of controlled experiments, it is shown that WOCOR is less sensitive to spoken content than the conventional MFCC features and thus more discriminative when the amount of training data is limited. These advantages of WOCOR are exploited in the task of speaker segmentation for telephone conversations, in which statistical speaker models need to be built upon short speech segments. Experimental results show that the proposed use of WOCOR leads to a noticeable reduction of segmentation errors.

Bibliographic record

Source: IEEE Transactions on Audio, Speech, and Language Processing, 2007, no. 6, pp. 1884-1892 (9 pages)
Authors: Wai Nang Chan; Nengheng Zheng; Tan Lee
Original format: PDF
Language: English
Chinese Library Classification: Automation technology; computer technology
Keywords: speech processing; statistical analysis; linear predictive residual signals; pitch-synchronous wavelet transform; segmentation errors reduction; speaker segmentation; statistical speaker; telephone conversation; training data; vocal source power discrimination; vo;

Similar literature

1. Robust Speaker Recognition Using Denoised Vocal Source and Vocal Tract Features [J]. Wang N., Ching P. C., Zheng N., Lee T. IEEE Transactions on Audio, Speech, and Language Processing. 2011, no. 1.
2. Discrimination of speaker sex and size when glottal-pulse rate and vocal-tract length are controlled [J]. Smith D. R. R., Walters T. C., Patterson R. D. The Journal of the Acoustical Society of America. 2007, no. 6.
3. Effective use of combined excitation source and vocal-tract information for speaker recognition tasks [J]. Krishna Dutta, Jagabandhu Mishra, Debadatta Pati. International Journal of Speech Technology. 2018, no. 4.
4. Speaker Identification by Combining Various Vocal Tract and Vocal Source Features [C]. Yuta Kawakami, Longbiao Wang, Atsuhiko Kai. International Conference on Text, Speech and Dialogue. 2014.
5. Speaker recognition using complementary information from vocal source and vocal tract [D]. Zheng, Nengheng. 2006.
6. Discrimination of speaker sex and size when glottal-pulse rate and vocal-tract length are controlled [O]. David R. R. Smith, Thomas C. Walters, Roy D. Patterson.
7. Robust speaker recognition using both vocal source and vocal tract features estimated from noisy input utterances [O]. 2007.
