Advancements in robust algorithm formulation for dialect and speaker recognition.

Abstract

The speech signal carries many levels of information beyond the text content itself, including speaker information (e.g., dialect/accent, gender, emotion, age, identity) and environment information (e.g., channel, background noise, room conditions). This thesis focuses on two tasks that identify such factors in the speech signal: automatic dialect classification and automatic speaker recognition.

This thesis proposes two novel algorithms to improve dialect classification for text-independent spontaneous speech in both Arabic and Spanish, along with probe results for Chinese. Within a Gaussian mixture model based framework, the algorithms use Kullback-Leibler divergence based mixture selection in the training phase and frame selection decoding in the testing phase. The main motivation of both algorithms is to suppress confusable/distracting regions of the dialect language space and to emphasize discriminative/sensitive information from the available dialects. In addition, because the differences among dialects are very subtle, performance is more sensitive to mismatch from other components of the speech signal. To compensate for this mismatch and focus on the intrinsic dialect properties, the well-known factor analysis based mismatch compensation approach is used and extended to compensate for various distortions (e.g., gender, speaker, and channel) in dialect identification, so that only dialect information is emphasized, thereby improving overall performance.

The second thesis goal addresses speaker recognition, where factor analysis, one of the most important techniques, is widely used in model training and channel compensation. The correlation between speaker and distortion (e.g., channel and additive noise) is analyzed and modeled. A resulting simplified version of the model is then used to fit the factor analysis approach under a joint factor analysis framework, since factor analysis has proven very effective for performance improvement. Next, to avoid the simplifying approximation in the joint factor analysis framework, the total variability model is studied, and a new supervised approach is proposed that preserves more speaker-specific information than the total variability model, which is an unsupervised probabilistic principal component analysis approach. In addition, a combination of the proposed supervised approach and the traditional unsupervised approach is proposed and evaluated. Evaluations are performed on NIST SRE-2008.

This thesis has therefore contributed improved modeling and classification strategies for dialect/accent recognition, as well as for speaker recognition, by leveraging discriminative knowledge learned during modeling. Such advancements will ultimately contribute to improved speech processing and language technology solutions.
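The abstract names Kullback-Leibler divergence based mixture selection but gives no formulas, so the following is only a minimal sketch of one way such a selection could look, not the thesis's actual algorithm. It assumes two dialect GMMs with diagonal covariances whose components are index-aligned (for example, both adapted from a shared background model), scores each component pair by the closed-form symmetric KL divergence, and keeps the highest-scoring components. The function names and the `keep` parameter are hypothetical.

```python
import numpy as np


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """Closed-form KL(p || q) for diagonal-covariance Gaussians.

    Inputs have shape (..., dim); the sum runs over the last axis,
    so a batch of components yields one divergence per component.
    """
    mu_p, var_p, mu_q, var_q = (
        np.asarray(a, dtype=float) for a in (mu_p, var_p, mu_q, var_q)
    )
    terms = np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    return 0.5 * np.sum(terms, axis=-1)


def symmetric_kl(mu_a, var_a, mu_b, var_b):
    """Symmetric divergence KL(a || b) + KL(b || a), per component."""
    return (kl_diag_gaussians(mu_a, var_a, mu_b, var_b)
            + kl_diag_gaussians(mu_b, var_b, mu_a, var_a))


def select_discriminative_mixtures(means_a, vars_a, means_b, vars_b, keep):
    """Return the indices of the `keep` components that differ most.

    Assumption (not from the source): component i of model A corresponds
    to component i of model B, so a large divergence marks a region of
    feature space where the two dialects are easy to tell apart.
    """
    scores = symmetric_kl(means_a, vars_a, means_b, vars_b)
    top = np.argsort(scores)[::-1][:keep]  # largest divergence first
    return np.sort(top), scores


# Toy example: three 2-dimensional components per dialect model.
means_a = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]])
means_b = np.array([[0.0, 0.0], [3.0, 0.0], [1.0, 1.5]])
unit_vars = np.ones((3, 2))

selected, scores = select_discriminative_mixtures(
    means_a, unit_vars, means_b, unit_vars, keep=2
)
print(selected, scores)
```

In this toy setup the first component is identical in both models and is dropped, which mirrors the stated goal of suppressing regions where the dialects are confusable. A real system would also need a rule for choosing `keep` and for handling mixture weights, neither of which the abstract specifies.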


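Bibliographic record

Author: Lei, Yun.
Author affiliation: The University of Texas at Dallas.
Degree-granting institution: The University of Texas at Dallas.
Subject: Engineering Electronics and Electrical.
Degree: Ph.D.
Year: 2011
Pages: 164 p.
Total pages: 164
Original format: PDF
Language of text: eng
Chinese Library Classification: Rehabilitation medicine
Date added to database: 2022-08-17 11:44:31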

Similar literature

1. LFBNN: Robust and Hybrid Training Algorithm to Neural Network for Hybrid Features-Enabled Speaker Recognition System [J]. Vasamsetti Srinivas, Santhi rani. Journal of Engineering Research. 2020, No. 2
2. An Adaptively Enhanced Auditory Transform Based Feature Extraction Algorithm for Robust Speaker Identification [J]. S.D. Umarani, R.S.D. Wahidabanu, P. Raviram. International journal of soft computing. 2013, No. 1
3. ROBUST FEATURES FOR NOISY TEXT-INDEPENDENT SPEAKER IDENTIFICATION USING GFCC ALGORITHM COMBINED TO VAD AND CMN TECHNIQUES [J]. E. B. TAZI, A. BENABBOU, M. HARTI. Journal of Theoretical and Applied Information Technology. 2012, No. 2
4. Robust Speaker Identification Algorithms and Results in Noisy Environments [C]. Bulent Ayhan, Chiman Kwan. International symposium on neural networks. 2018
5. Advancements in robust algorithm formulation for speaker identification of whispered speech. [D]. Fan, Xing. 2012
6. Diadochokinetic rate in Saudi and Bahraini Arabic speakers: Dialect and the influence of syllable type [O]. Majid I. Alshahwan, Patricia E. Cowell, Sandra P. Whiteside. 2020
7. Robust Speaker Recognition Algorithm [O]. Wenchao Hao, Yi Chen, Lei Wang. 2017
8. Noise Robust I-Vector Extractor Using Vector Taylor Series For Speaker Recognition. [R]. Lei, Y., Burget, L., Scheffer, N. 2013
