Automatic speaker recognition dynamic feature identification and classification using distributed discrete cosine transform based mel frequency cepstral coefficients and fuzzy vector quantization

Abstract

Mel-Frequency Cepstral Coefficient (MFCC) extraction is a leading approach to speech feature extraction, and current research aims to identify ways of improving its performance. This thesis presents a novel approach to MFCC feature extraction and classification and applies it to speaker recognition. A new MFCC feature extraction method based on the distributed Discrete Cosine Transform (DCT-II) is presented; it applies the DCT-II technique to compute the dynamic features used during speaker recognition. The new algorithm combines the DCT-II based MFCC feature extraction method with a Fuzzy Vector Quantization (FVQ) data clustering classifier. The proposed automatic speaker recognition (ASR) algorithm uses a recently introduced variation of MFCC, known as Delta-Delta MFCC (DDMFCC), to identify the dynamic features used for speaker recognition. A series of experiments was performed with three feature extraction methods: (1) conventional MFCC; (2) DDMFCC; and (3) DCT-II based DDMFCC. The experiments were then expanded to include four data clustering classifiers: (1) K-means Vector Quantization; (2) Linde-Buzo-Gray Vector Quantization; (3) FVQ; and (4) Gaussian Mixture Model. The National Institute of Standards and Technology (NIST) Speaker Recognition Evaluation (SRE 04) corpus provided the speaker source data for the experiments. Among the vector quantization based classifiers, the combination of DCT-II based MFCC, DMFCC and DDMFCC with FVQ had the lowest Equal Error Rate (EER). The speaker verification tests highlighted the overall improvement in performance of the new ASR system.
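
The building blocks named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not specify how the DCT-II is distributed over the filterbank outputs, how the delta window is chosen, or how FVQ is trained, so the code below uses the textbook unnormalised DCT-II, the common regression formula for delta features, and the standard fuzzy c-means membership rule. The function names, the `width` parameter, and the fuzzifier `m` are assumptions made for illustration.

```python
import math


def dct_ii(x):
    """Unnormalised DCT-II of a sequence, e.g. log mel filterbank energies."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
        for k in range(n)
    ]


def delta(frames, width=2):
    """Regression-based delta of a list of feature frames.

    Edge frames are repeated where the window runs past the signal.
    Applying the function twice gives delta-delta features.
    """
    denom = 2 * sum(n * n for n in range(1, width + 1))
    last = len(frames) - 1
    out = []
    for t in range(len(frames)):
        row = []
        for d in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, last)][d]
                behind = frames[max(t - n, 0)][d]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def fuzzy_memberships(vector, codebook, m=2.0):
    """Fuzzy c-means membership of one vector in each codeword (m > 1)."""
    dists = [math.dist(vector, c) for c in codebook]
    if 0.0 in dists:
        # The vector coincides with one or more codewords: share membership
        # equally among them to avoid dividing by zero.
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


# Toy usage with made-up numbers (not data from the thesis).
cepstrum = dct_ii([1.0, 1.0, 1.0, 1.0])
frames = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0], [3.0, 1.0], [4.0, 1.0]]
d1 = delta(frames)   # first-order (delta) features
d2 = delta(d1)       # second-order (delta-delta) features
u = fuzzy_memberships([1.0, 0.0], [[0.0, 0.0], [4.0, 0.0]])
```

Unlike K-means or Linde-Buzo-Gray quantization, which assign each feature vector to exactly one codeword, the fuzzy rule gives every codeword a membership between 0 and 1, with the memberships for one vector summing to 1.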

Bibliographic record

Author: Hossan M
Year: 2011
Original format: PDF

Similar literature

1. Speaker recognition utilizing distributed DCT-II based Mel frequency cepstral coefficients and fuzzy vector quantization [J]. M. Afzal Hossan, Mark A. Gregory. International Journal of Speech Technology, 2013, No. 1.
2. Automatic Genre Classification Using Fractional Fourier Transform Based Mel Frequency Cepstral Coefficient and Timbral Features [J]. Bhalke Daulappa Guranna, Rajesh Betsy, Bormane Dattatraya Shankar. Archives of Acoustics, 2017, No. 2.
3. Support vector machines, Mel-Frequency Cepstral Coefficients and the Discrete Cosine Transform applied on voice based biometric authentication [C]. Barbosa Felipe Gomes, Silva Washington Luis Santos. SAI Intelligent Systems Conference, 2015.
4. Block-level discrete cosine transform coefficients for autonomic face recognition [D]. Scott, Willie L., II. 2003.
5. Voice Disorder Classification Based on Multitaper Mel Frequency Cepstral Coefficients Features [O]. Ömer Eskidere, Ahmet Gürhanlı. 2015.
6. Speaker recognition utilizing distributed DCT-II based Mel frequency cepstral coefficients and fuzzy vector quantization [O]. Hossan M, Gregory M. 2013.
