Feature Extraction from Temporal Phase for Speaker Recognition


Abstract

Feature extraction is important for pattern recognition problems in speech research. Most feature extraction methods exploit spectral information rather than phase information. Even though phase is an important characteristic of the speech signal, it remains little exploited. In this work, in addition to state-of-the-art Mel Frequency Cepstral Coefficients (MFCC), we use features derived from the temporal phase (i.e., T-Phase) of the speech signal for speaker recognition. The proposed method extracts Linear Prediction Coefficients (LPC) from the T-Phase of the speech signal at the frame level. Experiments are carried out on the standard NIST 2002 Speaker Recognition Evaluation (SRE) data using a standard Gaussian Mixture Model - Universal Background Model (GMM-UBM) system. Score-level fusion of the MFCC and T-Phase feature sets gives an identification rate of 76.18%, an improvement of 4% and 8% over MFCC and LPC alone, respectively. In addition, experiments show that score-level fusion reduces the Equal Error Rate (EER) by 2% and 4% compared with MFCC and LPC alone, respectively.
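
The abstract does not state how the two score streams are combined, so the following is only a minimal sketch of one common form of score-level fusion: a weighted sum of per-trial normalised scores, followed by a closed-set identification decision. The function names (`zscore`, `fuse_scores`, `identification_rate`), the weight `alpha`, the z-score normalisation, and the toy score matrices are all assumptions made for illustration. None of them are taken from the paper, and the numbers are not its results.

```python
import numpy as np

def zscore(scores):
    """Normalise each trial's scores across speakers (zero mean, unit variance)."""
    mu = scores.mean(axis=1, keepdims=True)
    sigma = scores.std(axis=1, keepdims=True)
    # Guard against a constant row, which would give a zero standard deviation.
    return (scores - mu) / np.where(sigma == 0, 1.0, sigma)

def fuse_scores(scores_a, scores_b, alpha=0.5):
    """Weighted sum of two normalised score matrices (trials x speakers).

    alpha is a hypothetical fusion weight; in practice it would be tuned
    on held-out development data.
    """
    return alpha * zscore(scores_a) + (1.0 - alpha) * zscore(scores_b)

def identification_rate(scores, true_ids):
    """Percentage of trials whose top-scoring speaker is the true speaker."""
    predicted = scores.argmax(axis=1)
    return 100.0 * np.mean(predicted == np.asarray(true_ids))

# Made-up toy scores for 3 test trials against 3 enrolled speakers.
# Rows are trials, columns are speaker models; higher means a better match.
true_ids = [0, 1, 2]
mfcc_scores = np.array([[2.0, 1.0, 0.0],
                        [0.0, 1.0, 1.2],   # MFCC system picks speaker 2 (wrong)
                        [0.0, 1.0, 2.0]])
tphase_scores = np.array([[1.0, 1.5, 0.0],  # T-Phase system picks speaker 1 (wrong)
                          [0.0, 3.0, 1.0],
                          [0.5, 0.0, 1.0]])

fused = fuse_scores(mfcc_scores, tphase_scores, alpha=0.5)

rate_mfcc = identification_rate(mfcc_scores, true_ids)
rate_tphase = identification_rate(tphase_scores, true_ids)
rate_fused = identification_rate(fused, true_ids)
print(rate_mfcc, rate_tphase, rate_fused)
```

In this toy case each system makes a different mistake, so the fused scores recover both trials. This is the kind of complementarity that score-level fusion relies on; whether it holds on real data depends on how correlated the errors of the two feature sets are.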

Bibliographic details

Source
International Conference on Signal Processing and Communication Systems | 2018 | pp. 382-386 | 5 pages
Authors
Ami Gandhi; Hemant A Patil;
Original format: PDF
Keywords
Feature extraction; Speaker recognition; Indexes; Mel frequency cepstral coefficient; Standards; Speech recognition; Time-domain analysis;
