New transformed features generated by deep bottleneck extractor and a GMM-UBM classifier for speaker age and gender classification

Abu Mallouh Arafat; Qawaqneh Zakariya; Barkana Buket D.

Abstract

Speaker age and gender classification is one of the most challenging problems in speech signal processing. With recent advances in technology, identifying a speaker's age and gender has become a necessity for speaker verification and identification systems, with applications such as identifying suspects in criminal cases, improving human-machine interaction, and adapting music for people waiting in a queue. Despite intensive studies aimed at extracting descriptive and distinctive features, classification accuracies are still not satisfactory. In this work, a model for generating bottleneck features from a deep neural network and a Gaussian Mixture Model-Universal Background Model (GMM-UBM) classifier are proposed for the speaker age and gender classification problem. A deep neural network with a bottleneck layer is first trained in an unsupervised manner to calculate the initial weights between layers. It is then trained and tuned in a supervised manner to generate transformed mel-frequency cepstral coefficients (T-MFCCs). The GMM-UBM is used to build a GMM for each class, and these models are used to classify speaker age and gender. The age-annotated database of German telephone speech (aGender) is used to evaluate the proposed classification system. The newly generated T-MFCCs have shown potential to achieve significant classification improvements in speaker age and gender classification when used with the GMM-UBM classifier. The proposed classification system achieved an overall accuracy of 57.63%. The highest accuracy, 72.97%, was obtained for adult female speakers.
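
The GMM-UBM step described in the abstract can be illustrated with a small sketch. The abstract does not give the number of mixtures, the adaptation scheme, or any hyperparameters, so everything below is an assumption: it uses mean-only MAP adaptation from a shared UBM (a common GMM-UBM recipe, not necessarily the authors' exact one), diagonal covariances, a hypothetical relevance factor, and hand-made 2-D toy frames standing in for T-MFCC vectors. The two class labels are illustrative only.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def component_log_probs(x, gmm):
    """Log of (weight * density) for every mixture component."""
    return [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(gmm["weights"], gmm["means"], gmm["vars"])]

def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of an utterance under a GMM."""
    return sum(logsumexp(component_log_probs(x, gmm)) for x in frames) / len(frames)

def map_adapt_means(ubm, frames, relevance=16.0):
    """Build a class GMM from the UBM by MAP-adapting the means only.

    Assumption: weights and variances are copied from the UBM unchanged.
    """
    k, dim = len(ubm["weights"]), len(ubm["means"][0])
    counts = [0.0] * k
    sums = [[0.0] * dim for _ in range(k)]
    for x in frames:
        logs = component_log_probs(x, ubm)
        norm = logsumexp(logs)
        for c in range(k):
            post = math.exp(logs[c] - norm)  # component responsibility for this frame
            counts[c] += post
            for d in range(dim):
                sums[c][d] += post * x[d]
    new_means = []
    for c in range(k):
        alpha = counts[c] / (counts[c] + relevance)  # data-dependent mixing weight
        new_means.append([
            alpha * (sums[c][d] / counts[c]) + (1.0 - alpha) * ubm["means"][c][d]
            if counts[c] > 0 else ubm["means"][c][d]
            for d in range(dim)
        ])
    return {"weights": ubm["weights"], "means": new_means, "vars": ubm["vars"]}

def classify(frames, class_gmms):
    """Pick the class whose adapted GMM gives the highest average log-likelihood."""
    return max(class_gmms, key=lambda label: avg_log_likelihood(frames, class_gmms[label]))

# Toy UBM and training frames (hypothetical values, not data from the paper).
ubm = {"weights": [0.5, 0.5],
       "means": [[0.0, 0.0], [4.0, 4.0]],
       "vars": [[1.0, 1.0], [1.0, 1.0]]}
train = {
    "adult_female": [[1.0, 1.2], [0.8, 1.0], [5.0, 5.1], [4.9, 5.2]],
    "adult_male": [[-1.0, -0.9], [-1.2, -1.1], [3.0, 2.9], [3.1, 3.0]],
}
class_gmms = {label: map_adapt_means(ubm, frames, relevance=1.0)
              for label, frames in train.items()}

print(classify([[0.9, 1.1], [5.1, 5.0]], class_gmms))
print(classify([[-1.0, -1.0], [3.0, 3.0]], class_gmms))
```

In a real system the UBM itself would be fitted on pooled training data (for example with expectation-maximization), and the per-frame inputs would be the T-MFCC vectors produced by the bottleneck network rather than hand-written points.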

Bibliographic details

Source
Neural computing & applications | 2018, Issue 8 | 13 pages
Authors
Abu Mallouh Arafat; Qawaqneh Zakariya; Barkana Buket D.
Affiliations
Univ Bridgeport, Comp Sci & Engn Dept, Bridgeport, CT 06604, USA;
Univ Bridgeport, Comp Sci & Engn Dept, Bridgeport, CT 06604, USA;
Univ Bridgeport, Dept Elect Engn, Bridgeport, CT 06604, USA
Indexing information
Original format: PDF
Language of text: English
Chinese Library Classification: Artificial neural network computers; Artificial intelligence theory
Keywords
Speaker recognition; Age and gender; Classification; MFCCs; Deep neural network; DBF extractor
