The Effects of Acoustic Features of Speech for Automatic Speaker Recognition

Abstract

Automatic speaker recognition is the task of automatically determining or verifying the identity of a speaker from a recording of his or her speech, and it has been studied for many decades. Feature extraction is one of the most important steps in speaker recognition and significantly influences recognition performance. Acoustic features of speech have been studied by many researchers around the world; however, limited research has been conducted on African indigenous languages, and on South African official languages in particular. This paper presents the effects of acoustic features of speech on the performance of speaker recognition systems, focusing on South African low-resourced languages. The study investigates these features using the National Centre for Human Language Technology (NCHLT) Sepedi speech data. Time-domain, frequency-domain, and cepstral-domain features are evaluated on four machine learning algorithms: K-Nearest Neighbours (K-NN), two kernel-based Support Vector Machines (SVM), and Multilayer Perceptrons (MLP). The results show that performance is poor for time-domain features, good for frequency-domain (spectral) features, and better still for cepstral-domain features. However, the combination of these three feature types resulted in a higher accuracy and $F_{1}$ score of 98%.
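
The feature-then-classify pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not name the individual features or hyperparameters, so the choices here (zero-crossing rate and RMS energy for the time domain, spectral centroid for the frequency domain, a low-order real cepstrum for the cepstral domain, a 1-nearest-neighbour classifier, and the `synth_voice` tones standing in for the NCHLT Sepedi recordings) are all assumptions made for illustration.

```python
import cmath
import math

SAMPLE_RATE = 6400  # Hz; assumed so that 64-sample frames give 100 Hz bins
FRAME_LEN = 64

def zero_crossing_rate(x):
    """Time-domain: fraction of adjacent sample pairs with a sign change."""
    changes = sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))
    return changes / (len(x) - 1)

def rms_energy(x):
    """Time-domain: root-mean-square amplitude of the frame."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def dft_magnitudes(x):
    """Magnitudes of a naive O(n^2) DFT over all n bins (fine for short frames)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n)]

def spectral_centroid(mags, sample_rate):
    """Frequency-domain: magnitude-weighted mean frequency (Hz), bins 0..n/2."""
    n = len(mags)
    half = mags[: n // 2 + 1]
    return sum(k * sample_rate / n * m for k, m in enumerate(half)) / sum(half)

def real_cepstrum(mags, n_coeffs):
    """Cepstral-domain: low-order terms of the inverse DFT of the log-magnitude spectrum."""
    n = len(mags)
    log_mags = [math.log(m + 1e-6) for m in mags]  # small floor avoids log(0)
    return [sum(l * math.cos(2 * math.pi * k * q / n) for k, l in enumerate(log_mags)) / n
            for q in range(1, n_coeffs + 1)]

def extract_features(x):
    """Concatenate features from all three domains into one vector."""
    mags = dft_magnitudes(x)
    return [zero_crossing_rate(x), rms_energy(x),
            spectral_centroid(mags, SAMPLE_RATE)] + real_cepstrum(mags, 4)

def synth_voice(f0, amp, phase):
    """Hypothetical stand-in for a speaker: a tone at f0 plus a weaker second harmonic."""
    return [amp * (math.sin(2 * math.pi * f0 * t / SAMPLE_RATE + phase)
                   + 0.5 * math.sin(4 * math.pi * f0 * t / SAMPLE_RATE + phase))
            for t in range(FRAME_LEN)]

def predict_1nn(train_x, train_y, query):
    """1-nearest-neighbour with Euclidean distance (no feature scaling in this toy)."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]

def f1_score(y_true, y_pred, positive):
    """F1 for one class: 2*TP / (2*TP + FP + FN)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Two synthetic "speakers" that differ in pitch; amplitude and phase vary per sample.
train = [(200, 0.8, 0.0, "A"), (200, 1.2, 0.5, "A"),
         (500, 0.8, 0.0, "B"), (500, 1.2, 0.5, "B")]
test = [(200, 1.0, 0.2, "A"), (500, 1.0, 0.2, "B")]

train_x = [extract_features(synth_voice(f, a, p)) for f, a, p, _ in train]
train_y = [label for *_, label in train]
y_true = [label for *_, label in test]
y_pred = [predict_1nn(train_x, train_y, extract_features(synth_voice(f, a, p)))
          for f, a, p, _ in test]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy the unscaled spectral centroid dominates the distance, so the two synthetic speakers separate trivially. On real speech, features would normally be computed per frame, aggregated over an utterance, and standardised before being passed to a classifier such as K-NN, SVM, or MLP.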

Bibliographic details

Source
International Conference on Artificial Intelligence, Big Data, Computing and Data Communication Systems | 2020 | pp. 1-5 | 5 pages
Authors
Tumisho Billson Mokgonyane; Tshephisho Joseph Sefara; Madimetja Jonas Manamela; Thipe Isaiah Modipa; Moses Sebaka Masekwameng
Original format: PDF
Keywords
Feature extraction; Acoustics; Speech recognition; Frequency-domain analysis; Time-domain analysis; Speaker recognition; Support vector machines

Similar literature

1. Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition [J]. Arata ITOH, Sunao HARA, Norihide KITAOKA. IEICE Transactions on Information and Systems, 2012, No. 10.
2. Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization [J]. Umit H. Yapanel, John H.L. Hansen. EURASIP Journal on Audio, Speech, and Music Processing, 2008, No. 1.
3. Automatic extraction of acoustic prototypes for large vocabulary speech recognition by using speaker-independent features [C]. Colla, A.M. 1989.
4. Accent and speaker recognition for advanced automatic speech recognition [D]. Angkititrakul, Pongtep. 2004.
5. Automatic speech recognition using articulatory features from subject-independent acoustic-to-articulatory inversion [O]. Prasanta Kumar Ghosh, Shrikanth Narayanan.
6. Acoustic Model Merging Using Acoustic Models from Multilingual Speakers for Automatic Speech Recognition [O]. Tien-ping Tan, Laurent Besacier, Benjamin Lecouteux. 2015.
