Audio Feature Extraction on SIBI Dataset for Speech Recognition

Abstract

Mel Frequency Cepstral Coefficients (MFCC) have been regarded as the standard feature extraction method for Automatic Speech Recognition (ASR) systems for the last few years. Their performance may be affected by several variables, such as the number of features, the audio channels, the filter width, and the type of filter bank used. In this paper, several comparisons were made to find the combination of variables that gives the best results on the SIBI (Indonesian Sign Language) dataset, which consists of sentences uttered by both Deaf and Hard of Hearing (DHH) and non-DHH people. In this experiment, although ASR performance on the DHH data is generally lower than on the non-DHH data, the results are still relatively good: around 4.71% WER and 10.30% SER, compared with 0.15% WER and 0.40% SER on the non-DHH data.
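
The two metrics quoted in the abstract, word error rate (WER) and sentence error rate (SER), can be illustrated with a minimal sketch. The abstract does not say how the authors computed them, so the code below assumes the usual definitions: WER is the word-level edit distance divided by the number of reference words, and SER is the share of sentences with at least one error. The function names and the example sentences are hypothetical and are not taken from the SIBI dataset.

```python
def edit_distance(ref_tokens, hyp_tokens):
    """Levenshtein distance (substitutions, insertions, deletions) between token lists."""
    prev = list(range(len(hyp_tokens) + 1))
    for i, ref_tok in enumerate(ref_tokens, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hyp_tokens, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer_and_ser(references, hypotheses):
    """Return (WER, SER) in percent for paired reference and hypothesis sentences."""
    if not references or len(references) != len(hypotheses):
        raise ValueError("need equally sized, non-empty sentence lists")
    total_errors = 0
    total_words = 0
    wrong_sentences = 0
    for ref, hyp in zip(references, hypotheses):
        ref_tokens = ref.split()
        errors = edit_distance(ref_tokens, hyp.split())
        total_errors += errors
        total_words += len(ref_tokens)
        if errors > 0:
            wrong_sentences += 1
    if total_words == 0:
        raise ValueError("references contain no words")
    wer = 100.0 * total_errors / total_words
    ser = 100.0 * wrong_sentences / len(references)
    return wer, ser


# Hypothetical example: one word is dropped from the first of two sentences.
refs = ["saya pergi ke sekolah", "dia makan nasi"]
hyps = ["saya pergi sekolah", "dia makan nasi"]
wer, ser = wer_and_ser(refs, hyps)
print(f"WER = {wer:.2f}%, SER = {ser:.2f}%")
```

Because one wrong word is enough to mark a whole sentence as wrong, SER is never lower than WER when every sentence has at most as many errors as words, which is in line with the abstract reporting a higher SER than WER for both speaker groups.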

Bibliographic details

Source
International Conference on Informatics, Multimedia, Cyber and Information System, 2020, pp. 70-74 (5 pages)
Authors
Ruhush Shoalihin; Erdefi Rakun
Original format: PDF
Keywords
Visualization; Multimedia systems; Lips; Filter banks; Feature extraction; Mel frequency cepstral coefficient; Standards

Similar literature

1. A Low-Complexity Parabolic Lip Contour Model With Speaker Normalization for High-Level Feature Extraction in Noise-Robust Audiovisual Speech Recognition [J]. Borgstrom B.J., Alwan A. IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans. 2008, No. 6
2. Recognition of isolated words using Zernike and MFCC features for audio visual speech recognition [J]. Prashant Borde, Amarsinh Varpe, Ramesh Manza. International Journal of Speech Technology. 2015, No. 2
3. Speech Features Extraction Techniques for Robust Emotional Speech Analysis/Recognition [J]. K. M. Shiva Prasad, G. N. Kodanda Ramaiah, M. B. Manjunatha. Indian Journal of Science and Technology. 2017, No. 3
4. DCT-based Visual Feature Extraction for Indonesian Audiovisual Speech Recognition [C]. Hilman Fauzi Rijal, Suyanto Suyanto. International Conference on Data Science and Its Applications. 2020
5. Two modified methods of feature extraction for automatic speech recognition [D]. Ge, Wangning. 2013
6. On the Speech Properties and Feature Extraction Methods in Speech Emotion Recognition [O]. Juraj Kacur, Boris Puterka, Jarmila Pavlovicova. 2021
7. Comparison Between Different Feature Extraction Techniques for Audio-Visual Speech Recognition [O]. Alin G. Chiţu, Leon J. M. Rothkrantz, Jacek. 2012