Performance comparison of MFCC based bangla ASR system in presence and absence of third differential coefficients

Abstract

Present Mel Frequency Cepstral Coefficient (MFCC) based Bangla Automatic Speech Recognition (ASR) systems are mostly implemented with delta and acceleration coefficients. Combining the MFCCs and the log energy with their delta and acceleration coefficients yields a 39-dimensional feature vector per 10 ms frame. In this paper, our objective is to observe the effect of third differential coefficients on the performance of Bangla ASR, which has not yet been explored in this field. To do so, we appended 13 third differential coefficients to the previous 39 coefficients, giving a feature vector of 52 coefficients per 10 ms frame. We observed the performance of a Bangla ASR system with and without third differential coefficients using a Hidden Markov Model (HMM) based tied-state triphone model. To build the speech corpus, 100 sentences were uttered by varying numbers of speakers in different phases; the speakers included both males and females of similar age, between 22 and 24. The Hidden Markov Model Toolkit (HTK) was used for the comparative analysis. We took the Sentence Correction Rate (SCR) as the performance indicator. In the experiments, the MFCC-based systems of 39 (MFCC39) and 52 (MFCC52) dimensions achieved average SCRs of 98.89% and 98.94% respectively. Our finding is therefore that a slight improvement is possible by including third differential coefficients when the sampling rate is as high as 44.1 kHz.
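The dimension counts in the abstract (13 static coefficients growing to 39 and then 52) can be illustrated with a minimal sketch. It assumes the common regression formula for differential coefficients with a window of 2 frames and edge-frame replication, and it uses random numbers as a stand-in for real MFCC and log-energy values. The function names `regression_deltas` and `stack_features` are hypothetical; this is not the authors' code, and the abstract does not state the window sizes used.

```python
import numpy as np


def regression_deltas(features: np.ndarray, window: int = 2) -> np.ndarray:
    """Regression-based time derivative of a (frames, coeffs) matrix.

    Assumed formula: d_t = sum_k k * (c_{t+k} - c_{t-k}) / (2 * sum_k k^2),
    for k = 1..window, with the first and last frames repeated at the edges.
    """
    num_frames = features.shape[0]
    # Repeat the edge frames so every frame has `window` neighbours on each side.
    padded = np.pad(features, ((window, window), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, window + 1))
    deltas = np.zeros_like(features, dtype=float)
    for k in range(1, window + 1):
        ahead = padded[window + k : window + k + num_frames]
        behind = padded[window - k : window - k + num_frames]
        deltas += k * (ahead - behind)
    return deltas / denom


def stack_features(static: np.ndarray, third: bool = False) -> np.ndarray:
    """Append delta and acceleration (and optionally third differential) columns."""
    delta = regression_deltas(static)
    accel = regression_deltas(delta)
    parts = [static, delta, accel]
    if third:
        # Third differential: the same regression applied to the acceleration.
        parts.append(regression_deltas(accel))
    return np.hstack(parts)


# Placeholder data: 100 frames of 13 static values (12 MFCCs + log energy).
rng = np.random.default_rng(0)
static = rng.normal(size=(100, 13))

mfcc39 = stack_features(static)              # static + delta + acceleration
mfcc52 = stack_features(static, third=True)  # plus third differential
print(mfcc39.shape, mfcc52.shape)
```

Each additional differential order adds another 13 columns, which is why the feature vector grows from 39 to 52 dimensions when the third differential is included.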

Bibliographic details

Source
International Conference on Electrical Engineering and Information Communication Technology, 2016, pp. 1-6 (6 pages)
Authors
Sudipto Debnath; Fatema-E-Jannat; Susmita Saha; Mohammad Tarik Aziz; Rifayet Hasan Sajol; Md. Jakaria Rahimi
Original format: PDF
Keywords
Hidden Markov models; Mel frequency cepstral coefficient; Speech; Feature extraction; Filter banks; Acceleration; Mathematical model;
