Bengali Spoken Numerals Recognition by MFCC and GMM Technique

Abstract

Speech is the standard medium of vocal communication and one of the most comfortable ways for humans to communicate with each other. Speech recognition systems are likewise needed to communicate with computers by voice. English speech recognition already lets us operate English voice-command applications. In rural and semi-urban areas of India, however, knowledge of English is limited, so automatic speech recognition must be implemented in regional languages. Here, we have built a Gaussian Mixture Model (GMM)-based recognition system for isolated spoken numerals in Bengali (also called Bangla), using mel-frequency cepstral coefficients (MFCC) for feature extraction. The proposed system achieved 91.7% correct prediction on a Bangla numeral data set of 1000 audio samples in 10 classes, which is satisfactory relative to previous work on Bangla spoken digit recognition.
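
The abstract does not give implementation details, so the following is only a minimal sketch of how a GMM-based classifier of this kind is commonly set up: one mixture model is fitted per digit class, and an utterance is assigned to the class whose model gives its frames the highest average log-likelihood. The function names `train_digit_gmms` and `classify`, the number of mixture components, the 13-coefficient feature size, and the synthetic data standing in for real MFCC frames are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def train_digit_gmms(features_by_digit, n_components=4, seed=0):
    """Fit one diagonal-covariance GMM per digit class (hypothetical helper).

    features_by_digit maps a digit label to an array of shape
    (n_frames, n_coeffs) holding the feature frames for that digit.
    """
    models = {}
    for digit, frames in features_by_digit.items():
        gmm = GaussianMixture(n_components=n_components,
                              covariance_type="diag",
                              random_state=seed)
        models[digit] = gmm.fit(frames)
    return models


def classify(models, frames):
    """Return the digit whose GMM gives the highest mean log-likelihood."""
    scores = {digit: gmm.score(frames) for digit, gmm in models.items()}
    return max(scores, key=scores.get)


# Synthetic stand-in for MFCC frames (assumption: 10 classes, 13 coefficients).
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(10, 13))
train = {d: centers[d] + rng.normal(size=(200, 13)) for d in range(10)}
models = train_digit_gmms(train)

# Frames of an unseen "utterance" drawn from class 3.
test_utterance = centers[3] + rng.normal(size=(40, 13))
predicted = classify(models, test_utterance)
print(predicted)
```

In a real pipeline the synthetic arrays would be replaced by MFCC frames extracted from the recorded numerals, and the number of components would be tuned on held-out data.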