2010 International Conference on Intelligent Systems and Knowledge Engineering

A statistical speech recognition of Ningbo dialect monosyllables

Abstract

Most speech recognition research to date has focused on Mandarin Chinese or English. Such work generally assumes that a given word is pronounced the same way by all speakers, so the main influences on recognition accuracy are environmental factors. Ningbo dialect differs markedly from Mandarin Chinese and English: its pronunciation and intonation vary from region to region even within Ningbo, so these variations matter more than other factors. Finding a modeling method that accommodates changes in pronunciation and intonation is therefore a vital prerequisite for recognizing Ningbo dialect speech. This paper investigates speech recognition of Ningbo dialect, focusing on Fenghua county, Cixi county, Yinzhou district, and central Ningbo. We study modeling methods for Ningbo dialect with respect to pronunciation changes, intonation changes, and recognition running time. We created 64 speech samples of the 10 digits (1–10) as spoken in these four regions and used Mel-frequency cepstral coefficients (MFCC) to extract the features of each spoken digit. Then, based on extensive experiments on the variations in pronunciation and intonation of the digits, we constructed 20 models from the training samples of the digits (1–10). A simplified Bayes decision rule is used to classify the Ningbo dialect digits. Experimental data show a recognition rate above 75%, compared with 52.5% for a general modeling method whose training-sample models do not account for regional variations in pronunciation and intonation. The approach thus improves the robustness of speech recognition for Ningbo dialect. The modeling and recognition method is easy to apply and to extend to wider use.
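The abstract does not say how the Bayes decision rule was simplified or how the 20 models are organized, so the following is only a minimal sketch of one plausible reading: each digit gets more than one model (one per pronunciation variant), every model is a Gaussian with diagonal covariance, priors are assumed equal, and a test vector is assigned the digit of the best-scoring model. The function names, the variant labels "a" and "b", and the two-dimensional toy vectors standing in for MFCC features are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def fit_gaussian(vectors, var_floor=1e-3):
    """Estimate a per-dimension mean and variance (diagonal Gaussian)."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # Floor the variance so a tiny training set cannot produce a zero.
    var = [
        max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, var_floor)
        for d in range(dim)
    ]
    return mean, var


def log_likelihood(x, model):
    """Log density of x under a diagonal Gaussian (mean, var)."""
    mean, var = model
    return sum(
        -0.5 * (math.log(2 * math.pi * s) + (xi - m) ** 2 / s)
        for xi, m, s in zip(x, mean, var)
    )


def train(samples):
    """Build one model per (digit, variant) pair.

    samples: iterable of (digit, variant, feature_vector).
    """
    groups = defaultdict(list)
    for digit, variant, vec in samples:
        groups[(digit, variant)].append(vec)
    return {key: fit_gaussian(vecs) for key, vecs in groups.items()}


def classify(x, models):
    """With equal priors (an assumption), the Bayes rule reduces to
    picking the model with the highest likelihood; return its digit."""
    best_key = max(models, key=lambda k: log_likelihood(x, models[k]))
    return best_key[0]


# Toy data: digit 1 has two pronunciation variants, digit 2 has one.
samples = [
    (1, "a", [1.0, 0.0]), (1, "a", [1.2, 0.1]),
    (1, "b", [4.0, 4.0]), (1, "b", [4.1, 3.9]),
    (2, "a", [0.0, 3.0]), (2, "a", [0.1, 3.2]),
]
models = train(samples)
print(classify([4.05, 4.0], models))
print(classify([0.0, 3.1], models))
```

Keeping a separate model per variant means a regional pronunciation is matched against its own statistics rather than against an average over all regions, which is the intuition behind comparing variant-aware modeling with a single general model per digit.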
