Journal: Procedia Computer Science

Significance of GMM-UBM based Modelling for Indian Language Identification


Abstract

Most Indian languages originate from Devanagari, the script of the Sanskrit language. In spite of the similarity in their phoneme sets, every language has its own influence on the phonotactic constraints of speech in that language. A modelling technique capable of capturing the slightest variations imparted by the language is a prerequisite for developing a language identification (LID) system. Gaussian mixture modelling with a large number of mixture components demands a large amount of training data for each language class, which is hard to collect and handle. In this work, the phonotactic variations imparted by the different languages are modelled using Gaussian mixture modelling with a universal background model (GMM-UBM). In GMM-UBM based modelling, a certain amount of data from all the language classes is pooled to develop a universal background model (UBM), and that model is then adapted to each class. Spectral features (MFCC) are employed to represent the language-specific phonotactic information of speech in the different languages. In the present study, LID systems are developed using speech samples from IITKGP-MLILSC. The performance of the proposed GMM-UBM based LID system is compared with that of a conventional GMM based LID system. An average improvement of 7–8% is observed from the use of UBM-based modelling in developing an LID system.
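
The pool-then-adapt idea described in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration, not taken from the paper: the adaptation is MAP adaptation of the means only with a relevance factor of 16, the UBM is set by hand instead of being trained by EM on pooled data, two-dimensional toy vectors stand in for MFCC frames, and all function names (`map_adapt_means`, `identify`, `toy_frames`) are hypothetical.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return sum(-0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def component_log_probs(x, gmm):
    """log(w_k * N(x | mu_k, var_k)) for every mixture component k."""
    return [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(gmm["weights"], gmm["means"], gmm["vars"])]

def logsumexp(vals):
    top = max(vals)
    return top + math.log(sum(math.exp(v - top) for v in vals))

def avg_log_likelihood(frames, gmm):
    """Average per-frame log-likelihood of the frames under the GMM."""
    return sum(logsumexp(component_log_probs(x, gmm)) for x in frames) / len(frames)

def map_adapt_means(ubm, frames, relevance=16.0):
    """Copy the UBM and MAP-adapt only its means to one language's frames.

    relevance=16 is an assumed value, not one reported in the abstract.
    """
    n_comp, dim = len(ubm["weights"]), len(ubm["means"][0])
    counts = [0.0] * n_comp                           # soft counts n_k
    first = [[0.0] * dim for _ in range(n_comp)]      # sum_t post_k(t) * x_t
    for x in frames:
        lp = component_log_probs(x, ubm)
        norm = logsumexp(lp)
        for k in range(n_comp):
            post = math.exp(lp[k] - norm)
            counts[k] += post
            for d in range(dim):
                first[k][d] += post * x[d]
    # (first + r * mu) / (n + r) equals alpha * E_k[x] + (1 - alpha) * mu
    # with alpha = n / (n + r); this form needs no division by n.
    means = [[(first[k][d] + relevance * ubm["means"][k][d]) / (counts[k] + relevance)
              for d in range(dim)] for k in range(n_comp)]
    return {"weights": list(ubm["weights"]), "means": means,
            "vars": [list(v) for v in ubm["vars"]]}

def identify(frames, models):
    """Return the language whose adapted model scores the frames highest."""
    return max(models, key=lambda lang: avg_log_likelihood(frames, models[lang]))

def toy_frames(shift_y, n=20):
    """Deterministic 2-D toy frames; shift_y separates the two toy languages."""
    frames = []
    for i in range(n):
        jitter = (i % 5 - 2) * 0.1
        frames.append([-1.0 + jitter, shift_y - jitter])
        frames.append([1.0 - jitter, shift_y + jitter])
    return frames

# Hand-set stand-in for a UBM (a real one would be trained on pooled data).
UBM = {"weights": [0.5, 0.5],
       "means": [[-1.0, 0.0], [1.0, 0.0]],
       "vars": [[1.0, 1.0], [1.0, 1.0]]}

# One adapted model per toy language, all derived from the same UBM.
models = {lang: map_adapt_means(UBM, toy_frames(shift))
          for lang, shift in (("A", 1.0), ("B", -1.0))}

print(identify([[-0.9, 0.9], [1.1, 1.2]], models))
```

Because every language model starts from the same UBM and only shifts its means toward that language's data, each class needs far less training data than a large GMM trained from scratch, which is the motivation the abstract gives for the approach.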
