Significance of GMM-UBM based Modelling for Indian Language Identification

V. Ravi Kumar; Hari Krishna Vydana; Anil Kumar Vuppala

Abstract

Most Indian languages originate from Devanagari, the script of the Sanskrit language. Despite the similarity of their phoneme sets, every language exerts its own influence on the phonotactic constraints of its speech. A modelling technique capable of capturing the slightest variations imparted by the language is therefore a prerequisite for developing a language identification (LID) system. Gaussian mixture modelling with a large number of mixture components demands a large amount of training data for each language class, which is hard to collect and handle. In this work, the phonotactic variations imparted by different languages are modelled using Gaussian mixture modelling with a universal background model (GMM-UBM). In GMM-UBM based modelling, a certain amount of data from all the language classes is pooled to train a universal background model (UBM), and that model is then adapted to each class. Spectral features (MFCC) are used to represent the language-specific phonotactic information in speech. The LID systems in this study are developed using speech samples from IITKGP-MLILSC. The performance of the proposed GMM-UBM based LID system is compared with that of a conventional GMM based LID system, and an average improvement of 7–8% is observed from using UBM-based modelling.
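The pooling and adaptation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal-covariance mixtures, mean-only MAP adaptation with a relevance factor of 16 (a common choice, but the abstract does not say which parameters are adapted), and synthetic 2-D frames standing in for MFCC vectors. All function names (`train_ubm`, `map_adapt_means`, `identify`, `toy_frames`) are hypothetical.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def posteriors(x, model):
    """Component responsibilities for one frame, plus the frame log-likelihood."""
    weights, means, variances = model
    logs = [math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    top = max(logs)
    total = top + math.log(sum(math.exp(l - top) for l in logs))  # log-sum-exp
    return [math.exp(l - total) for l in logs], total


def train_ubm(frames, n_comp, n_iter=20):
    """Fit a diagonal GMM by EM on frames pooled from all languages."""
    dim = len(frames[0])
    # Deterministic start (an assumption): chunk the frames sorted on dimension 0.
    ordered = sorted(frames)
    size = len(ordered) // n_comp
    chunks = [ordered[k * size:(k + 1) * size] for k in range(n_comp)]
    means = [[sum(x[d] for x in c) / len(c) for d in range(dim)] for c in chunks]
    variances = [[1.0] * dim for _ in range(n_comp)]
    weights = [1.0 / n_comp] * n_comp
    for _ in range(n_iter):
        model = (weights, means, variances)
        resp = [posteriors(x, model)[0] for x in frames]  # E-step
        weights, means, variances = [], [], []
        for k in range(n_comp):  # M-step
            nk = sum(r[k] for r in resp) + 1e-10
            mu = [sum(r[k] * x[d] for r, x in zip(resp, frames)) / nk
                  for d in range(dim)]
            var = [max(sum(r[k] * (x[d] - mu[d]) ** 2
                           for r, x in zip(resp, frames)) / nk, 1e-3)
                   for d in range(dim)]
            weights.append(nk / len(frames))
            means.append(mu)
            variances.append(var)
    return weights, means, variances


def map_adapt_means(frames, ubm, relevance=16.0):
    """Shift UBM means toward one language's data; weights and variances stay shared."""
    weights, means, variances = ubm
    resp = [posteriors(x, ubm)[0] for x in frames]
    adapted = []
    for k, mu in enumerate(means):
        nk = sum(r[k] for r in resp)
        # Same as alpha * E_k[x] + (1 - alpha) * mu, with alpha = nk / (nk + relevance).
        adapted.append([(sum(r[k] * x[d] for r, x in zip(resp, frames))
                         + relevance * mu[d]) / (nk + relevance)
                        for d in range(len(mu))])
    return weights, adapted, variances


def avg_loglik(frames, model):
    """Average per-frame log-likelihood of an utterance under one model."""
    return sum(posteriors(x, model)[1] for x in frames) / len(frames)


def identify(frames, models):
    """Return the language whose adapted model scores the utterance highest."""
    return max(models, key=lambda lang: avg_loglik(frames, models[lang]))


def toy_frames(rng, shift, n):
    """Synthetic 2-D frames: two shared clusters, offset per language on dimension 1."""
    return [(rng.choice((-3.0, 3.0)) + rng.gauss(0, 0.5), shift + rng.gauss(0, 0.5))
            for _ in range(n)]


rng = random.Random(0)
train = {"lang_a": toy_frames(rng, 1.0, 200), "lang_b": toy_frames(rng, -1.0, 200)}
ubm = train_ubm(train["lang_a"] + train["lang_b"], n_comp=2)  # pooled data
models = {lang: map_adapt_means(frames, ubm) for lang, frames in train.items()}
test_a = toy_frames(rng, 1.0, 100)
print(identify(test_a, models))
```

Because every language model starts from the same UBM, each class only has to supply enough data to move the component means, which is the data-efficiency argument the abstract makes against training a large GMM per language from scratch.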

Bibliographic details

Source: Procedia Computer Science, 2015, Issue 1, 6 pages
Authors: V. Ravi Kumar; Hari Krishna Vydana; Anil Kumar Vuppala
Original format: PDF
Chinese Library Classification: computing technology, computer technology