Development and Analysis of Speech Recognition Systems for Assamese Language Using HTK

HIMANGSHU SARMA; NAVANATH SAHARIA; UTPAL SHARMA

Abstract

Language analysis is essential for native speakers to connect with the digital world, yet Assamese remains a relatively unexplored language. In this report, we analyze different aspects of speech-to-text processing, from building a speech corpus and defining syllable rules to developing a speech search engine for Assamese. We collected and transcribed about 20 hours of speech in three modes (read, extempore, and conversation), and we discuss some of the issues and challenges faced while developing the corpus. We developed an automatic syllabification model for Assamese with 11 rules, which achieved an accuracy of more than 95%. We found 12 different syllable patterns, of which 5 are the most frequent; the longest syllable found is four letters. Using the Hidden Markov Model Toolkit (HTK) 3.5, we built a speech recognition model based on a deep-learning neural network and obtained 78.05% accuracy for automatic transcription of Assamese speech.

Bibliographic details

Source
ACM Transactions on Asian Language Information Processing | 2018, Issue 1 | pp. 7.1-7.14 | 14 pages
Authors
HIMANGSHU SARMA; NAVANATH SAHARIA; UTPAL SHARMA
Author affiliations

Indian Institute of Information Technology Manipur;

Indian Institute of Information Technology Manipur;

Tezpur University

Original format: PDF
Language of text: English
Keywords
Speech search engine; syllabification; automatic transcription; speech corpus; Assamese; HTK
Date added to database: 2022-08-18 04:03:41

Similar literature

1. Speech recognition with reference to Assamese language using novel fusion technique [J]. Sruti Sruba Bharali, Sanjib Kr. Kalita. International Journal of Speech Technology. 2018, No. 2
2. A Review on Marathi Language Speech Database Development for Automatic Speech Recognition (ASR) System [J]. Mrs. Chhaya S. Patil, Prof. Dr. Vaishali B. Patil. International Journal of Engineering Research and Applications. 2017, No. 3
3. Development of Speech Corpus and Speech Recognition System for Indonesian Language [J]. Sakriani Sakti, Paulus Hutagaol, Arry Akhmad Arman. 電子情報通信学会技術研究報告. 音声. Speech. 2004, No. 542
4. An Automatic Speech Recognition for the Filipino Language using the HTK System [C]. John Lorenzo Bautista, Yoon-Joong Kim. International Conference on Artificial Intelligence. 2014
5. Development of a speech recognition system using the Mel Frequency Cepstrum Coefficient method [D]. Mahajan, Mayur. 2016
6. Retrospective Analysis of Clinical Performance of an Estonian Speech Recognition System for Radiology: Effects of Different Acoustic and Language Models [O]. A. Paats, T. Alumäe, E. Meister. 2018
7. Research and Development of Continuous Speech Recognition Based on HTK and Microsoft Speech SDK [O]. 黄旭. 2007
