Journal: ACM Transactions on Asian Language Information Processing

Development and Analysis of Speech Recognition Systems for Assamese Language Using HTK

Abstract

Language analysis is important for connecting native speakers with the digital world, yet Assamese remains a relatively unexplored language. In this report, we analyze different aspects of speech-to-text processing: building a speech corpus, defining syllable rules, and finally developing a speech search engine for Assamese. We collected and transcribed about 20 hours of speech in three modes (read, extempore, and conversation), and we discuss some of the issues and challenges faced while developing the corpus. We developed an automatic syllabification model with 11 rules for the Assamese language, which achieved an accuracy of more than 95%. We found 12 different syllable patterns, 5 of which occur most frequently; the longest syllable found is four letters. Using the Hidden Markov Model Toolkit (HTK) 3.5, we built our speech recognition model on a deep-learning-based neural network and obtained 78.05% accuracy for automatic transcription of Assamese speech.
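The abstract does not say how the 78.05% transcription accuracy was scored. One common convention for HTK-based systems is word accuracy, (N - S - D - I) / N × 100, where N is the number of reference words and S, D, I are the substitutions, deletions, and insertions in an alignment of the recognizer output against the reference. The sketch below illustrates that formula under stated assumptions: the function names are hypothetical, the alignment uses uniform edit costs (so tie-breaking may differ from HTK's own scoring tool), and the tokens are placeholders, not data from the paper.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-edit alignment.

    Assumption: every edit costs 1; this is an illustration, not HTK's scorer.
    """
    rows, cols = len(ref) + 1, len(hyp) + 1
    # cost[i][j] = (total_edits, S, D, I) for ref[:i] aligned with hyp[:j]
    cost = [[None] * cols for _ in range(rows)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        cost[i][0] = (i, 0, i, 0)      # only deletions
    for j in range(1, cols):
        cost[0][j] = (j, 0, 0, j)      # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            t, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, n)            # match, no cost
            else:
                diag = (t + 1, s + 1, d, n)    # substitution
            t, s, d, n = cost[i - 1][j]
            dele = (t + 1, s, d + 1, n)        # deletion of a reference word
            t, s, d, n = cost[i][j - 1]
            ins = (t + 1, s, d, n + 1)         # insertion of a hypothesis word
            cost[i][j] = min(diag, dele, ins)
    _, s, d, n = cost[-1][-1]
    return s, d, n


def word_accuracy(ref, hyp):
    """Word accuracy in percent: (N - S - D - I) / N * 100."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    s, d, i = align_counts(ref, hyp)
    return 100.0 * (len(ref) - s - d - i) / len(ref)


# Placeholder tokens: one substitution (w2 -> x) and one deletion (w4).
reference = ["w1", "w2", "w3", "w4"]
hypothesis = ["w1", "x", "w3"]
print(word_accuracy(reference, hypothesis))
```

Because insertions are subtracted, this measure can fall below the plain percentage of correctly recognized words, and can even become negative when the output contains many spurious words.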
