Development of vocal tract length normalized phonetic engine for Gujarati and Marathi languages

Abstract

A phonetic engine (PE) is a system that converts speech sound units into symbols without using any higher-level information (such as semantic or linguistic details). This paper presents the development of PEs for two Indian languages, Gujarati and Marathi. To investigate PE performance, speech recorded in three modes (read, spontaneous, and lecture) is considered. The database contains a large number of speakers in each mode for both languages. To reduce the effect of speaker differences in the databases, vocal tract length normalization (VTLN) using the Lee-Rose method is incorporated. PE performance is tested using state-of-the-art Mel frequency cepstral coefficients (MFCCs) and vocal tract length normalized features. A hidden Markov model (HMM)-based approach is used to model the phonetic units. On average, the vocal tract length normalized PE improves on the MFCC-based PE by 3.12% and 1.32% for Gujarati and Marathi, respectively.
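The VTLN step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a piecewise-linear frequency warp and a grid search over candidate warp factors, which is the general shape of maximum-likelihood VTLN in the Lee-Rose style. The abstract does not state the warp function, the grid, or the scoring model, so the names `warp_frequencies`, `select_warp_factor`, `knee`, the 0.88 to 1.12 grid, and the toy scoring function are all assumptions made for illustration; a real system would score each candidate with the HMM likelihood of the warped features.

```python
import numpy as np


def warp_frequencies(freqs_hz, alpha, nyquist_hz, knee=0.85):
    """Piecewise-linear frequency warp (illustrative, not taken from the paper).

    Frequencies up to knee * nyquist_hz are scaled by alpha. Above that,
    a second linear segment maps nyquist_hz onto itself so the overall
    bandwidth is preserved.
    """
    if alpha * knee >= 1.0:
        raise ValueError("alpha * knee must be < 1 to keep the warp monotonic")
    f = np.asarray(freqs_hz, dtype=float)
    f0 = knee * nyquist_hz  # breakpoint between the two segments
    upper_slope = (nyquist_hz - alpha * f0) / (nyquist_hz - f0)
    return np.where(f <= f0, alpha * f, alpha * f0 + upper_slope * (f - f0))


def select_warp_factor(log_likelihood, alphas=None):
    """Grid search: return the candidate alpha with the highest score."""
    if alphas is None:
        # Assumed grid: 13 factors from 0.88 to 1.12 in steps of 0.02.
        alphas = np.round(np.linspace(0.88, 1.12, 13), 2)
    scores = [log_likelihood(a) for a in alphas]
    return float(alphas[int(np.argmax(scores))])


# Toy stand-in for an HMM log-likelihood, peaked at alpha = 1.04.
def toy_score(alpha):
    return -(alpha - 1.04) ** 2


best_alpha = select_warp_factor(toy_score)
warped = warp_frequencies([0.0, 1000.0, 8000.0], best_alpha, nyquist_hz=8000.0)
```

In this sketch the per-speaker factor is chosen once and then applied to the filterbank centre frequencies before cepstral analysis; whether the paper warps the filterbank or the spectrum is not stated in the abstract.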
