Journal: ACM Transactions on Asian Language Information Processing

Acoustic Features for Hidden Conditional Random Fields-Based Thai Tone Classification


Abstract

Tone information is necessary for Thai speech recognition systems. Previous studies show that many acoustic cues are associated with the shapes of tones. Nevertheless, most Thai tone classification studies have relied mainly on F0 values and their derivatives, without considering other acoustic features. This article investigates other acoustic features for Thai tone classification. In the experiment, energy values and spectral information, represented by three spectral-based features (the LPC-based, PLP-based, and MFCC-based features), are applied to HCRF-based Thai tone classification, which was reported as the best approach for Thai tone classification. The energy values provide an error rate reduction of 22.40% in the isolated-word scenario, but only slight improvements in the continuous-speech scenario. In contrast, spectral-based features contribute greatly to Thai tone classification in the continuous-speech scenario, whereas they slightly degrade performance in the isolated-word scenario. The best result in the continuous-speech scenario is obtained with the PLP-based feature, which yields an error rate reduction of 13.90%. The findings of this article are therefore that energy values are the main contributor to improved Thai tone classification performance in the isolated-word scenario, and that spectral-based features, especially the PLP-based feature, are the main contributor in the continuous-speech scenario.
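As a rough illustration of what "F0 values and their derivatives" combined with "energy values" can look like as a frame-level feature matrix, the following sketch may help. It is not the authors' pipeline: the function names, the 400-sample frame length, the 160-sample hop, and the use of central differences for the derivative are all assumptions made here, and the F0 track is assumed to come from a separate pitch tracker that is not shown.

```python
import numpy as np


def frame_log_energy(signal, frame_len=400, hop=160):
    """Log energy per frame (frame_len and hop are assumed values)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    energies = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        # Small constant avoids log(0) on silent frames.
        energies[i] = np.log(np.sum(frame ** 2) + 1e-10)
    return energies


def delta(track):
    """First-order derivative of a feature track (central differences)."""
    return np.gradient(np.asarray(track, dtype=float))


def tone_features(f0_track, energy_track):
    """Stack F0, delta-F0, energy, delta-energy into a (frames, 4) matrix.

    f0_track is assumed to be produced by an external pitch tracker.
    """
    n = min(len(f0_track), len(energy_track))
    f0 = np.asarray(f0_track[:n], dtype=float)
    en = np.asarray(energy_track[:n], dtype=float)
    return np.column_stack([f0, delta(f0), en, delta(en)])
```

A matrix of this kind would then be passed, one syllable at a time, to whatever sequence classifier is in use; the spectral features named in the abstract (LPC, PLP, MFCC) would be appended as further columns.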
