
Language Recognition Using Latent Dynamic Conditional Random Field Model with Phonological Features


Abstract

Spoken language recognition (SLR), the task of identifying the language of a speech utterance, is of increasing interest in multilingual speech recognition. Most existing SLR approaches apply statistical modeling techniques with acoustic and phonotactic features. Among the popular approaches, the acoustic approach has attracted greater interest than the others because it does not require any prior language-specific knowledge. Previous research on the acoustic approach has shown little interest in applying linguistic knowledge, which was used only as supplementary features, while the current state-of-the-art system assumes independence among features. This paper proposes an SLR system based on the latent-dynamic conditional random field (LDCRF) model using phonological features (PFs). We use PFs to represent both acoustic characteristics and linguistic knowledge. The LDCRF model is employed to capture the dynamics of the PF sequences for language classification. Baseline systems were built to evaluate the features and methods: Gaussian mixture model (GMM) based systems using PFs, a GMM using cepstral features, and a CRF model using PFs. Evaluated on the NIST LRE 2007 corpus, the proposed method improved on the baseline systems and achieved results comparable to an acoustic system based on i-vectors. This research demonstrates that utilizing PFs can enhance SLR performance.
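For readers unfamiliar with the model named in the abstract, the general LDCRF formulation can be sketched as follows. This is the standard definition from the LDCRF literature, not necessarily the exact variant or feature set used in this paper. The symbols are assumptions introduced here for illustration: x is the observation sequence (here, a sequence of PF vectors), y the label sequence, h a sequence of hidden states, H_y the set of hidden states reserved for label y (disjoint across labels), f_k the feature functions, and θ_k their weights.

```latex
% General LDCRF definition (illustrative; notation is assumed, not taken from the paper)
\begin{align}
P(\mathbf{y} \mid \mathbf{x}; \theta)
  &= \sum_{\mathbf{h} \,:\, h_j \in \mathcal{H}_{y_j}\ \forall j}
     P(\mathbf{h} \mid \mathbf{x}; \theta), \\
P(\mathbf{h} \mid \mathbf{x}; \theta)
  &= \frac{1}{Z(\mathbf{x}; \theta)}
     \exp\!\Big( \sum_{k} \theta_k \sum_{j} f_k(h_{j-1}, h_j, \mathbf{x}, j) \Big), \\
Z(\mathbf{x}; \theta)
  &= \sum_{\mathbf{h}'}
     \exp\!\Big( \sum_{k} \theta_k \sum_{j} f_k(h'_{j-1}, h'_j, \mathbf{x}, j) \Big).
\end{align}
```

Because each label owns its own disjoint set of hidden states, the hidden layer can model sub-structure within a class, while the transitions between hidden states capture the sequence dynamics that a frame-independent model such as a GMM ignores.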

Bibliographic record

  • Source
    Mathematical Problems in Engineering | 2014, Issue 3 | 250160.1-250160.16 | 16 pages
  • Author affiliations

    Department of Computer Engineering, Chulalongkorn University, Bangkok 10330, Thailand;

    Department of Computer Engineering, Chulalongkorn University, Bangkok 10330, Thailand;

    Department of Computer Engineering, Chulalongkorn University, Bangkok 10330, Thailand;

    HLT, National Electronics and Computer Technology Center (NECTEC), Bangkok 10400, Thailand;

    HLT, National Electronics and Computer Technology Center (NECTEC), Bangkok 10400, Thailand;

  • Original format: PDF
  • Text language: English
