Journal: Audio, Speech, and Language Processing, IEEE Transactions on

Dialect Classification via Text-Independent Training and Testing for Arabic, Spanish, and Chinese


Abstract

Automatic dialect classification has emerged as an important area of speech research. Effective dialect classification is useful in developing robust speech systems, such as speech recognition and speaker identification. In this paper, two novel algorithms are proposed to improve dialect classification for text-independent spontaneous speech in Arabic and Spanish, along with probe results for Chinese. The problem considers the case where dialect labels, but no transcripts, are available for the training and test data, and speakers are speaking spontaneously; this is defined as text-independent dialect classification. The Gaussian mixture model (GMM) is used as the baseline system for text-independent dialect classification. The major motivation is to suppress confused/distractive regions of the dialect language space and emphasize discriminative/sensitive information of the available dialects. In the training phase, a symmetric version of the Kullback–Leibler divergence is used to find the most discriminative GMM mixtures (KLD-GMM), so that the confused acoustic GMM region is suppressed. For testing, the more discriminative frames are detected according to their location in the GMM mixture feature space and used for decoding, which is termed frame selection decoding (FSD-GMM). The KLD-GMM and FSD-GMM techniques are shown to improve dialect classification performance for three-way dialect tasks. The two algorithms and their combination are evaluated on Arabic and Spanish dialect corpora. Measurable improvement is achieved in both cases over a generalized maximum-likelihood estimation GMM baseline (MLE-GMM).
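To make the training-phase idea more concrete, here is a minimal sketch of a symmetric Kullback–Leibler divergence between two Gaussian components. It rests on several assumptions that the abstract does not confirm: the components have diagonal covariances, "symmetric" means the sum of the two directed divergences, and the function names `kl_diag_gauss` and `symmetric_kl` are hypothetical. How the paper actually ranks or suppresses mixtures with this measure is not stated here, so the sketch covers only the distance computation.

```python
import math
from typing import Sequence


def kl_diag_gauss(mu_p: Sequence[float], var_p: Sequence[float],
                  mu_q: Sequence[float], var_q: Sequence[float]) -> float:
    """Closed-form KL(p || q) for two diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        # Per-dimension term: log variance ratio + scaled spread and mean gap - 1
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def symmetric_kl(mu_p: Sequence[float], var_p: Sequence[float],
                 mu_q: Sequence[float], var_q: Sequence[float]) -> float:
    """Assumed symmetric form: KL(p || q) + KL(q || p)."""
    return (kl_diag_gauss(mu_p, var_p, mu_q, var_q)
            + kl_diag_gauss(mu_q, var_q, mu_p, var_p))


# Two unit-variance 1-D Gaussians whose means differ by 1
print(symmetric_kl([0.0], [1.0], [1.0], [1.0]))  # prints 1.0
```

Under these assumptions, a larger value indicates that two components from different dialect models are easier to tell apart, while a value near zero marks a region where the models overlap.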