Journal: IEEE Transactions on Audio, Speech, and Language Processing

Time–Frequency Cepstral Features and Heteroscedastic Linear Discriminant Analysis for Language Recognition

Abstract

The shifted delta cepstrum (SDC) is a widely used feature extraction method for language recognition (LRE). Because it incorporates multiple frames, SDC has a wide temporal context and outperforms traditional delta and acceleration feature vectors. However, it also introduces correlation into the concatenated feature vector, which increases redundancy and may degrade the performance of backend classifiers. In this paper, we first propose a time–frequency cepstral (TFC) feature vector, which is obtained by performing a temporal discrete cosine transform (DCT) on the cepstrum matrix and selecting the transformed elements in a zigzag scan order. Beyond this, we increase discriminability by applying heteroscedastic linear discriminant analysis (HLDA) to the full cepstrum matrix. By imposing block diagonal matrix constraints, the large HLDA problem is reduced to several smaller HLDA problems, yielding a block diagonal HLDA (BDHLDA) algorithm with much lower computational complexity. Finally, the BDHLDA method is extended to the GMM domain, using the simpler TFC features during re-estimation to significantly improve computation speed. Experiments on the NIST 2003 and 2007 LRE evaluation corpora show that TFC is more effective than SDC, and that the GMM-based BDHLDA yields a lower equal error rate (EER) and minimum average cost (Cavg) than either the TFC or SDC approach.
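
The TFC construction described above (a temporal DCT over the cepstrum matrix, followed by zigzag selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the unnormalised DCT-II, the JPEG-style zigzag direction, and the `n_keep` parameter are all assumptions made for illustration, since the abstract does not specify the window length, the DCT normalisation, or how many elements are kept.

```python
import math


def dct_ii(x):
    """Unnormalised DCT-II of a 1-D sequence (assumed DCT variant)."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        for k in range(n)
    ]


def temporal_dct(cepstra):
    """Apply the DCT along time for each cepstral coefficient.

    cepstra: list of frames in a context window, each a list of
    cepstral coefficients. Returns mat[c][k], the k-th temporal DCT
    coefficient of cepstral coefficient c.
    """
    n_coef = len(cepstra[0])
    return [dct_ii([frame[c] for frame in cepstra]) for c in range(n_coef)]


def zigzag_indices(n_rows, n_cols):
    """(row, col) pairs in JPEG-style zigzag order (assumed scan direction)."""
    order = []
    for s in range(n_rows + n_cols - 1):
        # All cells on the anti-diagonal row + col == s, top row first.
        diag = [(r, s - r) for r in range(n_rows) if 0 <= s - r < n_cols]
        if s % 2 == 0:
            diag.reverse()
        order.extend(diag)
    return order


def tfc_features(cepstra, n_keep):
    """Keep the first n_keep zigzag-ordered elements (n_keep is hypothetical)."""
    mat = temporal_dct(cepstra)
    idx = zigzag_indices(len(mat), len(mat[0]))[:n_keep]
    return [mat[r][c] for r, c in idx]


if __name__ == "__main__":
    # Toy window: 4 frames of 2 cepstral coefficients each.
    window = [[1.0, 2.0], [1.5, 1.0], [0.5, 0.0], [1.0, -1.0]]
    print(tfc_features(window, n_keep=3))
```

The zigzag scan visits low-order cepstral and low-order temporal coefficients first, so truncating the scan keeps the elements in the low-index corner of the transformed matrix and drops the rest.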
