Journal: Selected Topics in Signal Processing, IEEE Journal of

Multilevel and Session Variability Compensated Language Recognition: ATVS-UAM Systems at NIST LRE 2009

Abstract

This paper presents the systems submitted by the ATVS Biometric Recognition Group to the 2009 Language Recognition Evaluation (LRE'09), organized by NIST. This edition differed from past evaluations in three main ways. First, the number of languages to be recognized grew to 23, up from 14 in 2007 and 7 in 2005. Second, data variability increased: in addition to conversational telephone speech (CTS), the evaluation included telephone speech excerpts extracted from Voice of America (VOA) radio broadcasts distributed over the Internet. Third, the volume of data grew to as much as 2 terabytes of speech for development, an order of magnitude more than in past evaluations. LRE'09 thus required participants to develop robust systems that could not only handle the session variability problem but also do so with reasonable computational resources. The ATVS submission consisted of state-of-the-art acoustic and high-level systems focusing on these issues. The submission also explored in depth how to combine and calibrate the information obtained at different levels of the speech signal. The paper makes two original contributions. The first applies a session variability compensation scheme based on factor analysis (FA) in the statistics domain to an SVM-supervector (SVM-SV) approach. The second is a novel back-end based on anchor models that fuses the individual systems prior to one-versus-all calibration via logistic regression. Results on both development and evaluation corpora show the robustness and excellent performance of the submitted systems; for example, our system ranked second in the 30-second open-set condition while using remarkably few computational resources.
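
The abstract gives no implementation details, so the following is only a rough sketch of the general idea behind the second contribution: represent each utterance by the stacked scores of all individual systems, then train one binary logistic regression per language (one-versus-all) on those stacked vectors. It is not the authors' anchor-model back-end. The function names (`stack_scores`, `train_one_vs_all`, `log_odds`), the plain gradient-descent training, and every hyperparameter are assumptions made for illustration.

```python
import math


def stack_scores(per_system_scores):
    """Concatenate each utterance's per-system language scores into one vector.

    per_system_scores: list of systems; each system is a list of utterances;
    each utterance is a list of scores, one per language (assumed layout).
    """
    stacked = []
    for utt_scores in zip(*per_system_scores):
        vec = []
        for scores in utt_scores:
            vec.extend(scores)
        stacked.append(vec)
    return stacked


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_one_vs_all(X, y, n_langs, lr=0.1, epochs=200, l2=1e-3):
    """Fit one binary logistic regression per language by batch gradient descent.

    Returns a list of (weights, bias) pairs, one per language.
    """
    n, d = len(X), len(X[0])
    models = []
    for k in range(n_langs):
        w, b = [0.0] * d, 0.0
        for _ in range(epochs):
            gw, gb = [0.0] * d, 0.0
            for x, label in zip(X, y):
                target = 1.0 if label == k else 0.0
                z = sum(wi * xi for wi, xi in zip(w, x)) + b
                err = sigmoid(z) - target
                for j in range(d):
                    gw[j] += err * x[j]
                gb += err
            # Average gradient plus L2 shrinkage on the weights only.
            w = [wi - lr * (gj / n + l2 * wi) for wi, gj in zip(w, gw)]
            b -= lr * gb / n
        models.append((w, b))
    return models


def log_odds(models, x):
    """Return one fused log-odds score per language for a stacked vector x."""
    return [sum(wi * xi for wi, xi in zip(w, x)) + b for w, b in models]
```

In this sketch, each language's detector sees the scores of every system for every language, so a single linear layer does both fusion and calibration. A real back-end would train on held-out development data rather than on the data it is later scored against.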
