Odyssey 2010: The Speaker and Language Recognition Workshop

Improved Modeling of Cross-Decoder Phone Co-occurrences in SVM-based Phonotactic Language Recognition


Abstract

Most common approaches to phonotactic language recognition rely on several independent phone decodings. These decodings are processed and scored in a fully uncoupled way, so their time alignment (and any information that could be extracted from it) is completely lost. Recently, a new approach to phonotactic language recognition was presented [1] that takes time alignment information into account by considering cross-decoder phone co-occurrences at the frame level, under two language modeling paradigms: smoothed n-grams and Support Vector Machines (SVM). Experiments on the NIST LRE2007 database demonstrated that phone co-occurrence statistics could improve the performance of baseline phonotactic recognizers. In this paper, two variants of the SVM-based cross-decoder phone co-occurrence approach are proposed, considering: (1) n-grams (up to 3-grams) of phone co-occurrences; and (2) co-occurrences of phone n-grams (up to 3-grams). To evaluate these approaches, open software was used (Brno University of Technology phone decoders, LIBLINEAR and FoCal), and experiments were carried out on the NIST LRE2007 database. Unlike those presented in [1], the two approaches presented in this paper outperformed the baseline phonotactic system, yielding around 16% relative improvement in terms of equal error rate (EER). The best fused system attained a 1.88% EER (a 30% improvement with regard to the baseline system), which supports the use of cross-decoder dependencies for language modeling.
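The two variants named in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the function names are hypothetical, the input is assumed to be two frame-synchronous phone label sequences (one per decoder), and collapsing runs of identical consecutive labels before forming n-grams is an assumption made here for illustration. The resulting counts would presumably be turned into feature vectors for a linear SVM, a step that is not shown.

```python
from collections import Counter
from itertools import groupby


def collapse(seq):
    """Merge runs of identical consecutive labels into a single symbol."""
    return [label for label, _ in groupby(seq)]


def ngrams(seq, n):
    """Return all contiguous n-grams of seq as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def cooccurrence_ngrams(frames_a, frames_b, max_n=3):
    """Variant 1 (sketch): n-grams over the sequence of frame-level phone pairs."""
    # Pair the two decoders frame by frame, then collapse repeated pairs
    # (assumption: n-grams should span label changes, not repeated frames).
    pairs = collapse(list(zip(frames_a, frames_b)))
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(pairs, n))
    return counts


def frame_ngram_labels(frames, n):
    """Label each frame with the phone n-gram ending at its segment.

    Frames whose segment has fewer than n - 1 predecessors get None.
    """
    segments = [(label, len(list(run))) for label, run in groupby(frames)]
    phones = [label for label, _ in segments]
    labels = []
    for i, (_, length) in enumerate(segments):
        gram = tuple(phones[i - n + 1:i + 1]) if i >= n - 1 else None
        labels.extend([gram] * length)
    return labels


def ngram_cooccurrences(frames_a, frames_b, max_n=3):
    """Variant 2 (sketch): frame-level co-occurrences of per-decoder phone n-grams."""
    counts = Counter()
    for n in range(1, max_n + 1):
        labels_a = frame_ngram_labels(frames_a, n)
        labels_b = frame_ngram_labels(frames_b, n)
        counts.update(
            (a, b)
            for a, b in zip(labels_a, labels_b)
            if a is not None and b is not None
        )
    return counts


# Toy example: five frames decoded by two hypothetical phone decoders.
frames_a = ["a", "a", "b", "b", "c"]
frames_b = ["x", "y", "y", "y", "z"]
variant1 = cooccurrence_ngrams(frames_a, frames_b)
variant2 = ngram_cooccurrences(frames_a, frames_b)
```

In variant 1 the unit is the cross-decoder pair and the n-gram is built over pairs; in variant 2 the n-gram is built inside each decoder first and the pairing happens afterwards, so counts are weighted by the number of frames on which the two n-grams overlap.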
