Improving ASR error detection with non-decoder based features

Abstract

This study reports error detection experiments in large vocabulary automatic speech recognition (ASR) systems using statistical classifiers. We explored new features gathered from knowledge sources other than the decoder itself: a binary feature that compares the outputs of two different ASR systems word by word; a feature based on the number of hits for the hypothesized bigrams, obtained by querying a very popular Web search engine; and a feature related to automatically inferred topics at the sentence and word levels. Experiments were conducted on a European Portuguese broadcast news corpus. Combining the baseline decoder-based features with two of these additional features led to significant improvements over a baseline using only decoder-based features: the classification error rate (CER) fell from 13.87% to 12.16% with a maximum entropy model, and from 14.01% to 12.39% with linear-chain conditional random fields.
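
The word-by-word comparison of two ASR outputs could be sketched as follows. This is only an illustration: the abstract does not say how the two hypotheses are aligned, so the use of Python's `difflib.SequenceMatcher` for alignment, the function name `agreement_features`, and the example sentences are all assumptions, not the authors' method.

```python
from difflib import SequenceMatcher


def agreement_features(primary, secondary):
    """Return one 0/1 flag per word of `primary`.

    A flag is 1 when the word falls inside a block that the aligner
    matched with `secondary`, and 0 otherwise. The alignment strategy
    (difflib's longest-matching-block heuristic) is an assumption made
    for this sketch.
    """
    flags = [0] * len(primary)
    matcher = SequenceMatcher(a=primary, b=secondary, autojunk=False)
    for block in matcher.get_matching_blocks():
        # Each block is (start in primary, start in secondary, length).
        for i in range(block.a, block.a + block.size):
            flags[i] = 1
    return flags


# Hypothetical outputs from two recognizers for the same utterance.
system_a = "the cat sat on the mat".split()
system_b = "the cat sat in the mat".split()
print(agreement_features(system_a, system_b))
```

Each flag would then be passed to the error classifier alongside the decoder-based features for the corresponding word; a disagreement between the two systems suggests the word is more likely to be a recognition error.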
