Journal: IEEE Transactions on Speech and Audio Processing

Data Mining for Detecting Errors in Dictation Speech Recognition

Abstract

The efficiency promised by a dictation speech recognition (DSR) system is lessened by the need for correcting recognition errors. Error detection is the precursor of error correction. Developing effective techniques for error detection can thus lead to improved error correction. Current research on error detection has focused mainly on transcription and/or domain-specific speech. Error detection in DSR has been studied less. We propose data mining models for detecting errors in DSR. Instead of relying on internal parameters from DSR systems, we propose a loosely coupled approach to error detection based on features extracted from the DSR output. The features mainly came from two sources: confidence scores and linguistic parsing. Link grammar was innovatively applied to error detection. Three data mining techniques, including Naïve Bayes, neural networks, and Support Vector Machines (SVMs), were evaluated on 5M DSR corpora. The experimental results showed that significant performance was achieved in that F-measures for error detection ranged from 55.3% to 62.5%. This study provided insights into the merit of different data-mining techniques and different types of features in error detection.
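The abstract reports results as F-measures, so a minimal sketch of that metric may help when reading the numbers. The code below is an illustration only and is not the paper's method: it assumes word-level binary labels (error or not), and it uses a hypothetical confidence-threshold detector (`flag_low_confidence`, threshold 0.5) in place of the Naïve Bayes, neural network, and SVM models that the paper evaluates. The scores and labels are made up for the example.

```python
from typing import List, Sequence


def f_measure(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Balanced F-measure for the 'error' class: 2PR / (P + R)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)        # errors correctly flagged
    fp = sum(1 for p, a in pairs if p and not a)    # correct words wrongly flagged
    fn = sum(1 for p, a in pairs if a and not p)    # errors that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def flag_low_confidence(scores: Sequence[float], threshold: float = 0.5) -> List[bool]:
    """Hypothetical baseline: flag a word as an error if its confidence is low."""
    return [s < threshold for s in scores]


# Made-up per-word confidence scores and ground-truth error labels.
scores = [0.95, 0.40, 0.88, 0.30, 0.45, 0.91]
is_error = [False, True, False, True, False, True]

predicted = flag_low_confidence(scores)
print(f"F-measure: {f_measure(predicted, is_error):.3f}")
```

On this toy data the baseline catches two of the three errors and raises one false alarm, so precision and recall are both 2/3. A classifier trained on confidence scores plus parsing features, as the abstract describes, would replace `flag_low_confidence` while the evaluation step stays the same.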
