Conference: IEEE Workshop on Spoken Language Technology

The influence of automatic speech recognition accuracy on the performance of an automated speech assessment system

Abstract

The effectiveness of automated scoring systems for evaluating spoken language proficiency depends greatly on the quality of the automatic speech recognition (ASR) output that is used to calculate the features for the scoring model. In this paper, we examine the effects of ASR word error rate (WER) on the scores produced by a system for automated scoring of non-native English speaking proficiency, as well as on the scoring model features (especially content features), in order to demonstrate the impact of ASR improvements on the performance of the automated speech assessment system. Five different sets of transcriptions with WERs ranging from 0% to 52% (including four sets of ASR hypotheses and manual transcriptions) were obtained for a dataset of spoken responses from a pilot administration of an assessment of non-native English speaking proficiency. The experimental results show that higher-performing ASR leads to better performance in the automated assessment system; furthermore, the correlation between human and automated scores drops substantially as WER increases from 10.7% to 28.9%, whereas the correlation changes little within the following two WER ranges: 0% to 10.7% and 28.9% to 52%. A detailed analysis of the features used in the scoring model shows that ASR errors have a bigger impact on the content features than on the delivery and language-use features.
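
The WER values quoted in the abstract are conventionally defined as the word-level edit distance between a reference transcription and an ASR hypothesis, divided by the number of reference words. The following is a minimal sketch under that assumption; the function name `word_error_rate` and the whitespace tokenization are illustrative choices and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / number of reference words.

    Assumption: words are separated by whitespace and compared exactly;
    real evaluations usually normalize case and punctuation first.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[len(hyp)] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

A manual transcription compared with itself gives a WER of 0, which corresponds to the 0% end of the range described above. Because insertions are counted, WER can exceed 1 when the hypothesis is much longer than the reference.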