Published in: Computer Speech and Language

Speaker-adapted confidence measures for speech recognition of video lectures

Abstract

Automatic speech recognition applications can benefit from a confidence measure (CM) that predicts the reliability of the output. Previous work showed that a word-dependent naive Bayes (NB) classifier outperforms the conventional word posterior probability as a CM. However, a discriminative formulation usually yields better performance because of the training techniques available for it. Taking this into account, we propose a logistic regression (LR) classifier defined with simple input functions that approximates the NB behaviour. Additionally, as our main contribution, we propose to adapt the CM to the speaker in cases where the speakers can be identified, such as online lecture repositories. Experiments show that speaker-adapted models outperform their non-adapted counterparts on two difficult tasks, English (videoLectures.net) and Spanish (poliMedia) educational lectures. They also show that the proposed LR classifier clearly outperforms the NB model.
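As a rough illustration of the idea, the following Python sketch trains a logistic regression CM on a speaker-independent pool and then adapts it to one speaker. The abstract does not specify the input functions or the adaptation procedure, so everything concrete here is an assumption: the single feature (the word posterior probability plus a bias term), the adaptation scheme (continuing gradient descent from the speaker-independent weights with a quadratic penalty that keeps the weights close to them), the names `train`, `confidence`, `prior` and `tau`, and the toy data, which is invented for the example and is not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def confidence(weights, features):
    """Estimated probability that a recognised word is correct."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))


def train(samples, init, prior=None, tau=0.0, lr=0.5, epochs=500):
    """Batch gradient descent on the mean log-loss.

    samples: list of (features, label); label 1 = correct word, 0 = error.
    prior, tau: optional quadratic penalty tau/2 * ||w - prior||^2, used
    here as a hypothetical way to keep adapted weights near the
    speaker-independent ones.
    """
    w = list(init)
    n = len(samples)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in samples:
            err = confidence(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / n
        if prior is not None:
            for j in range(len(w)):
                grad[j] += tau * (w[j] - prior[j])
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


# Toy data: features are [bias, word posterior probability].
pool = [([1.0, 0.9], 1), ([1.0, 0.8], 1), ([1.0, 0.7], 1), ([1.0, 0.6], 1),
        ([1.0, 0.4], 0), ([1.0, 0.3], 0), ([1.0, 0.2], 0), ([1.0, 0.1], 0)]

# A made-up speaker for whom the recogniser is overconfident:
# words with posterior 0.6 and 0.7 are actually errors.
speaker = [([1.0, 0.95], 1), ([1.0, 0.9], 1),
           ([1.0, 0.7], 0), ([1.0, 0.6], 0), ([1.0, 0.3], 0)]

si = train(pool, [0.0, 0.0])                  # speaker-independent model
sa = train(speaker, si, prior=si, tau=0.1)    # speaker-adapted model

print("SI confidence at posterior 0.7:", confidence(si, [1.0, 0.7]))
print("SA confidence at posterior 0.7:", confidence(sa, [1.0, 0.7]))
```

In this toy setting the adapted model should assign a lower confidence to a word with posterior 0.7 than the speaker-independent model does, because this speaker's mid-range posteriors turned out to be unreliable. The penalty weight `tau` trades off fitting the speaker's data against staying close to the pooled model, which matters when only a few words per speaker are available.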
