International conference on multimodal interfaces and workshop on machine learning for multimodal interfaces 2009

Cache-based Language Model Adaptation using Visual Attention for ASR in Meeting Scenarios



Abstract

In a typical group meeting involving discussion and collaboration, people look at one another, at shared information resources such as presentation material, and sometimes at nothing in particular. In this work we investigate whether knowing what a person is looking at can improve the performance of Automatic Speech Recognition (ASR). We propose a framework for cache Language Model (LM) adaptation in which the cache is built from a person's Visual Attention (VA) sequence. The framework estimates the appropriateness of adaptation from characteristics of the VA sequence. Evaluation on the AMI Meeting Corpus shows reduced LM perplexity. This work demonstrates the potential of cache-based LM adaptation using VA information for large-vocabulary ASR deployed in meeting scenarios.
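The core idea of cache-based LM adaptation can be sketched as linear interpolation between a base LM probability and a unigram distribution estimated from a cache of recently attended words (e.g. text from a slide the speaker is looking at). The function names, the interpolation weight, and the smoothing floor below are illustrative assumptions, not details taken from the paper:

```python
from collections import Counter
import math

def cache_unigram(cache_words):
    """Unigram distribution over the cache (e.g. words from attended slides)."""
    counts = Counter(cache_words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def adapted_prob(word, base_prob, cache_dist, lam):
    """Linearly interpolate the base LM probability with the cache unigram.

    lam is the (illustrative) cache weight in [0, 1].
    """
    return (1.0 - lam) * base_prob + lam * cache_dist.get(word, 0.0)

def perplexity(words, base_lm, cache_dist, lam):
    """Perplexity of a word sequence under the interpolated model.

    Unknown words get a small floor probability (1e-6, an assumption)
    so the log does not diverge.
    """
    log_prob = 0.0
    for w in words:
        p = adapted_prob(w, base_lm.get(w, 1e-6), cache_dist, lam)
        log_prob += math.log2(p)
    return 2.0 ** (-log_prob / len(words))
```

When the test words overlap with the cache, the interpolated model assigns them higher probability than the base LM alone, which is the mechanism behind the perplexity reduction reported in the abstract.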
