Source: IEEE International Conference on Acoustics Speech and Signal; ICASSP 2010
Using n-best recognition output for extractive summarization and keyword extraction in meeting speech

Abstract

Interest in meeting understanding has grown recently, covering tasks such as summarization, browsing, action item detection, and topic segmentation. However, very little work has used rich recognition output (e.g., recognition confidence measures or additional recognition candidates) for these downstream tasks. This paper presents an initial study using n-best recognition hypotheses for two tasks: extractive summarization and keyword extraction. We extend the approaches used on 1-best output to n-best hypotheses: MMR (maximum marginal relevance) for summarization and TFIDF (term frequency, inverse document frequency) weighting for keyword extraction. Our experiments on the ICSI meeting corpus show promising improvements from using n-best hypotheses over 1-best output. These results suggest that n-best lists or lattices are worth further study as the interface between speech recognition and downstream tasks.
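
As a rough illustration of the two techniques named in the abstract, the following Python sketch applies TFIDF weighting and greedy MMR selection to n-best hypotheses. The abstract does not say how the authors combine hypotheses, so everything here is an assumption made for illustration: term counts are pooled across each utterance's hypotheses using per-hypothesis weights (uniform by default), IDF is computed over utterances rather than over a document collection, and the function names (tf_vector, idf_table, mmr_select, top_keywords) and the trade-off parameter lam are hypothetical.

```python
import math
from collections import Counter


def tf_vector(hypotheses, weights=None):
    """Pool term counts over the n-best hypotheses of one utterance.

    Assumption: each hypothesis contributes its counts scaled by a weight
    (uniform unless weights are given, e.g. normalized confidence scores).
    """
    if weights is None:
        weights = [1.0 / len(hypotheses)] * len(hypotheses)
    tf = Counter()
    for hyp, w in zip(hypotheses, weights):
        for tok in hyp.lower().split():
            tf[tok] += w
    return tf


def idf_table(tfs):
    """Inverse document frequency, treating each utterance as a 'document'."""
    n = len(tfs)
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())
    return {t: math.log(n / d) for t, d in df.items()}


def tfidf(tf, idf):
    """Multiply pooled term frequencies by their IDF values."""
    return {t: c * idf[t] for t, c in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(t, 0.0) for t, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def _sum_vectors(vectors):
    total = Counter()
    for v in vectors:
        for t, x in v.items():
            total[t] += x
    return total


def mmr_select(vectors, k, lam=0.7):
    """Greedy MMR: trade relevance to the whole meeting against redundancy."""
    doc = _sum_vectors(vectors)
    selected = []
    candidates = list(range(len(vectors)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * cosine(vectors[i], doc) - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


def top_keywords(vectors, k):
    """Rank terms by their summed weight across all utterances."""
    return [t for t, _ in _sum_vectors(vectors).most_common(k)]
```

To use the sketch, build one pooled vector per utterance with tf_vector, compute idf_table over those vectors, weight each with tfidf, and pass the result to mmr_select (for summary utterance indices) or top_keywords. Passing a single hypothesis per utterance reduces it to the 1-best case, which makes the two settings easy to compare.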
