Conference: Annual International ACM SIGIR Conference on Research and Development in Information Retrieval

Automatic generation of concise summaries of spoken dialogues in unrestricted domains

Abstract

Automatic summarization of open-domain spoken dialogues is a new research area. This paper introduces the task and the challenges involved, and presents an approach to obtaining automatic extract summaries for multi-party dialogues of four different genres, without any restriction on domain. We address the following issues, which are intrinsic to spoken dialogue summarization and can typically be ignored when summarizing written text such as newswire data: (i) detection and removal of speech disfluencies; (ii) detection and insertion of sentence boundaries; (iii) detection and linking of cross-speaker information units (question-answer pairs). A global system evaluation using a corpus of 23 relevance-annotated dialogues containing 80 topical segments shows that, for the two more informal genres, our summarization system using dialogue-specific components significantly outperforms a baseline using TF-IDF term weighting with maximum marginal relevance (MMR) ranking.
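The baseline named in the abstract, TF-IDF term weighting combined with MMR ranking, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`tfidf_vectors`, `centroid`, `mmr_rank`), the smoothed IDF formula, the use of the segment centroid as the query, and the trade-off weight `lam` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(turns):
    """Return one sparse TF-IDF vector (a dict) per tokenized turn.

    Assumption: idf = log(N / df) + 1, one common smoothing choice.
    """
    n = len(turns)
    df = Counter(term for turn in turns for term in set(turn))
    vectors = []
    for turn in turns:
        tf = Counter(turn)
        vectors.append({t: c * (math.log(n / df[t]) + 1.0) for t, c in tf.items()})
    return vectors


def centroid(vectors):
    """Sum the turn vectors; used here as a stand-in query for the segment."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    return dict(total)


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_rank(query, candidates, k, lam=0.7):
    """Greedily pick k candidate indices by maximum marginal relevance.

    Each step scores a candidate as
        lam * sim(candidate, query) - (1 - lam) * max sim(candidate, selected)
    so relevant turns are preferred and near-duplicates are penalized.
    """
    selected = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        def score(i):
            relevance = cosine(candidates[i], query)
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    # Toy, made-up turns: the first two are identical on purpose.
    turns = [
        ["budget", "meeting", "monday"],
        ["budget", "meeting", "monday"],
        ["lunch", "pizza"],
    ]
    vectors = tfidf_vectors(turns)
    print(mmr_rank(centroid(vectors), vectors, k=2, lam=0.7))
```

With these toy turns and `lam=0.7`, the second pick is the lunch turn rather than the duplicate budget turn, because the redundancy penalty outweighs the duplicate's higher relevance. Setting `lam=1.0` switches the penalty off and reduces the ranking to plain TF-IDF relevance.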
