Automatic summarization of open-domain spoken dialogues is a new research area. This paper introduces the task and the challenges involved, and presents an approach to obtaining automatic extract summaries for multi-party dialogues of four different genres, without any restriction on domain. We address the following issues, which are intrinsic to spoken dialogue summarization and can typically be ignored when summarizing written text such as newswire data: (i) detection and removal of speech disfluencies; (ii) detection and insertion of sentence boundaries; (iii) detection and linking of cross-speaker information units (question-answer pairs). A global system evaluation using a corpus of 23 relevance-annotated dialogues containing 80 topical segments shows that, for the two more informal genres, our summarization system using dialogue-specific components significantly outperforms a baseline using TF-IDF term weighting with maximum marginal relevance (MMR) ranking.
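To make the baseline concrete, the following is a minimal sketch of extractive selection with TF-IDF weighting and MMR ranking. It is a generic illustration, not the configuration evaluated in the paper: the whitespace tokenizer, the smoothed IDF formula, the trade-off parameter `lam`, and the function names `tfidf_vectors`, `cosine`, and `mmr_select` are all assumptions introduced here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict term -> weight) per tokenized doc."""
    n = len(docs)
    # Document frequency: number of docs in which each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(units, query, k, lam=0.5):
    """Greedily pick k units, trading query relevance against redundancy.

    Score of candidate i: lam * sim(i, query) - (1 - lam) * max sim(i, selected).
    """
    tokenized = [u.lower().split() for u in units] + [query.lower().split()]
    vectors = tfidf_vectors(tokenized)
    unit_vecs, query_vec = vectors[:-1], vectors[-1]

    selected = []
    candidates = list(range(len(units)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(unit_vecs[i], query_vec)
            redundancy = max(
                (cosine(unit_vecs[i], unit_vecs[j]) for j in selected),
                default=0.0,
            )
            return lam * relevance - (1.0 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return [units[i] for i in selected]
```

With `lam=1.0` the ranking reduces to pure TF-IDF relevance, so near-duplicate utterances can both be extracted. Lower values penalize candidates that resemble units already in the summary.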