Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Combining Relevance Language Modeling and Clarity Measure for Extractive Speech Summarization


Abstract

Extractive speech summarization, which aims to select an indicative set of sentences from a spoken document so as to succinctly represent the most important aspects of the document, has attracted much research attention over the years. In this paper, we cast extractive speech summarization as an ad-hoc information retrieval (IR) problem and investigate various language modeling (LM) methods for important sentence selection. The main contributions of this paper are four-fold. First, we explore a novel sentence modeling paradigm built on the notion of relevance, where the relationship between a candidate summary sentence and the spoken document to be summarized is discovered through different granularities of context for relevance modeling. Second, both lexical and topical cues inherent in the spoken document are exploited for sentence modeling. Third, we propose a novel clarity measure for use in important sentence selection. It helps quantify the thematic specificity of each individual sentence, which we regard as a crucial indicator orthogonal to the relevance measure provided by the LM-based methods. Fourth, in an attempt to lessen the summarization performance degradation caused by imperfect speech recognition, we investigate different levels of index features for LM-based sentence modeling, including words, subword-level units, and their combination. Experiments on broadcast news summarization suggest that our methods compare favorably with several existing well-developed and/or state-of-the-art methods.
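The abstract does not give formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the paper's implementation. It assumes that relevance is the log-likelihood of the document under a smoothed unigram sentence model (the usual query-likelihood view in IR), and that clarity is the KL divergence between the sentence model and a background model. The function names, the smoothing weight `lam`, the combination weight `alpha`, and the additive combination itself are all hypothetical.

```python
import math
from collections import Counter


def unigram(tokens):
    """Maximum-likelihood unigram model: word -> relative frequency."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def relevance(sentence, document, background, lam=0.5):
    """log P(document | sentence) under a sentence unigram model
    interpolated with a background model (Jelinek-Mercer smoothing).
    Assumes every document word occurs in `background`.
    `lam` is an illustrative weight, not a value from the paper."""
    sent_model = unigram(sentence)
    score = 0.0
    for w in document:
        p = lam * sent_model.get(w, 0.0) + (1.0 - lam) * background[w]
        score += math.log(p)
    return score


def clarity(sentence, background):
    """KL divergence of the sentence model from the background model.
    Assumed form of a clarity score: larger values mean the sentence's
    word usage departs more from general usage."""
    sent_model = unigram(sentence)
    return sum(p * math.log(p / background[w]) for w, p in sent_model.items())


def rank_sentences(sentences, collection_tokens, alpha=1.0, lam=0.5):
    """Return sentence indices sorted from highest to lowest combined score.
    The additive combination with weight `alpha` is an assumption.
    `collection_tokens` must include the document's own tokens."""
    document = [w for s in sentences for w in s]
    background = unigram(collection_tokens)
    scored = []
    for i, s in enumerate(sentences):
        score = relevance(s, document, background, lam) + alpha * clarity(s, background)
        scored.append((score, i))
    return [i for _, i in sorted(scored, reverse=True)]


if __name__ == "__main__":
    # Toy tokenized "spoken document" and a tiny background collection.
    sentences = [
        ["storm", "hits", "coast"],
        ["storm", "damage", "reported"],
        ["weather", "today"],
    ]
    doc_tokens = [w for s in sentences for w in s]
    collection = doc_tokens + ["sports", "news", "today"]
    print(rank_sentences(sentences, collection))
```

In this sketch the top-ranked indices would be taken as the summary until a length budget is reached. Swapping word tokens for subword units only changes how the token lists are built.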
