Published in: IEEE Transactions on Audio, Speech, and Language Processing

Extractive Speech Summarization Using Shallow Rhetorical Structure Modeling


Abstract

We propose an extractive summarization approach with a novel shallow rhetorical structure learning framework for speech summarization. One of the most under-utilized features in extractive summarization is hierarchical structure information, that is, the semantically cohesive units hidden in spoken documents. We first present empirical evidence that rhetorical structure is the underlying semantic information, which is rendered in linguistic and acoustic/prosodic forms in lecture speech. To test this hypothesis, we first propose a segmental summarization method in which the document is partitioned into rhetorical units by K-means clustering. This system produces summaries at 67.36% ROUGE-L F-measure, a 4.29% absolute increase over the baseline system. We then propose Rhetorical-State Hidden Markov Models (RSHMMs) to automatically decode the underlying hierarchical rhetorical structure in speech. Tenfold cross-validation experiments are carried out on conference speeches. A system based on RSHMMs gives a 71.31% ROUGE-L F-measure, an 8.24% absolute increase in lecture speech summarization performance over the baseline system without RSHMMs. Our method equally outperforms the baseline with a conventional discourse feature. We also present a thorough investigation of the relative contribution of different features and show that, for lecture speech, speaker-normalized acoustic features contribute the most, at 68.5% ROUGE-L F-measure, compared with 62.9% for linguistic features and 59.2% for un-normalized acoustic features. This shows that the individual speaking style of each speaker is highly relevant to summarization.
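
All results above are reported as ROUGE-L F-measure, and both reported gains imply the same baseline score (67.36 - 4.29 = 63.07 and 71.31 - 8.24 = 63.07). The abstract does not say how ROUGE-L was configured, so the following is only a minimal sketch of the general idea: a sentence-level score built on the longest common subsequence (LCS) of whitespace-separated tokens, with a balanced F-measure (beta = 1) assumed. The function names `lcs_length` and `rouge_l_f` are illustrative and do not come from the paper.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-measure on whitespace tokens.

    Assumption: beta = 1 (balanced F); the paper's exact setting is unknown.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of the candidate covered by the LCS
    recall = lcs / len(ref)      # share of the reference covered by the LCS
    b2 = beta * beta
    return (1 + b2) * precision * recall / (recall + b2 * precision)
```

For example, `rouge_l_f("a b c d", "a c")` has an LCS of 2 tokens, giving precision 0.5, recall 1.0, and an F-measure of 2/3.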
