Published in: International Conference on Text, Speech and Dialogue

Speech-to-Text Summarization Using Automatic Phrase Extraction from Recognized Text

Abstract

This paper describes a system developed to summarize news delivered orally. The system generates text summaries from input audio using three independent components: an automatic speech recognizer, a syntactic analyzer, and a summarizer. The recognized text contains no sentence boundaries, which complicates summarization, so we use the syntactic analyzer to identify continuous segments in the recognized text. We evaluated the system on 50 reference articles; the data are publicly available at http://nlp.ite.tul.cz/sumarizace. The results of the proposed system were compared with the results of sentence summarization of the reference articles. The evaluation used the co-occurrence of n-grams in the reference and generated summaries, together with reader mark-ups. Readers rated two aspects of the summaries: readability and information relevance. Experiments confirm that the generated summaries have the same information value as the reference summaries. However, readers report that phrase summaries are hard to read without the context of the whole sentence.
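
The abstract does not specify which n-gram co-occurrence measure was used. As an illustration only, the following minimal sketch assumes a ROUGE-N style recall: the share of reference n-grams that also occur in the generated summary. The function names, the whitespace tokenization, and the example sentences are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def ngram_recall(reference, candidate, n=1):
    """Assumed ROUGE-N style recall, not necessarily the paper's metric.

    Counts how many reference n-grams also appear in the candidate,
    clipping each n-gram at the number of times the candidate contains it.
    """
    # Naive lowercase whitespace tokenization (an assumption for this sketch).
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum((ref & cand).values())  # Counter & keeps the minimum counts
    return overlap / total


# Hypothetical example: a sentence summary versus a phrase-like summary.
reference = "the government approved the new budget"
candidate = "government approved new budget"
print(ngram_recall(reference, candidate, n=1))
print(ngram_recall(reference, candidate, n=2))
```

A measure of this kind rewards content overlap but ignores fluency, which fits the abstract's observation that phrase summaries can match the reference summaries in information value while still being harder to read.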