Journal: Computer Speech and Language

To what extent does content selection affect surface realization in the context of headline generation?



Abstract

Headline generation is a task in which the most important information of a news article is condensed into a single short sentence. This task is normally addressed by summarization techniques, ideally combining extractive and abstractive methods with sentence compression or fusion techniques. Although Natural Language Generation (NLG) techniques have not been directly exploited for headline generation, they may provide better mechanisms than summarization techniques for paraphrasing the information in a text. Therefore, this paper analyzes and evaluates the effectiveness of NLG techniques for generating headlines. In NLG, content selection and surface realization are equally important: there is no point in generating text without knowing the topic. Considering this premise, we take HanaNLG, a hybrid surface realization approach, as a basis and analyze the effect on the generated text when different content selection strategies are integrated at the macroplanning stage. The experiments conducted show that, despite not using any sophisticated summarization method, the proposed approach provided the following benefits: (i) it generated a coherent, linguistically structured headline; (ii) it obtained results on standard datasets (i.e., DUC 2003 and DUC 2004) that were comparable to those of several competitive systems in terms of the content of the generated headline; and (iii) the headlines generated by the whole approach (PLM-HanaNLG) were preferred by human assessors over those generated by the best-performing system in DUC 2003.
