
Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification

Abstract

Sentence simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement learning and memory augmentation. One of the main problems with applying generic Seq2Seq models to simplification is that these models tend to copy directly from the original sentence, resulting in outputs that are relatively long and complex. We aim to alleviate this issue through two main techniques. First, we incorporate content word complexities, as predicted with a leveled word complexity model, into our loss function during training. Second, we generate a large set of diverse candidate simplifications at test time, and rerank these to promote fluency, adequacy, and simplicity. Here, we measure simplicity through a novel sentence complexity model. These extensions allow our models to perform competitively with state-of-the-art systems while generating simpler sentences. We report standard automatic and human evaluation metrics.
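The complexity-weighted loss can be illustrated with a minimal sketch: weight each target token's negative log-likelihood by its predicted simplicity, so that simpler vocabulary contributes more to the training signal. The weighting function and the `alpha` scaling factor here are illustrative assumptions, not the paper's exact formulation.

```python
import math

def complexity_weighted_loss(token_probs, complexities, alpha=2.0):
    """Sketch of a complexity-weighted negative log-likelihood.

    token_probs  : model probability assigned to each gold target token
    complexities : per-token complexity score in [0, 1], as would come
                   from a leveled word complexity model (0 = simple)
    alpha        : hypothetical scaling factor (an assumption, not from
                   the paper)

    Simpler target words receive larger weights, nudging the model
    toward simple vocabulary during training.
    """
    weights = [1.0 + alpha * (1.0 - c) for c in complexities]
    nll = [-w * math.log(p) for w, p in zip(weights, token_probs)]
    return sum(nll) / len(nll)
```

For the same token probability, a simple word (complexity 0) is penalized more heavily than a complex one (complexity 1), so the optimizer gains more by getting simple words right.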

