Encode, Tag, Realize: High-Precision Text Editing


Abstract

We propose LASERTAGGER—a sequence tagging approach that casts text generation as a text editing task. Target texts are reconstructed from the inputs using three main edit operations: keeping a token, deleting it, and adding a phrase before the token. To predict the edit operations, we propose a novel model, which combines a BERT encoder with an autoregressive Transformer decoder. This approach is evaluated on English text on four tasks: sentence fusion, sentence splitting, abstractive summarization, and grammar correction. LASERTAGGER achieves new state-of-the-art results on three of these tasks, performs comparably to a set of strong seq2seq baselines with a large number of training examples, and outperforms them when the number of examples is limited. Furthermore, we show that at inference time tagging can be more than two orders of magnitude faster than comparable seq2seq models, making it more attractive for running in a live environment.
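The abstract's edit operations (keeping a token, deleting it, and adding a phrase before it) can be sketched with a small realization step. The tag names, the `(operation, added_phrase)` encoding, and the `realize` helper below are illustrative assumptions, not the authors' implementation; in LASERTAGGER the tags are predicted by the BERT-based model, whereas here they are written by hand for a sentence-fusion example.

```python
def realize(tokens, tags):
    """Apply (operation, added_phrase) edit tags to source tokens.

    For each token: first insert the optional phrase before it,
    then either keep the token ("KEEP") or drop it ("DELETE").
    """
    out = []
    for token, (op, phrase) in zip(tokens, tags):
        if phrase:              # phrase is added before the current token
            out.append(phrase)
        if op == "KEEP":        # "DELETE" simply drops the token
            out.append(token)
    return " ".join(out)

# Sentence fusion: two sentences merged into one with a connective.
tokens = ["He", "is", "tired", ".", "He", "goes", "home", "."]
tags = [
    ("KEEP", ""), ("KEEP", ""), ("KEEP", ""),
    ("DELETE", ""),            # drop the first period
    ("DELETE", ", so he"),     # insert connective, drop the second "He"
    ("KEEP", ""), ("KEEP", ""), ("KEEP", ""),
]
print(realize(tokens, tags))  # -> He is tired , so he goes home .
```

Because the output vocabulary is a small set of tags plus a restricted phrase inventory rather than the full target vocabulary, tagging of this kind can be decoded far faster than token-by-token seq2seq generation, which is the speed advantage the abstract reports.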
