Neural semi-Markov CRF for Monolingual Word Alignment

Abstract

Monolingual word alignment is important for studying fine-grained editing operations (i.e., deletion, addition, and substitution) in text-to-text generation tasks, such as paraphrase generation, text simplification, and neutralizing biased language. In this paper, we present a novel neural semi-Markov CRF alignment model, which unifies word and phrase alignments through variable-length spans. We also create a new benchmark with human annotations that cover four different text genres to evaluate monolingual word alignment models in more realistic settings. Experimental results show that our proposed model outperforms all previous approaches for monolingual word alignment as well as a competitive QA-based baseline, which was previously only applied to bilingual data. Our model demonstrates good generalizability to three out-of-domain datasets and shows great utility in two downstream applications: automatic text simplification and sentence pair classification tasks.