Conference: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation

Abstract

Inducing diversity in the task of paraphrasing is an important problem in NLP with applications in data augmentation and conversational agents. Previous paraphrasing approaches have mainly focused on generating semantically similar paraphrases, while paying little attention to diversity. In fact, most of the methods rely solely on top-k beam search sequences to obtain a set of paraphrases. The resulting set, however, contains many structurally similar sentences. In this work, we focus on the task of obtaining highly diverse paraphrases while not compromising on paraphrasing quality. We provide a novel formulation of the problem in terms of monotone submodular function maximization, specifically targeted towards the task of paraphrasing. Additionally, we demonstrate the effectiveness of our method for data augmentation on multiple tasks such as intent classification and paraphrase recognition. To drive further research, we have made the source code available.
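To illustrate the general idea of selecting a diverse subset by monotone submodular maximization, the following is a minimal sketch of the standard greedy procedure under a cardinality constraint. It is not the objective or implementation from the paper: the bigram-coverage objective and the names `ngrams`, `coverage`, and `greedy_select` are assumptions chosen only for illustration, and the candidate sentences stand in for the output of a beam search.

```python
from typing import Callable, List, Set, Tuple


def ngrams(sentence: str, n: int = 2) -> Set[Tuple[str, ...]]:
    """Return the set of word n-grams in a whitespace-tokenized sentence."""
    tokens = sentence.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def coverage(selected: List[str]) -> float:
    """Illustrative objective (an assumption, not the paper's): the number of
    distinct bigrams covered by the selected sentences. Set coverage is a
    monotone submodular function."""
    covered: Set[Tuple[str, ...]] = set()
    for sentence in selected:
        covered |= ngrams(sentence)
    return float(len(covered))


def greedy_select(
    candidates: List[str],
    k: int,
    objective: Callable[[List[str]], float] = coverage,
) -> List[str]:
    """Greedily pick up to k candidates, each time adding the one with the
    largest marginal gain in the objective."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        base = objective(selected)
        best = max(remaining, key=lambda c: objective(selected + [c]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical candidates, e.g. the top sequences returned by beam search.
candidates = [
    "how do i book a flight",
    "how do i book a ticket",
    "what is the way to reserve a flight",
]
picked = greedy_select(candidates, k=2)
```

For a monotone submodular objective with a cardinality constraint, this greedy procedure is known to achieve at least a (1 - 1/e) fraction of the optimal value, which is one reason such formulations are attractive for subset selection.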