Venue: Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation

Abstract

Inducing diversity in the task of paraphrasing is an important problem in NLP, with applications in data augmentation and conversational agents. Previous paraphrasing approaches have mainly focused on generating semantically similar paraphrases while paying little attention to diversity. In fact, most methods rely solely on top-k beam search sequences to obtain a set of paraphrases. The resulting set, however, contains many structurally similar sentences. In this work, we focus on the task of obtaining highly diverse paraphrases without compromising paraphrasing quality. We provide a novel formulation of the problem in terms of monotone submodular function maximization, specifically targeted towards the task of paraphrasing. Additionally, we demonstrate the effectiveness of our method for data augmentation on multiple tasks such as intent classification and paraphrase recognition. To drive further research, we have made the source code available.
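The abstract does not spell out the objective function, so the following is only a minimal sketch of the general idea of monotone submodular maximization applied to candidate selection, not the authors' method. It assumes a toy objective (the number of distinct word bigrams covered by the selected sentences) and uses the standard greedy procedure, which for a monotone submodular function under a cardinality constraint is known to reach at least a (1 - 1/e) fraction of the optimum. The names `bigrams`, `coverage`, and `greedy_select`, and the example sentences, are hypothetical.

```python
from typing import List, Set, Tuple


def bigrams(sentence: str) -> Set[Tuple[str, str]]:
    """Return the set of lowercase word bigrams in a sentence."""
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


def coverage(selected: List[str]) -> int:
    """Toy objective (an assumption, not the paper's): distinct bigrams covered.

    Set coverage is monotone (adding a sentence never lowers it) and
    submodular (a sentence contributes fewer new bigrams as the set grows).
    """
    covered: Set[Tuple[str, str]] = set()
    for s in selected:
        covered |= bigrams(s)
    return len(covered)


def greedy_select(candidates: List[str], k: int) -> List[str]:
    """Pick up to k candidates, each time taking the largest marginal gain."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        base = coverage(selected)
        gains = [coverage(selected + [c]) - base for c in remaining]
        best = max(range(len(remaining)), key=gains.__getitem__)
        if gains[best] <= 0:
            break  # no remaining candidate adds a new bigram
        selected.append(remaining.pop(best))
    return selected


# Hypothetical candidates, standing in for beam search outputs.
candidates = [
    "how do i reset my password",
    "how do i reset my password please",
    "what is the way to change my password",
    "how can i reset my password",
]
picked = greedy_select(candidates, k=2)
print(picked)
```

In this toy run the greedy step skips the near-duplicate first candidate in favor of sentences that add new bigrams, which is the kind of structural redundancy the abstract attributes to plain top-k beam search. A realistic objective would also need a term for semantic fidelity to the source sentence, which this sketch omits.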
