Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation



Abstract

Inducing diversity in the task of paraphrasing is an important problem in NLP with applications in data augmentation and conversational agents. Previous paraphrasing approaches have mainly focused on the issue of generating semantically similar paraphrases, while paying little attention towards diversity. In fact, most of the methods rely solely on top-k beam search sequences to obtain a set of paraphrases. The resulting set, however, contains many structurally similar sentences. In this work, we focus on the task of obtaining highly diverse paraphrases while not compromising on paraphrasing quality. We provide a novel formulation of the problem in terms of monotone submodular function maximization, specifically targeted towards the task of paraphrasing. Additionally, we demonstrate the effectiveness of our method for data augmentation on multiple tasks such as intent classification and paraphrase recognition. In order to drive further research, we have made the source code available.
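The abstract frames paraphrase selection as monotone submodular function maximization. The sketch below illustrates that general idea only; it is not the authors' objective or code. It assumes a hypothetical objective made of a word-overlap fidelity term plus a bigram-coverage diversity term, and selects k candidates greedily. For a monotone submodular objective under a cardinality constraint, greedy selection is known to achieve at least a (1 - 1/e) fraction of the optimal value. The names `fidelity`, `objective`, `greedy_select`, and the weight `lam` are illustrative assumptions.

```python
from typing import List, Set, Tuple


def bigrams(sentence: str) -> Set[Tuple[str, str]]:
    """Return the set of adjacent lowercase token pairs in a sentence."""
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


def fidelity(source: str, candidate: str) -> float:
    """Jaccard word overlap with the source (assumed stand-in for semantic similarity)."""
    a, b = set(source.lower().split()), set(candidate.lower().split())
    return len(a & b) / len(a | b) if (a | b) else 0.0


def objective(source: str, selected: List[str], lam: float = 0.5) -> float:
    """Hypothetical objective: weighted fidelity sum plus count of distinct bigrams covered.

    The fidelity sum is modular and the coverage count is monotone submodular,
    so their non-negative combination is monotone submodular.
    """
    fid = sum(fidelity(source, c) for c in selected)
    covered = set().union(*(bigrams(c) for c in selected))
    return lam * fid + (1.0 - lam) * len(covered)


def greedy_select(source: str, candidates: List[str], k: int, lam: float = 0.5) -> List[str]:
    """Pick up to k candidates, each time adding the one with the largest marginal gain."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        base = objective(source, selected, lam)
        best = max(
            remaining,
            key=lambda c: objective(source, selected + [c], lam) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    src = "how do i book a flight to paris"
    beam = [
        "how can i book a flight to paris",
        "how do i book a paris flight",
        "what is the way to reserve a plane ticket to paris",
    ]
    for sentence in greedy_select(src, beam, k=2):
        print(sentence)
```

In this sketch, a candidate that repeats bigrams already covered by the selected set earns a smaller marginal gain, which is how the diversity term discourages the structurally similar outputs that top-k beam search tends to produce.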


Bibliographic record

Source
Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies | 2019 | xciii p. 3498-4195 | 11 pages
Authors
Ashutosh Kumar; Satwik Bhattamishra; Manik Bhandari; Partha Talukdar
Original format: PDF
Chinese Library Classification: Programming, software engineering

