Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation

Abstract

Inducing diversity in the task of paraphrasing is an important problem in NLP with applications in data augmentation and conversational agents. Previous paraphrasing approaches have mainly focused on the issue of generating semantically similar paraphrases, while paying little attention to diversity. In fact, most of the methods rely solely on top-k beam search sequences to obtain a set of paraphrases. The resulting set, however, contains many structurally similar sentences. In this work, we focus on the task of obtaining highly diverse paraphrases while not compromising on paraphrasing quality. We provide a novel formulation of the problem in terms of monotone submodular function maximization, specifically targeted towards the task of paraphrasing. Additionally, we demonstrate the effectiveness of our method for data augmentation on multiple tasks such as intent classification and paraphrase recognition. In order to drive further research, we have made the source code available.
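
The abstract names monotone submodular function maximization but does not state the objective itself. As a rough illustration only, the sketch below greedily picks k sentences from a candidate pool so as to maximize the number of distinct word bigrams covered. Set coverage is a standard monotone submodular function, and greedy selection under a size limit is known to reach at least a (1 - 1/e) fraction of the optimum for such functions. The bigram-coverage objective, the function names, and the sample sentences are all assumptions made for this example; they are not the paper's formulation and ignore its paraphrase-quality term.

```python
from typing import List, Set, Tuple


def bigrams(sentence: str) -> Set[Tuple[str, str]]:
    """Return the set of adjacent lowercase word pairs in a sentence."""
    tokens = sentence.lower().split()
    return set(zip(tokens, tokens[1:]))


def coverage(selected: List[str]) -> int:
    """Hypothetical objective: number of distinct bigrams covered.

    Coverage functions of this kind are monotone and submodular.
    """
    covered: Set[Tuple[str, str]] = set()
    for sentence in selected:
        covered |= bigrams(sentence)
    return len(covered)


def greedy_select(candidates: List[str], k: int) -> List[str]:
    """Greedily pick up to k candidates by largest marginal coverage gain."""
    selected: List[str] = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        base = coverage(selected)
        best = max(remaining, key=lambda s: coverage(selected + [s]) - base)
        if coverage(selected + [best]) == base:
            break  # no remaining candidate adds anything new
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up candidate pool standing in for beam search outputs.
candidates = [
    "how do i book a flight",
    "how do i book a ticket",
    "what is the way to reserve a flight",
    "how can i book a flight",
]
picked = greedy_select(candidates, k=2)
```

With this pool the greedy pass skips near-duplicates of an already chosen sentence, because they add few new bigrams; a top-k list taken straight from beam search has no such penalty.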

Bibliographic record

Source
Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019, pp. 3609-3619 (11 pages)
Authors
Ashutosh Kumar; Satwik Bhattamishra; Manik Bhandari; Partha Talukdar
Original format: PDF

Similar literature

1. A multi-cascaded model with data augmentation for enhanced paraphrase detection in short texts [J]. Muhammad Haroon Shakeel, Asim Karim, Imdadullah Khan. Information Processing & Management, 2020, No. 3.
2. Robust optimization-based heuristic algorithm for the chance-constrained knapsack problem using submodularity [J]. Optimization Letters, 2020, No. 1.
3. Leveraging Neural Caption Translation with Visually Grounded Paraphrase Augmentation [J]. Johanes EFFENDI, Sakriani SAKTI, Katsuhito SUDOH, IEICE Transactions on Information and Systems, 2020, No. 3.
4. Submodular Optimization-based Diverse Paraphrasing and its Effectiveness in Data Augmentation [C]. Ashutosh Kumar, Satwik Bhattamishra, Manik Bhandari, Conference on the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019.
5. Contactless Smartphone Camera-Based Heart Rate Estimation from Facial Videos for Diverse Subject Skin Tones and Scenes Using Synthetic Augmentation [D]. Karinca, Kerim Doruk. 2021.
6. Choosing non-redundant representative subsets of protein sequence data sets using submodular optimization [O]. Maxwell W. Libbrecht, Jeffrey A. Bilmes, William Stafford Noble.
7. A multi-cascaded model with data augmentation for enhanced paraphrase detection in short texts [O]. Muhammad Haroon Shakeel, Asim Karim, Imdadullah Khan. 2020.
8. Submodularity Framework for Data Subset Selection [R]. Kirchhoff, K., Bilmes, J., Wei, K. 2013.
