International Conference on Computational Linguistics

A Learning-Exploring Method to Generate Diverse Paraphrases with Multi-Objective Deep Reinforcement Learning



Abstract

Paraphrase generation (PG) is of great importance to many downstream tasks in natural language processing. Diversity is essential to PG for enhancing the generalization capability and robustness of downstream applications. Recently, neural sequence-to-sequence (Seq2Seq) models have shown promising results in PG. However, traditional model training for PG optimizes model predictions against a single reference with a cross-entropy loss, an objective that cannot encourage the model to generate diverse paraphrases. In this work, we present a novel multi-objective learning approach to PG. We propose a learning-exploring method that generates sentences from the learned data distribution as learning objectives, and employ reinforcement learning to combine these new learning objectives for model training. We first design a sample-based algorithm to explore diverse sentences. Then we introduce several reward functions that evaluate the sampled sentences as learning signals in terms of expressive diversity and semantic fidelity, aiming to generate diverse and high-quality paraphrases. To effectively optimize model performance across these different evaluation aspects, we use a GradNorm-based algorithm that automatically balances the training objectives. Experiments and analyses on the Quora and Twitter datasets demonstrate that our proposed method not only achieves a significant increase in diversity but also improves generation quality over several state-of-the-art baselines.
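The abstract mentions a GradNorm-based algorithm for balancing the multiple training objectives (e.g., a cross-entropy term plus reward-derived terms for expressive diversity and semantic fidelity). Below is a minimal sketch of GradNorm-style weight balancing, not the authors' released code: PyTorch, the names `shared_params`, `initial_losses`, and the value of `alpha` are illustrative assumptions, and the per-objective losses are presumed to be scalars.

```python
import torch

def gradnorm_step(task_losses, weights, shared_params, initial_losses, alpha=1.5):
    """One GradNorm-style step.

    task_losses:    list of scalar losses, one per objective
                    (e.g., cross-entropy, diversity reward, fidelity reward)
    weights:        nn.Parameter of shape [num_tasks], the learnable task weights
    shared_params:  tensors of a layer shared by all objectives
    initial_losses: tensor of each objective's loss recorded at step 0
    """
    losses = torch.stack(task_losses)

    # Total loss for the model parameters; weights are detached here so that
    # the task weights themselves are updated only by the GradNorm loss below.
    total_loss = (weights.detach() * losses).sum()

    # Gradient norm of each *weighted* objective w.r.t. the shared parameters.
    weighted = weights * losses
    grad_norms = []
    for wl in weighted:
        grads = torch.autograd.grad(wl, shared_params,
                                    retain_graph=True, create_graph=True)
        grad_norms.append(torch.cat([g.flatten() for g in grads]).norm())
    grad_norms = torch.stack(grad_norms)

    with torch.no_grad():
        ratios = losses / initial_losses          # how far each loss has dropped
        r = ratios / ratios.mean()                # relative inverse training rate
        target = grad_norms.mean() * r.pow(alpha) # balanced gradient-norm targets

    # Drives each objective's gradient norm toward its target; its gradient
    # flows only into the task weights (via create_graph above).
    gradnorm_loss = (grad_norms - target).abs().sum()
    return total_loss, gradnorm_loss
```

In a typical training loop, one would backpropagate `total_loss + gradnorm_loss`, step separate optimizers for the model parameters and for `weights`, and then renormalize `weights` so they sum to the number of objectives, as in the original GradNorm formulation.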
