Conference on Empirical Methods in Natural Language Processing

Automatic Poetry Generation with Mutual Reinforcement Learning



Abstract

Poetry is one of the most beautiful forms of human language art. As a crucial step towards computer creativity, automatic poetry generation has drawn researchers' attention for decades. In recent years, neural models have made remarkable progress on this task. However, they are all based on maximum likelihood estimation, which only learns the common patterns of the corpus and results in a loss-evaluation mismatch: human experts evaluate poetry by specific criteria, not by word-level likelihood. To handle this problem, we directly model those criteria and use them as explicit rewards that guide gradient updates through reinforcement learning, motivating the model to pursue higher scores. In addition, inspired by writing theories, we propose a novel mutual reinforcement learning scheme: we simultaneously train two learners (generators) that learn not only from the teacher (the rewarder) but also from each other, further improving performance. We experiment on Chinese poetry. Built on a strong base model, our method achieves better results and outperforms the current state-of-the-art method.
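
The abstract compresses two technical ideas: (1) a policy-gradient (REINFORCE-style) update, in which a learned rewarder scores sampled poems and the score, minus a baseline, weights the gradient of each sample's log-likelihood; and (2) mutual learning, in which two generators are trained jointly and each is additionally rewarded when its peer finds its samples likely. Below is a minimal Python/PyTorch sketch of that setup. It is illustrative only, not the authors' implementation: the GRU Generator, the random-placeholder rewarder, peer_likelihood, and mutual_weight are all hypothetical names and choices.

# Illustrative sketch only -- not the paper's code. The GRU generator, the
# random-placeholder rewarder, peer_likelihood, and mutual_weight are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, MAX_LEN = 5000, 128, 256, 20

class Generator(nn.Module):
    """Toy autoregressive poem generator (a GRU language model)."""
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, EMB)
        self.rnn = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def sample(self, batch=4):
        """Sample token sequences and keep their log-probabilities."""
        tok = torch.zeros(batch, 1, dtype=torch.long)      # assume id 0 = <bos>
        h, toks, logps = None, [], []
        for _ in range(MAX_LEN):
            out, h = self.rnn(self.emb(tok), h)
            dist = torch.distributions.Categorical(logits=self.out(out[:, -1]))
            nxt = dist.sample()                            # (batch,)
            logps.append(dist.log_prob(nxt))
            tok = nxt.unsqueeze(1)
            toks.append(tok)
        return torch.cat(toks, 1), torch.stack(logps, 1)   # (batch, L) each

def rewarder(poems):
    """Stand-in for the learned criteria-based rewarder (the "teacher") from
    the abstract; a real one would score the poetry criteria. Random here."""
    return torch.rand(poems.size(0))

def peer_likelihood(peer, poems):
    """How plausible the peer generator finds our samples, mapped into (0, 1]."""
    out, _ = peer.rnn(peer.emb(poems[:, :-1]))
    logits = peer.out(out)
    nll = F.cross_entropy(logits.reshape(-1, VOCAB),
                          poems[:, 1:].reshape(-1), reduction='none')
    return (-nll.view(poems.size(0), -1).mean(1)).exp()

def rl_step(gen, peer, opt, mutual_weight=0.5):
    """One REINFORCE update: reward = teacher score + peer agreement."""
    poems, logps = gen.sample()
    with torch.no_grad():
        reward = rewarder(poems) + mutual_weight * peer_likelihood(peer, poems)
    advantage = reward - reward.mean()                     # baseline for variance reduction
    loss = -(advantage.unsqueeze(1) * logps).mean()        # policy-gradient loss
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

g1, g2 = Generator(), Generator()
opt1 = torch.optim.Adam(g1.parameters(), lr=1e-4)
opt2 = torch.optim.Adam(g2.parameters(), lr=1e-4)
for step in range(3):                                      # the two learners train jointly
    rl_step(g1, g2, opt1)
    rl_step(g2, g1, opt2)

Subtracting the batch-mean reward as a baseline is a standard variance-reduction device for REINFORCE; a real rewarder would implement the poetry-specific evaluation criteria the abstract refers to rather than returning random scores.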
