
Grammar Variational Autoencoder



Abstract

Deep generative models have been wildly successful at learning coherent latent representations for continuous data such as natural images, artwork, and audio. However, generative modeling of discrete data such as arithmetic expressions and molecular structures still poses significant challenges. Crucially, state-of-the-art methods often produce outputs that are not valid. We make the key observation that frequently, discrete data can be represented as a parse tree from a context-free grammar. We propose a variational autoencoder which directly encodes from and decodes to these parse trees, ensuring the generated outputs are always syntactically valid. Surprisingly, we show that not only does our model more often generate valid outputs, it also learns a more coherent latent space in which nearby points decode to similar discrete outputs. We demonstrate the effectiveness of our learned models by showing their improved performance in Bayesian optimization for symbolic regression and molecule generation.
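The key mechanism the abstract describes is decoding to parse trees of a context-free grammar so that every output is syntactically valid. A minimal sketch of that idea: the decoder emits a sequence of production-rule choices, and a stack of unexpanded nonterminals constrains which rules are legal at each step. The toy arithmetic grammar, rule numbering, and function names below are illustrative assumptions, not the paper's exact setup.

```python
# Illustrative sketch (not the paper's implementation): a tiny CFG for
# arithmetic expressions, and a stack-based decoder in which only rules
# whose left-hand side matches the stack top are legal, so every
# completed derivation is syntactically valid by construction.

GRAMMAR = {
    # production id: (left-hand side, right-hand-side symbols)
    0: ("S", ["S", "+", "T"]),
    1: ("S", ["T"]),
    2: ("T", ["(", "S", ")"]),
    3: ("T", ["x"]),
    4: ("T", ["2"]),
}
NONTERMINALS = {"S", "T"}

def legal_productions(stack):
    """Productions the model is allowed to pick next; in the paper's
    setting this mask zeroes out the decoder's other logits."""
    return [pid for pid, (lhs, _) in GRAMMAR.items() if lhs == stack[-1]]

def decode(production_ids, start="S"):
    """Expand a production sequence leftmost-first into a string."""
    stack, out, rules = [start], [], iter(production_ids)
    while stack:
        sym = stack.pop()
        if sym not in NONTERMINALS:
            out.append(sym)          # terminal: emit directly
            continue
        pid = next(rules)
        lhs, rhs = GRAMMAR[pid]
        if lhs != sym:
            raise ValueError("illegal production for current stack top")
        stack.extend(reversed(rhs))  # leftmost symbol expanded first
    return "".join(out)
```

For example, the rule sequence `[0, 1, 3, 4]` derives `S -> S+T -> T+T -> x+T -> x+2`, yielding the string `x+2`; any sequence that respects the mask produces a well-formed expression.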

