International Conference on Artificial Neural Networks

Improving Deep Generative Models with Randomized SMILES



Abstract

A Recurrent Neural Network (RNN) trained on a set of molecules represented as SMILES strings can generate millions of distinct, valid, and meaningful chemical structures. In most reported architectures, models have been trained on a canonical SMILES representation (unique for each molecule). This research shows instead that when randomized SMILES are used as a data amplification technique, a model generates more molecules, and those molecules accurately reflect the properties of the training set. To demonstrate this, an extensive benchmark study was conducted, building on a recently published article showing that models trained with molecules from the GDB-13 database (975 million molecules) achieve better overall chemical space coverage when the posterior probability distribution is as uniform as possible. Specifically, we created models that generate nearly all of the GDB-13 chemical space using only 1 million molecules as a training set. Finally, models trained on smaller training sets show substantial improvement when randomized SMILES are used instead of canonical ones.
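The data amplification idea rests on the fact that one molecule maps to many equivalent SMILES strings, depending on which atom a traversal starts from and in what order neighbors are visited. A minimal pure-Python sketch of this is shown below; it handles only acyclic, single-bonded toy molecules, whereas real pipelines use a cheminformatics toolkit such as RDKit (whose `MolToSmiles` accepts a `doRandom` flag). The `atoms`/`bonds` encoding of ethanol here is purely illustrative.

```python
import random

def randomized_smiles(atoms, bonds, rng=random):
    """Emit one randomized SMILES string for an acyclic, single-bonded molecule.

    atoms: list of element symbols, e.g. ["C", "C", "O"] for ethanol.
    bonds: list of (i, j) atom-index pairs.
    The starting atom and the neighbor visiting order are random, so repeated
    calls yield different but chemically equivalent SMILES strings.
    """
    adj = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)

    def dfs(node, parent):
        # Visit neighbors (except the atom we came from) in random order.
        order = rng.sample(adj[node], len(adj[node]))
        branches = [dfs(n, node) for n in order if n != parent]
        # Every branch except the last is wrapped in parentheses.
        tail = branches[-1] if branches else ""
        return atoms[node] + "".join(f"({b})" for b in branches[:-1]) + tail

    return dfs(rng.randrange(len(atoms)), None)

# Ethanol (C-C-O): sampling repeatedly surfaces several equivalent strings.
rng = random.Random(0)
variants = {randomized_smiles(["C", "C", "O"], [(0, 1), (1, 2)], rng)
            for _ in range(200)}
print(sorted(variants))  # a subset of {'C(C)O', 'C(O)C', 'CCO', 'OCC'}
```

Each variant is a valid SMILES for the same molecule, which is exactly why training on randomized SMILES multiplies the effective size of the training set without adding new molecules.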


