Journal of Cheminformatics

KekuleScope: prediction of cancer cell line sensitivity and compound potency using convolutional neural networks trained on compound images

Abstract

The application of convolutional neural networks (ConvNets) to high-content screening images or 2D compound representations is gaining increasing attention in drug discovery. However, existing applications often require large data sets for training or sophisticated pretraining schemes. Here, using 33 IC50 data sets from ChEMBL 23, we show that the in vitro activity of compounds on cancer cell lines and protein targets can be accurately predicted on a continuous scale from their Kekulé structure representations alone by extending existing architectures (AlexNet, DenseNet-201, ResNet152 and VGG-19) that were pretrained on unrelated image data sets. The predictive power of the resulting models, which require only standard 2D compound representations as input, is comparable to that of Random Forest (RF) models and fully-connected Deep Neural Networks trained on circular (Morgan) fingerprints. Notably, including additional fully-connected layers further increases the predictive power of the ConvNets by up to 10%. Simply averaging the outputs of the RF models and ConvNets yields significantly lower prediction errors for multiple data sets than either model alone, although the effect size is small, indicating that the features extracted by the convolutional layers of the ConvNets provide predictive signal complementary to Morgan fingerprints. Lastly, we show that multi-task ConvNets trained on compound images can model COX isoform selectivity on a continuous scale with prediction errors comparable to the uncertainty of the data. Overall, we present a set of ConvNet architectures that predict compound activity from Kekulé structure representations with state-of-the-art performance and require neither the generation of compound descriptors nor sophisticated image processing techniques.
The code needed to reproduce the results presented in this study and all the data sets are provided at .
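
The averaging step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`rmse`, `average_predictions`) and all numbers are hypothetical, chosen so that the two models' errors partly cancel. It only shows why an unweighted mean of two models' outputs can have a lower error than either model when their errors are not perfectly correlated.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def average_predictions(pred_a, pred_b):
    """Unweighted mean of two models' predictions, compound by compound."""
    if len(pred_a) != len(pred_b):
        raise ValueError("prediction lists must have the same length")
    return [(a + b) / 2.0 for a, b in zip(pred_a, pred_b)]


# Hypothetical pIC50 values for four compounds (made up for illustration).
y_true = [6.0, 7.0, 5.0, 8.0]
rf_pred = [6.4, 6.8, 5.6, 7.6]       # stand-in for Random Forest output
convnet_pred = [5.8, 7.6, 4.8, 8.2]  # stand-in for ConvNet output

ensemble_pred = average_predictions(rf_pred, convnet_pred)

print("RF RMSE:      ", round(rmse(y_true, rf_pred), 3))
print("ConvNet RMSE: ", round(rmse(y_true, convnet_pred), 3))
print("Ensemble RMSE:", round(rmse(y_true, ensemble_pred), 3))
```

On these toy values the averaged predictions have a lower RMSE than either input. With real models the gain depends on how correlated the errors are, which is consistent with the small effect size reported in the abstract.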
