Journal: Journal of Chemical Information and Modeling

Evolving Interpretable Structure-Activity Relationship Models. 2. Using Multiobjective Optimization to Derive Multiple Models


Abstract

A multiobjective evolutionary algorithm (MOEA) is described for evolving multiple structure-activity relationships (SARs). The SARs are encoded in easy-to-interpret reduced graph queries, which describe features that are preferentially present in active compounds compared to inactives. The MOEA addresses a limitation associated with many machine learning methods: the inherent tradeoff between recall and precision, which is usually handled by combining the two objectives into a single measure, with a consequent loss of control. By simultaneously optimizing recall and precision, the MOEA generates a family of SARs that lie on the precision-recall (PR) curve. The user is then able to select a query with an appropriate balance between the two objectives: for example, a low-recall, high-precision query may be preferred when establishing the SAR, whereas a high-recall, low-precision query may be more appropriate in a virtual screening context. Each query on the PR curve aims to capture the structure-activity information in a single representation, and each can be considered an alternative (equally valid) solution. We then investigate combining individual queries into teams, with the aim of capturing multiple SARs that may exist in a data set, as is commonly seen in high-throughput screening data sets. Team formation is carried out iteratively as a postprocessing step following the evolution of the individual queries. The inclusion of uniqueness as a third objective within the MOEA provides an effective way of ensuring that the queries are complementary in the active compounds they describe. Substantial improvements in both recall and precision are seen for some data sets. Furthermore, the resulting queries provide more detailed structure-activity information than is present in a single query.
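The precision-recall tradeoff and the benefit of complementary queries can be illustrated with a small sketch. This is not the MOEA from the paper: it assumes each reduced graph query has already been reduced to the set of compound IDs it matches, and it only shows how precision and recall are scored, how non-dominated queries are selected, and how a team of queries is scored on the union of its matches. All names and the toy data (`matches`, `actives`, `q_strict`, and so on) are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Score = Tuple[float, float]  # (precision, recall)


def precision_recall(matched: Set[str], actives: Set[str]) -> Score:
    """Precision and recall of one query, given the compounds it matches."""
    true_pos = len(matched & actives)
    precision = true_pos / len(matched) if matched else 0.0
    recall = true_pos / len(actives) if actives else 0.0
    return precision, recall


def dominates(a: Score, b: Score) -> bool:
    """a dominates b if it is no worse in both objectives and differs from b."""
    return a[0] >= b[0] and a[1] >= b[1] and a != b


def pareto_front(scores: Dict[str, Score]) -> List[str]:
    """Names of the queries that no other query dominates."""
    return sorted(
        name
        for name, score in scores.items()
        if not any(dominates(other, score) for other in scores.values())
    )


def team_precision_recall(
    team: Iterable[str], matches: Dict[str, Set[str]], actives: Set[str]
) -> Score:
    """Score a team of queries on the union of the compounds they match."""
    union: Set[str] = set().union(*(matches[q] for q in team))
    return precision_recall(union, actives)


# Toy data (hypothetical): four actives "a1".."a4" and inactives "i1".."i4".
actives = {"a1", "a2", "a3", "a4"}
matches = {
    "q_strict": {"a1", "a2"},                      # precise, low recall
    "q_other": {"a3", "a4"},                       # precise, covers other actives
    "q_mid": {"a1", "a2", "a3", "i1"},             # intermediate balance
    "q_broad": actives | {"i1", "i2", "i3", "i4"}, # full recall, low precision
    "q_poor": {"a1", "i1", "i2"},                  # worse on both objectives
}

scores = {name: precision_recall(m, actives) for name, m in matches.items()}
front = pareto_front(scores)
team_score = team_precision_recall(["q_strict", "q_other"], matches, actives)

print("non-dominated queries:", front)
print("team (q_strict + q_other):", team_score)
```

In this toy setting, two queries that each describe a different half of the actives score better as a team than either does alone, which is the kind of complementarity the uniqueness objective is meant to encourage.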
