Journal of Cheminformatics

Effects of multiple conformers per compound upon 3-D similarity search and bioassay data analysis

Abstract

Background: To improve the utility of PubChem, a public repository containing biological activities of small molecules, the PubChem3D project adds computationally-derived three-dimensional (3-D) descriptions to the small-molecule records contained in the PubChem Compound database and provides various search and analysis tools that exploit 3-D molecular similarity. Therefore, the efficient use of PubChem3D resources requires an understanding of the statistical and biological meaning of computed 3-D molecular similarity scores between molecules.

Results: The present study investigated effects of employing multiple conformers per compound upon the 3-D similarity scores between ten thousand randomly selected biologically-tested compounds (10-K set) and between non-inactive compounds in a given biological assay (156-K set). When the “best-conformer-pair” approach, in which a 3-D similarity score between two compounds is represented by the greatest similarity score among all possible conformer pairs arising from a compound pair, was employed with ten diverse conformers per compound, the average 3-D similarity scores for the 10-K set increased by 0.11, 0.09, 0.15, 0.16, 0.07, and 0.18 for STST-opt, CTST-opt, ComboTST-opt, STCT-opt, CTCT-opt, and ComboTCT-opt, respectively, relative to the corresponding averages computed using a single conformer per compound. Interestingly, the best-conformer-pair approach also increased the average 3-D similarity scores for the non-inactive–non-inactive (NN) pairs for a given assay, by comparable amounts to those for the random compound pairs, although some assays showed a pronounced increase in the per-assay NN-pair 3-D similarity scores, compared to the average increase for the random compound pairs.

Conclusion: These results suggest that the use of ten diverse conformers per compound in PubChem bioassay data analysis using 3-D molecular similarity is not expected to increase the separation of non-inactive from random and inactive spaces “on average”, although some assays show a noticeable separation between the non-inactive and random spaces when multiple conformers are used for each compound. The present study is a critical next step to understand effects of conformational diversity of the molecules upon the 3-D molecular similarity and its application to biological activity data analysis in PubChem. The results of this study may be helpful to build search and analysis tools that exploit 3-D molecular similarity between compounds archived in PubChem and other molecular libraries in a more efficient way.
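The best-conformer-pair approach described in the abstract can be illustrated with a minimal Python sketch. It is not the PubChem3D implementation: the function name `best_conformer_pair_score` and the `toy_score` function are hypothetical, and conformers are stood in for by plain numbers so the example runs on its own. In practice the scoring callable would be one of the 3-D similarity measures named in the abstract.

```python
from itertools import product
from typing import Callable, Sequence, TypeVar

Conformer = TypeVar("Conformer")


def best_conformer_pair_score(
    confs_a: Sequence[Conformer],
    confs_b: Sequence[Conformer],
    score: Callable[[Conformer, Conformer], float],
) -> float:
    """Return the greatest score over all conformer pairs of two compounds."""
    if not confs_a or not confs_b:
        raise ValueError("each compound needs at least one conformer")
    # Enumerate every (conformer of A, conformer of B) pair and keep the maximum.
    return max(score(a, b) for a, b in product(confs_a, confs_b))


def toy_score(a: float, b: float) -> float:
    """Hypothetical stand-in for a 3-D similarity score, bounded to [0, 1]."""
    return max(0.0, 1.0 - abs(a - b))


# Toy "conformers": one number per conformer (an assumption for illustration).
compound_a = [0.1, 0.5]
compound_b = [0.45, 0.9]

# Score using only the first conformer of each compound.
single = toy_score(compound_a[0], compound_b[0])
# Score using the best pair among all conformers.
best = best_conformer_pair_score(compound_a, compound_b, toy_score)
```

Because the maximum is taken over a set of pairs that includes the single-conformer pair, `best` can never be lower than `single`. This is consistent with the abstract's observation that average scores rise for random compound pairs as well as for NN pairs.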
