Journal of Translational Medicine

In silico prediction of novel therapeutic targets using gene–disease association data

Abstract

Background: Target identification and validation is a pressing challenge in the pharmaceutical industry, with many of the programmes that fail for efficacy reasons showing poor association between the drug target and the disease. Computational prediction of successful targets could have a considerable impact on attrition rates in the drug discovery pipeline by significantly reducing the initial search space. Here, we explore whether gene–disease association data from the Open Targets platform is sufficient to predict therapeutic targets that are actively being pursued by pharmaceutical companies or are already on the market.

Methods: To test our hypothesis, we train four different classifiers (a random forest, a support vector machine, a neural network and a gradient boosting machine) on partially labelled data and evaluate their performance using nested cross-validation and testing on an independent set. We then select the best performing model and use it to make predictions on more than 15,000 genes. Finally, we validate our predictions by mining the scientific literature for proposed therapeutic targets.

Results: We observe that the data types with the best predictive power are animal models showing a disease-relevant phenotype, differential expression in diseased tissue and genetic association with the disease under investigation. On a test set, the neural network classifier achieves over 71% accuracy with an AUC of 0.76 when predicting therapeutic targets in a semi-supervised learning setting. We use this model to gain insights into current and failed programmes and to predict 1431 novel targets, of which a highly significant proportion has been independently proposed in the literature.

Conclusions: Our in silico approach shows that data linking genes and diseases is sufficient to predict novel therapeutic targets effectively and confirms that this type of evidence is essential for formulating or strengthening hypotheses in the target discovery process. Ultimately, more rapid and automated target prioritisation holds the potential to reduce both the costs and the development times associated with bringing new medicines to patients.
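
The abstract mentions nested cross-validation without spelling out the procedure. As a rough illustration only, the sketch below shows the general idea in plain Python: an inner loop chooses a hyperparameter using only the outer-training data, and the outer loop scores that choice on held-out folds. Everything here is an assumption made for illustration: the k-nearest-neighbour classifier is a simple stand-in (it is not one of the four classifiers named in the abstract), the data are synthetic, and the function names `knn_predict`, `accuracy`, `folds` and `nested_cv` are hypothetical. The semi-supervised aspect of the study is not modelled.

```python
import random

def knn_predict(train_X, train_y, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > k else 0

def accuracy(train_idx, test_idx, X, y, k):
    """Fraction of test points classified correctly by k-NN built on train_idx."""
    tX = [X[i] for i in train_idx]
    ty = [y[i] for i in train_idx]
    hits = sum(knn_predict(tX, ty, X[i], k) == y[i] for i in test_idx)
    return hits / len(test_idx)

def folds(indices, n):
    """Yield (train, test) index lists for n-fold cross-validation."""
    for f in range(n):
        test = indices[f::n]
        held = set(test)
        yield [i for i in indices if i not in held], test

def nested_cv(X, y, k_grid, outer=5, inner=3, seed=0):
    """Return the mean outer-fold accuracy, selecting k inside each outer fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    outer_scores = []
    for train, test in folds(idx, outer):
        # Inner loop: pick k using only the outer-training portion.
        def inner_mean(k):
            return sum(accuracy(tr, va, X, y, k)
                       for tr, va in folds(train, inner)) / inner
        best_k = max(k_grid, key=inner_mean)
        # Outer loop: score the chosen k on data unseen during selection.
        outer_scores.append(accuracy(train, test, X, y, best_k))
    return sum(outer_scores) / len(outer_scores)

# Synthetic toy data (assumption): two Gaussian blobs standing in for
# "non-target" (label 0) and "target" (label 1) genes with two features each.
rng = random.Random(42)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (0.0, 2.0) for _ in range(40)]
y = [0] * 40 + [1] * 40

estimate = nested_cv(X, y, k_grid=[1, 3, 5, 7])
print(f"nested CV accuracy estimate: {estimate:.2f}")
```

Keeping hyperparameter selection inside each outer fold means the reported estimate is not inflated by tuning on the same data used for scoring, which is presumably why the abstract pairs nested cross-validation with a separate independent test set.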
