Journal: BMC Bioinformatics

Predicting protein inter-residue contacts using composite likelihood maximization and deep learning


Abstract

Accurate prediction of a protein's inter-residue contacts is important for calculating its tertiary structure. Analysis of co-evolutionary events among residues has proven effective in inferring inter-residue contacts. The Markov random field (MRF) technique, although widely used for contact prediction, faces a dilemma: the actual likelihood function of an MRF is accurate but time-consuming to calculate, whereas approximations to it, such as pseudo-likelihood, are efficient to calculate but inaccurate. Achieving accuracy and efficiency simultaneously therefore remains a challenge. In this study, we present an approach, called clmDCA, that addresses this challenge for contact prediction. Unlike plmDCA, which uses pseudo-likelihood, i.e., the product of the conditional probabilities of individual residues, our approach uses composite likelihood, i.e., the product of the conditional probabilities of all residue pairs. Composite likelihood has been theoretically proven to be a better approximation to the actual likelihood function than pseudo-likelihood. Meanwhile, composite likelihood is still efficient to maximize, which ensures the efficiency of clmDCA. We present comprehensive experiments on popular benchmark datasets, including the PSICOV and CASP-11 datasets, showing that: i) clmDCA alone outperforms the existing MRF-based approaches in prediction accuracy; ii) when equipped with a deep learning technique for refinement, the prediction accuracy of clmDCA improves significantly further, suggesting that clmDCA is well suited to subsequent refinement procedures. We further present a successful application of the predicted contacts to accurately build tertiary structures for proteins in the PSICOV dataset. The composite likelihood maximization algorithm can efficiently estimate the parameters of Markov random fields and can improve the prediction accuracy of protein inter-residue contacts.
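To make the distinction between the three objectives concrete, the following is a minimal sketch on a toy Potts-style MRF. It is not the clmDCA implementation. The parameter names (`h`, `J`, `q`), the toy values, and all function names are illustrative assumptions, and the pairwise conditional is taken to mean the probability of a residue pair given all remaining positions. The exact likelihood normalizes over all q^L sequences, each pseudo-likelihood factor over q states, and each composite-likelihood factor over q^2 states.

```python
import itertools
import math

def score(x, h, J):
    """Unnormalized log-probability of sequence x (fields plus pair couplings)."""
    L = len(x)
    s = sum(h[i][x[i]] for i in range(L))
    s += sum(J[(i, j)][x[i]][x[j]] for i in range(L) for j in range(i + 1, L))
    return s

def log_conditional(x, free, h, J, q):
    """log P(x[free] | all other positions): normalize over `free` positions only."""
    log_terms = []
    for values in itertools.product(range(q), repeat=len(free)):
        y = list(x)
        for pos, v in zip(free, values):
            y[pos] = v
        log_terms.append(score(y, h, J))
    m = max(log_terms)  # log-sum-exp for numerical stability
    log_z = m + math.log(sum(math.exp(t - m) for t in log_terms))
    return score(x, h, J) - log_z

def log_likelihood(x, h, J, q):
    """Exact log-likelihood: normalizes over all q**L sequences."""
    return log_conditional(x, tuple(range(len(x))), h, J, q)

def log_pseudo_likelihood(x, h, J, q):
    """Sum over single positions i of log P(x_i | rest)."""
    return sum(log_conditional(x, (i,), h, J, q) for i in range(len(x)))

def log_composite_likelihood(x, h, J, q):
    """Sum over position pairs i < j of log P(x_i, x_j | rest)."""
    L = len(x)
    return sum(log_conditional(x, (i, j), h, J, q)
               for i in range(L) for j in range(i + 1, L))

# Hypothetical toy model: 3 positions, 2 states per position.
q = 2
h = [[0.0, 0.5], [0.2, -0.1], [0.0, 0.3]]
J = {
    (0, 1): [[0.4, -0.4], [-0.4, 0.4]],
    (0, 2): [[0.0, 0.1], [0.1, 0.0]],
    (1, 2): [[-0.3, 0.3], [0.3, -0.3]],
}

if __name__ == "__main__":
    x = (1, 0, 1)
    print("exact    :", log_likelihood(x, h, J, q))
    print("pseudo   :", log_pseudo_likelihood(x, h, J, q))
    print("composite:", log_composite_likelihood(x, h, J, q))
```

In this sketch the composite objective sums over L(L-1)/2 pairs, so its raw value is not on the same scale as the exact log-likelihood; the comparison of interest is how the parameter estimates obtained by maximizing each objective behave, which this toy example does not attempt to show.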
