Journal: Proteins: Structure, Function, and Genetics

Ensembling multiple raw coevolutionary features with deep residual neural networks for contact‐map prediction in CASP13

Abstract

We report the residue-residue contact prediction results, in the CASP13 experiment, of a new pipeline built purely on learning coevolutionary features. For a query sequence, the pipeline starts by collecting multiple sequence alignments (MSAs) from multiple genome and metagenome sequence databases using two complementary Hidden Markov Model (HMM)-based search tools. Three profile matrices, built on covariance, precision, and pseudolikelihood maximization, respectively, are then created from the MSAs and used as the input features of a deep residual convolutional neural network architecture for contact-map training and prediction. Two ensembling strategies were proposed to integrate the matrix features, through end-to-end training and stacking, resulting in two complementary programs called TripletRes and ResTriplet, respectively. For the 31 free-modeling domains that do not have homologous templates in the PDB, TripletRes and ResTriplet generated comparable results, with average accuracies of 0.640 and 0.646, respectively, for the top L/5 long-range predictions; in 71% and 74% of the cases, respectively, the accuracy is above 0.5. Detailed data analyses showed that the strength of the pipeline is due to the sensitive MSA construction and the advanced strategies for coevolutionary feature ensembling. Domain splitting was also found to help enhance contact prediction performance. Nevertheless, contact models for tail regions, which often involve a high number of alignment gaps, and for targets with few homologous sequences are still suboptimal. Developing new approaches in which the model is specifically trained on these regions and targets might help address these problems.
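
The abstract reports accuracy for the top L/5 long-range predictions. The following is a minimal sketch of how such a score is commonly computed, not the authors' evaluation code. It assumes conventions the abstract does not state: L is the sequence length, "long-range" means a sequence separation of at least 24 residues, and the native contacts are given as a boolean L x L matrix. The function name `top_l_over_k_precision` and its parameters are hypothetical.

```python
import numpy as np

def top_l_over_k_precision(pred, true_contacts, k=5, min_sep=24):
    """Fraction of true contacts among the top L/k scored long-range pairs.

    pred          : (L, L) array of predicted contact scores (higher = more likely)
    true_contacts : (L, L) boolean array of native contacts
    k             : divisor of L (k=5 gives the top L/5 predictions)
    min_sep       : assumed minimum sequence separation |i - j| for "long-range"
    """
    L = pred.shape[0]
    # Upper-triangle pairs (i, j) with j - i >= min_sep, each pair counted once.
    i, j = np.triu_indices(L, k=min_sep)
    if i.size == 0:
        return float("nan")  # sequence too short to have any long-range pair
    n_top = max(1, L // k)
    # Indices of the n_top highest-scoring long-range pairs.
    order = np.argsort(-pred[i, j], kind="stable")[:n_top]
    return float(np.mean(true_contacts[i[order], j[order]]))
```

Under these assumptions, a value above 0.5 for a target means that more than half of its top L/5 long-range predicted pairs are true contacts, which is the per-target threshold the abstract refers to.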
