
Large margin methods for partner specific prediction of interfaces in protein complexes.



Abstract

The study of protein interfaces and binding sites is an important research area in bioinformatics. Information about the interfaces between proteins helps in understanding protein function and can be applied directly in drug design and protein engineering. However, determining protein interfaces experimentally is cumbersome and expensive, and in some cases not possible with today's technology. As a consequence, the computational prediction of protein interfaces from sequence and structure has emerged as a very active research area. A number of machine-learning-based techniques have been proposed for this problem, but the prediction accuracy of most of them is very low.

In this dissertation we present large-margin classification approaches designed to directly model different aspects of protein complex formation as well as the characteristics of the available data. Most existing machine learning techniques for this task are partner-independent, i.e., they ignore the fact that the propensity of a protein to bind another protein depends on the characteristics of residues in both proteins. We have developed a pairwise support vector machine classifier called PAIRpred that predicts protein interfaces in a partner-specific fashion. Because it models the problem in more detail, PAIRpred offers state-of-the-art accuracy in predicting both binding sites at the protein level and inter-protein residue contacts at the complex level. PAIRpred uses sequence and structure conservation, local structural similarity and surface geometry, residue solvent exposure, and template-based features derived from the unbound structures of the proteins forming a complex.

Using transductive and semi-supervised learning models, we have investigated the impact of explicitly modeling the inter-dependencies between residues that the overall structure of a protein imposes during complex formation. We also present a novel multiple instance learning scheme called MI-1 that explicitly models imprecision in sequence-level annotations of binding sites in proteins that bind calmodulin, achieving state-of-the-art prediction accuracy for this task.
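
To make the partner-specific idea concrete, the sketch below shows one common way to build a kernel over pairs of residues for a pairwise support vector machine: a symmetric tensor-product kernel on top of a base kernel between single residues. The abstract does not say which kernel PAIRpred uses, so the kernel form, the function names (`rbf`, `pairwise_kernel`), the `gamma` value, and the two-dimensional toy feature vectors are all assumptions made for illustration only.

```python
import math


def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) base kernel between two residue feature vectors.

    gamma is an illustrative value, not a setting taken from the dissertation.
    """
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)


def pairwise_kernel(pair_1, pair_2, base=rbf):
    """Symmetric tensor-product kernel between two residue pairs.

    Each pair holds one residue feature vector from each of the two
    proteins. The score depends on residues from both partners, which
    is what makes a classifier built on it partner-specific.
    """
    a, b = pair_1
    c, d = pair_2
    return base(a, c) * base(b, d) + base(a, d) * base(b, c)


# Hypothetical residue feature vectors (e.g. conservation, solvent exposure).
r1 = [0.1, 0.9]
r2 = [0.8, 0.2]
r3 = [0.2, 0.7]
r4 = [0.9, 0.1]

k_same = pairwise_kernel((r1, r2), (r1, r2))     # a pair compared with itself
k_swapped = pairwise_kernel((r1, r2), (r2, r1))  # same pair, order reversed
k_other = pairwise_kernel((r1, r2), (r3, r4))    # a different residue pair
```

Because the kernel sums over both ways of matching the residues, `k_same` and `k_swapped` are equal: the score does not depend on the order in which the two residues of a pair are listed. A kernel matrix computed this way over all candidate residue pairs could then be passed to any SVM solver that accepts a precomputed kernel.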

Bibliographic record

  • Author affiliation: Colorado State University
  • Degree-granting institution: Colorado State University
  • Subject: Computer Science; Biology Bioinformatics
  • Degree: Ph.D.
  • Year: 2014
  • Pages: 162 p.
  • Original format: PDF
  • Language: English
