A geometric framework for robust nearest neighbor analysis of protein structure and function.

Abstract

A protein is a long chain of amino acids, also called residues. In solution the protein chain folds into a compact three-dimensional shape that determines the protein's function. Nearest-neighbor analysis of protein structures, represented as one point per residue, identifies pairs, triples and quadruples of residues that interact or pack together. Such analysis has been used to score protein packing and interactions; to detect repeating elements of protein structure; to compare two protein structures; and to find packing patterns in families of proteins that are related to their function. However, point coordinates of protein structures are determined experimentally and are thus imprecise. I explore whether nearest-neighbor analysis done on precise points still applies for imprecise points.

My dissertation introduces two new geometric techniques for robust neighbor analysis, almost-Delaunay simplices and Delaunay probability, that capture imprecision in the input points, and demonstrates several applications in the analysis of protein structure. The almost-Delaunay simplices quantify possible changes in the nearest neighbors, given the maximum motion allowed for any point. For 3D points, they define new sets of neighboring pairs, triples and quadruples that may arise, called the almost-Delaunay edges, triangles and tetrahedra. The Delaunay probability estimates the probability that a set of points really are nearest neighbors, given the expected amplitude of random motion for all points. These techniques establish a framework in which existing applications of nearest-neighbor analysis can often be adapted to make them more robust for imprecise points, and entirely new applications can be designed that were not possible previously.

Using the almost-Delaunay tetrahedra, I observe that the nearest neighbors in a protein structure are more stable than in other protein-like structures, such as artificially folded decoys and structures predicted from protein sequences. I adapt a statistical score for protein packing to use my geometric framework, and show that the score is robust when used to distinguish well-packed proteins from decoys, and that the framework may make it more robust when analyzing the packing at each residue. I identify packing signatures for repeating elements of protein structure, particularly for alpha-helices, and detect these elements with high accuracy. Finally, I use changes in the neighboring residues between two or more snapshots of a protein undergoing motion to identify flexible residues.

Using the almost-Delaunay edges, I derive a sparse and robust graph representation of protein structure to support mining frequent substructures from protein families. From these I identify fingerprints, specific substructures that characterize protein families, and use them to infer the function of protein structures with unknown function.
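
The abstract does not say how the Delaunay probability is computed. As a rough illustration of the idea only, the sketch below estimates it by plain Monte Carlo sampling: each point is perturbed with isotropic Gaussian noise, and the fraction of trials in which a chosen quadruple still has an empty circumsphere (the Delaunay criterion in 3D) is reported. The function names, the Gaussian noise model, and the `sigma`, `trials` and `seed` parameters are assumptions made for this example and are not taken from the dissertation.

```python
import random


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def circumsphere(p0, p1, p2, p3):
    """Return (center, squared radius) of the sphere through four 3D points,
    or None when the points are (nearly) coplanar."""
    # Work relative to p0: the center offset y satisfies row . y = |row|^2 / 2.
    rows = [[q[k] - p0[k] for k in range(3)] for q in (p1, p2, p3)]
    rhs = [0.5 * sum(x * x for x in r) for r in rows]
    d = det3(rows)
    if abs(d) < 1e-12:
        return None
    y = []
    for col in range(3):  # Cramer's rule, one column at a time
        m = [r[:] for r in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        y.append(det3(m) / d)
    center = [p0[k] + y[k] for k in range(3)]
    return center, sum(v * v for v in y)


def is_delaunay_tetra(points, tetra):
    """True if no other point lies strictly inside the circumsphere of `tetra`."""
    sphere = circumsphere(*(points[i] for i in tetra))
    if sphere is None:
        return False
    center, r2 = sphere
    for idx, p in enumerate(points):
        if idx in tetra:
            continue
        if sum((p[k] - center[k]) ** 2 for k in range(3)) < r2:
            return False
    return True


def delaunay_probability(points, tetra, sigma, trials=1000, seed=0):
    """Monte Carlo estimate: fraction of noisy copies of `points` in which
    `tetra` remains a Delaunay tetrahedron (assumed Gaussian noise model)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        noisy = [[x + rng.gauss(0.0, sigma) for x in p] for p in points]
        hits += is_delaunay_tetra(noisy, tetra)
    return hits / trials


if __name__ == "__main__":
    # Toy coordinates, not protein data: four corner points plus one distant point.
    pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
           (0.0, 0.0, 1.0), (10.0, 10.0, 10.0)]
    print(delaunay_probability(pts, (0, 1, 2, 3), sigma=0.05))
```

The brute-force emptiness check costs time linear in the number of points per trial, which is adequate for a handful of points but not for whole protein structures; a practical version would presumably rely on a proper Delaunay triangulation library.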

Bibliographic record

  • Author: Bandyopadhyay, Deepak
  • Author affiliation: The University of North Carolina at Chapel Hill
  • Degree-granting institution: The University of North Carolina at Chapel Hill
  • Subjects: Biology, Molecular; Computer Science
  • Degree: Ph.D.
  • Year: 2006
  • Pages: 365
  • Original format: PDF
  • Language of text: English
