Published in: The American Journal of Human Genetics

Comprehensive Analysis of Constraint on the Spatial Distribution of Missense Variants in Human Protein Structures

Abstract

The spatial distribution of genetic variation within proteins is shaped by evolutionary constraint and provides insight into the functional importance of protein regions and the potential pathogenicity of protein alterations. Here, we comprehensively evaluate the 3D spatial patterns of human germline and somatic variation in 6,604 experimentally derived protein structures and 33,144 computationally derived homology models covering 77% of all human proteins. Using a systematic approach, we quantify differences in the spatial distributions of neutral germline variants, disease-causing germline variants, and recurrent somatic variants. Neutral missense variants exhibit a general trend toward spatial dispersion, which is driven by constraint on core residues. In contrast, germline disease-causing variants are generally clustered in protein structures and form clusters more frequently than recurrent somatic variants identified from tumor sequencing. In total, we identify 215 proteins with significant spatial constraints on the distribution of disease-causing missense variants in experimentally derived protein structures, only 65 (30%) of which have been previously reported. This analysis identifies many clusters not detectable from sequence information alone; only 12% of proteins with significant clustering in 3D were identified from similar analyses of linear protein sequence. Furthermore, spatial analyses of mutations in homology-based structural models are highly correlated with those from experimentally derived structures, supporting the use of computationally derived models. Our approach highlights significant differences in the spatial constraints on different classes of mutations in protein structure and identifies regions of potential function within individual proteins.
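
The abstract does not say which statistic the authors use to call a set of variants "clustered" or "dispersed". As a rough illustration of the general idea only, the sketch below runs a simple permutation test: it compares the mean pairwise 3D distance among variant residues with the same quantity for randomly chosen residue sets of equal size. The function names, the choice of mean pairwise distance, and the toy grid of coordinates are all assumptions made for this example and are not taken from the paper.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords, indices):
    """Average Euclidean distance over all pairs of the selected residues."""
    pairs = list(itertools.combinations(indices, 2))
    return sum(math.dist(coords[i], coords[j]) for i, j in pairs) / len(pairs)


def spatial_permutation_test(coords, variant_idx, n_perm=2000, seed=0):
    """Hypothetical test: is a variant set unusually compact or spread out?

    coords      -- list of (x, y, z) tuples, one per residue
    variant_idx -- indices of residues carrying a variant (at least two)
    Returns (observed, p_clustered, p_dispersed), where the p-values are
    one-sided empirical estimates with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = mean_pairwise_distance(coords, variant_idx)
    k = len(variant_idx)
    residues = range(len(coords))
    # Null distribution: same number of residues drawn uniformly at random.
    null = [
        mean_pairwise_distance(coords, rng.sample(residues, k))
        for _ in range(n_perm)
    ]
    p_clustered = (1 + sum(d <= observed for d in null)) / (n_perm + 1)
    p_dispersed = (1 + sum(d >= observed for d in null)) / (n_perm + 1)
    return observed, p_clustered, p_dispersed


# Toy "protein": 100 residues on a 5 x 5 x 4 grid (not real structure data).
toy_coords = [(x, y, z) for x in range(5) for y in range(5) for z in range(4)]
# Six neighbouring residues stand in for a cluster of variants.
toy_variants = [0, 1, 2, 3, 4, 5]

observed, p_clustered, p_dispersed = spatial_permutation_test(
    toy_coords, toy_variants
)
print(f"mean pairwise distance: {observed:.2f}")
print(f"p(clustered) = {p_clustered:.4f}, p(dispersed) = {p_dispersed:.4f}")
```

In a real analysis the coordinates would come from an experimental structure or homology model, and the null model would need to account for factors such as which residues can be observed to vary. A small p(clustered) would correspond to the clustering pattern the abstract reports for disease-causing variants, and a small p(dispersed) to the dispersion reported for neutral variants.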
