Journal: Database

Prot2HG: a database of protein domains mapped to the human genome

Abstract

Genetic variation occurring within conserved functional protein domains warrants special attention when examining DNA variation in the context of disease causation. Here we introduce a resource, freely available at www.prot2hg.com, that addresses the question of whether a particular variant falls onto an annotated protein domain and directly translates chromosomal coordinates onto protein residues. The tool can perform a multiple-site query in a simple way, and the whole dataset is available for download and is also incorporated into our own accessible pipeline. To create this resource, National Center for Biotechnology Information (NCBI) protein data were retrieved using the Entrez Programming Utilities. After processing all human protein domains, residue positions were reverse translated, mapped to the reference genome hg19 and stored in a MySQL database. In total, 760,487 protein domains from 42,371 protein models were mapped to hg19 coordinates and made publicly available for search or download (www.prot2hg.com). In addition, this annotation was implemented in the genomics research platform GENESIS in order to query nearly 8000 exomes and genomes of families with rare Mendelian disorders (tgp-foundation.org). When applied to patient genetic data, we found that rare (<1%) variants in the Genome Aggregation Database were significantly more often annotated onto a protein domain than common (>1%) variants. Similarly, variants described as pathogenic or likely pathogenic in ClinVar were more likely to be annotated onto a domain. In addition, we tested a dataset consisting of 60 causal variants in a cohort of patients with epileptic encephalopathy and found that 71% of them (43 variants) were propagated onto protein domains. In summary, we developed a resource that annotates variants in the coding part of the genome onto conserved protein domains in order to increase variant prioritization efficiency.

Database URL: www.prot2hg.com
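
The mapping of residue positions to genomic coordinates described in the abstract can be illustrated with a minimal Python sketch. It is not the Prot2HG implementation: the function names `residues_to_genomic` and `position_in_domain`, the 1-based inclusive coordinate convention, and the toy exon coordinates are all assumptions made for illustration. The sketch assumes the coding exons of one protein model are already known and walks them in transcription order, converting a domain's residue range into one or more genomic intervals.

```python
from typing import List, Tuple

# Assumption: 1-based, inclusive genomic coordinates (start, end).
Interval = Tuple[int, int]


def residues_to_genomic(cds_exons: List[Interval], strand: str,
                        first_residue: int, last_residue: int) -> List[Interval]:
    """Map a 1-based residue range onto genomic intervals.

    cds_exons holds only the coding part of each exon. A domain that
    spans an exon boundary yields more than one interval.
    """
    # Walk exons in transcription order (descending on the minus strand).
    ordered = sorted(cds_exons, reverse=(strand == "-"))
    # 0-based offsets of the domain's first and last nucleotide in the CDS.
    cds_start = (first_residue - 1) * 3
    cds_end = last_residue * 3 - 1

    intervals: List[Interval] = []
    consumed = 0  # coding bases contained in exons already walked
    for exon_start, exon_end in ordered:
        length = exon_end - exon_start + 1
        # Overlap of the domain with this exon, in CDS offsets.
        lo = max(cds_start, consumed)
        hi = min(cds_end, consumed + length - 1)
        if lo <= hi:
            if strand == "+":
                intervals.append((exon_start + (lo - consumed),
                                  exon_start + (hi - consumed)))
            else:
                # On the minus strand, CDS offsets grow as coordinates shrink.
                intervals.append((exon_end - (hi - consumed),
                                  exon_end - (lo - consumed)))
        consumed += length
    return sorted(intervals)


def position_in_domain(pos: int, intervals: List[Interval]) -> bool:
    """Return True if a genomic position lies inside any domain interval."""
    return any(start <= pos <= end for start, end in intervals)


# Toy example (made-up coordinates): two coding exons of 9 bases each,
# i.e. a 6-residue protein. Residues 3-4 straddle the exon boundary.
exons = [(101, 109), (201, 209)]
domain = residues_to_genomic(exons, "+", 3, 4)
print(domain)                           # [(107, 109), (201, 203)]
print(position_in_domain(108, domain))  # True
print(position_in_domain(150, domain))  # False (intronic position)
```

A multiple-site query, as mentioned in the abstract, would then amount to calling `position_in_domain` once per variant position. A production resource would more plausibly precompute the intervals and index them in a database, as the abstract says was done with MySQL.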
