Towards a computational approach for the identification of functional regulatory genetic polymorphism.

Abstract

Distinguishing functional, non-coding, regulatory single-nucleotide polymorphisms (SNPs) from among the millions of known non-functional variants is challenging. Efforts to develop functional sequence annotation such as the ENCODE project may enable development of more accurate prediction algorithms, yet it remains unclear which forms of sequence annotation will be most useful for identifying functional variants. Using results from three genome-wide expression trait mapping studies and 12 genome sequence annotation tracks, we assessed whether or not functional DNA elements are enriched for cis-acting SNPs. We found significant enrichment for most functional tracks tested, with greatest enrichment in CpG islands, RNA Pol II sites, DNase hypersensitivity sites, histone modification sites, and microRNA target sites. In contrast, enrichment was inconsistent when relying only on sequence conservation or transcription-factor binding-site conservation prediction tracks. These data suggest that functional sequence annotation may facilitate more efficient functional SNP selection for genetic association studies.

We took advantage of these findings and built a Random Forests (RF) model to predict functional regulatory SNPs (rSNPs) with 38 SNP attributes including 22 histone profiling tracks, distances to the transcription start and stop sites of the closest gene, genic region relative to the closest gene, and sequence features such as minor allele frequency (MAF) and transition/transversion mutation types. The RF model far outperforms individual attributes in identifying cis-acting SNPs in terms of sensitivity and specificity, and the significant SNPs predicted to be functional are about 4 times as likely to replicate across independent studies as those significant SNPs predicted to be non-functional. The data suggest that identification of functional variation using sequence data is feasible and can be a useful means of identifying candidate variants for disease association studies.
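
The enrichment question in the abstract (are cis-acting SNPs over-represented inside a given annotation track?) can be illustrated with a 2x2 contingency table. The sketch below is only an illustration: the abstract does not say which statistical test the thesis used, so the one-sided Fisher's exact test here is an assumption, and the function names and SNP counts are hypothetical.

```python
from math import comb


def odds_ratio(cis_in, other_in, cis_out, other_out):
    """Odds ratio for cis-acting SNPs falling inside an annotation track.

    cis_in / other_in:   cis-acting and other SNPs inside the track
    cis_out / other_out: cis-acting and other SNPs outside the track
    """
    if other_in == 0 or cis_out == 0:
        return float("inf")
    return (cis_in * other_out) / (other_in * cis_out)


def enrichment_p_value(cis_in, other_in, cis_out, other_out):
    """One-sided Fisher's exact test (assumed test, not taken from the thesis).

    Returns P(X >= cis_in), where X is the number of cis-acting SNPs that
    land inside the track under the hypergeometric null of no association.
    """
    n_track = cis_in + other_in          # all SNPs inside the track
    n_cis = cis_in + cis_out             # all cis-acting SNPs
    n_total = n_track + cis_out + other_out
    upper = min(n_track, n_cis)          # largest possible overlap
    tail = sum(
        comb(n_track, k) * comb(n_total - n_track, n_cis - k)
        for k in range(cis_in, upper + 1)
    )
    return tail / comb(n_total, n_cis)


if __name__ == "__main__":
    # Hypothetical counts for a single annotation track.
    counts = dict(cis_in=30, other_in=70, cis_out=120, other_out=780)
    print("odds ratio:", odds_ratio(**counts))
    print("one-sided p-value:", enrichment_p_value(**counts))
```

An odds ratio above 1 with a small p-value would indicate that the track is enriched for cis-acting SNPs. With 12 tracks tested, a multiple-testing correction would normally be applied on top of the per-track p-values.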

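Bibliographic record

  • Author: Xu, Mousheng
  • Author affiliation: Boston University
  • Degree-granting institution: Boston University
  • Subjects: Biology, Genetics; Engineering, Biomedical; Biology, Bioinformatics
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 80
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Genetics; Biomedical engineering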
