Journal: Biostatistics

Redefining CpG islands using hidden Markov models.


Abstract

The DNA of most vertebrates is depleted in CpG dinucleotides: a C followed by a G in the 5' to 3' direction. CpGs are the target of DNA methylation, a chemical modification of cytosine (C) that is heritable during cell division and is the best-characterized epigenetic mechanism. The remaining CpGs tend to cluster in regions referred to as CpG islands (CGIs). Knowing CGI locations is important because they mark functionally relevant epigenetic loci in development and disease. For various mammals, including human, a widely used list of CGIs is readily available from the UCSC Genome Browser. This list was derived using algorithms that search for regions satisfying a definition of CGI proposed by Gardiner-Garden and Frommer more than 20 years ago. Recent findings, enabled by advances in technology that permit direct measurement of epigenetic endpoints at a whole-genome scale, motivate the need to adapt the current CGI definition. In this paper, we propose a procedure, guided by hidden Markov models, that permits an extensible approach to detecting CGIs. The main advantage of our approach over others is that it summarizes the evidence for CGI status as probability scores. This provides flexibility in the definition of a CGI and facilitates the creation of CGI lists for other species. The utility of this approach is demonstrated by generating the first CGI lists for invertebrates and by creating CGI lists that substantially increase overlap with recently discovered epigenetic marks. A CGI list and the probability scores, as a function of genome location, are available for each species at http://www.rafalab.org.
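To make the idea of "probability scores" for CGI status concrete, here is a minimal sketch of a two-state hidden Markov model (background vs. island) scored with the forward-backward algorithm. It is not the procedure proposed in the paper: the per-position CpG indicator, the function names `cpg_indicator` and `island_posteriors`, and every parameter value are assumptions chosen purely for illustration.

```python
def cpg_indicator(seq):
    """Return 1 where a C is immediately followed by a G, else 0."""
    s = seq.upper()
    return [1 if s[i:i + 2] == "CG" else 0 for i in range(len(s) - 1)]


def island_posteriors(obs, p_cpg=(0.01, 0.15), stay=(0.99, 0.95), start=(0.9, 0.1)):
    """Posterior P(state = island | all observations), scaled forward-backward.

    State 0 is background, state 1 is island. All parameter values are
    illustrative assumptions, not estimates from any genome.
    """
    n = len(obs)
    if n == 0:
        return []
    states = (0, 1)
    trans = [[stay[0], 1 - stay[0]], [1 - stay[1], stay[1]]]

    def emit(state, x):
        return p_cpg[state] if x == 1 else 1 - p_cpg[state]

    # Forward pass, rescaling at each step so values do not underflow.
    alpha, scale = [], []
    for t in range(n):
        if t == 0:
            raw = [start[k] * emit(k, obs[0]) for k in states]
        else:
            raw = [
                sum(alpha[-1][j] * trans[j][k] for j in states) * emit(k, obs[t])
                for k in states
            ]
        c = sum(raw)
        scale.append(c)
        alpha.append([v / c for v in raw])

    # Backward pass, reusing the forward scaling constants.
    beta = [[1.0, 1.0] for _ in range(n)]
    for t in range(n - 2, -1, -1):
        beta[t] = [
            sum(trans[k][j] * emit(j, obs[t + 1]) * beta[t + 1][j] for j in states)
            / scale[t + 1]
            for k in states
        ]

    # With shared scaling, alpha * beta is already the normalised posterior.
    return [alpha[t][1] * beta[t][1] for t in range(n)]


# Toy sequence: CpG-poor flanks around a CpG-rich core.
seq = "AT" * 30 + "CG" * 20 + "AT" * 30
post = island_posteriors(cpg_indicator(seq))
```

Thresholding `post` at different cutoffs would give stricter or looser island calls, which is the kind of flexibility the abstract attributes to probability scores.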
