BMC Bioinformatics

HPG-DHunter: an ultrafast friendly tool for DMR detection and visualization

Abstract

DNA methylation analysis by bisulphite sequencing requires a specific chemical treatment of the DNA that modifies its sequence, as well as dedicated software tools for analysing the result. Bisulphite treatment converts unmethylated cytosines (Cs) into uracils, which are read as thymines (Ts) after amplification and sequencing, while leaving methylated cytosines (5mCs) unchanged. By aligning bisulphite sequencing reads to the genomic DNA sequence and comparing them, it is possible to infer DNA methylation patterns at base-pair resolution ([1]). Hydroxymethylation can be profiled with other methods: Ten-eleven translocation (TET)-Assisted Bisulfite Sequencing (TAB-Seq) ([2, 3]), in which both methylated and unmethylated Cs are read as Ts while hydroxymethylated Cs (5hmCs) remain Cs, and oxidative bisulphite sequencing (oxBS-seq) ([4]). Different software tools have been proposed for DNA methylation analysis, such as RRBSMAP ([5]), the widely used Bismark ([6]), and the more recent HPG-Methyl ([7, 8]). These tools provide single-base information on the alignment and methylation status of each input sequence (or read). However, all of them write their results to files that are usually tens of gigabytes in size, in the Sequence Alignment/Map (SAM) format or its binary equivalent, the Binary Alignment/Map (BAM) format. Other tools, such as HPG-HMapper ([9, 10]), use the per-base, per-read methylation information in the BAM files to build a DNA methylation map, which gives the methylation level for each base of the reference genome. This map is written as one CSV file per chromosome of the species. Biomedical researchers then have to compare the methylation levels in these files at scales beyond base-pair resolution (DNA segments, CpG islands or coding regions, whole chromosomes, etc.), and also across different samples.
Thus, many different tools for processing and displaying methylation results and for discovering differentially methylated regions (DMRs) have been proposed ([11–16]). However, most of these tools are based on statistical techniques, which add a high computational workload when applied to huge BAM or SAM files. As a result, their execution times are long, introducing an excessive delay between the moment the sample leaves the DNA sequencer and the moment significant information reaches the doctor or biomedical researcher. The visualization of these data is also far from interactive. Another disadvantage is that many of these tools are essentially R scripts, which require programming skills from the user. Finally, as a comparative study shows ([17]), these methods are more accurate when identifying simulated DMRs that are long and have small within-group variation, but they show low concordance with one another, probably because of the different approaches they use to identify DMRs. Consequently, they yield very low concordance when applied to real data.
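
The base-level inference described in the abstract (a reference C that is still read as C counts as methylated, a reference C read as T counts as unmethylated) can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the method of HPG-Methyl, HPG-HMapper, or any other cited tool: reads are assumed to be forward-strand and already aligned without gaps, and the names `call_methylation` and `methylation_levels` are hypothetical.

```python
from collections import defaultdict


def call_methylation(reference, reads):
    """Count methylated and total calls at each reference cytosine.

    Assumptions (illustrative only): `reads` is an iterable of
    (start, sequence) pairs, each aligned to the forward strand of
    `reference` with no insertions or deletions.
    Returns {position: (methylated_calls, total_calls)}.
    """
    counts = defaultdict(lambda: [0, 0])
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            # Only reference cytosines carry methylation information here.
            if pos >= len(reference) or reference[pos] != "C":
                continue
            if base == "C":
                # C survived bisulphite conversion: counted as methylated.
                counts[pos][0] += 1
                counts[pos][1] += 1
            elif base == "T":
                # C was converted and read as T: counted as unmethylated.
                counts[pos][1] += 1
            # Any other base is treated as a mismatch and ignored.
    return {pos: (m, n) for pos, (m, n) in counts.items()}


def methylation_levels(counts):
    """Turn the counts into a per-base methylation level between 0 and 1."""
    return {pos: m / n for pos, (m, n) in sorted(counts.items()) if n > 0}


# Toy data: a short reference and three bisulphite-converted reads.
reference = "ACGTCGAC"
reads = [(0, "ACGTTGAC"), (0, "ATGTTGAC"), (3, "TCGAT")]

counts = call_methylation(reference, reads)
levels = methylation_levels(counts)
for pos, level in levels.items():
    print(pos, counts[pos], round(level, 3))
```

A per-base table of this kind is the sort of information a methylation map holds. Under TAB-Seq the same counting would apply, with a surviving C interpreted as 5hmC rather than 5mC.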
