Journal: Bioinformatics

FDM: a graph-based statistical method to detect differential transcription using RNA-seq data



Abstract

Motivation: In eukaryotic cells, alternative splicing expands the diversity of RNA transcripts, plays an important role in tissue-specific differentiation, and can be misregulated in disease. To understand these processes, there is a great need for methods to detect differential transcription between samples. Our focus is on samples observed using short-read RNA sequencing (RNA-seq).

Methods: We characterize differential transcription between two samples as the difference in the relative abundance of the transcript isoforms present in the samples. The magnitude of differential transcription of a gene between two samples can be measured by the square root of the Jensen-Shannon divergence (JSD*) between the gene's transcript abundance vectors in each sample. We define a weighted splice-graph representation of RNA-seq data, summarizing in compact form the alignment of RNA-seq reads to a reference genome. The flow difference metric (FDM) identifies regions of differential RNA transcript expression between pairs of splice graphs, without the need for an underlying gene model or catalog of transcripts. We present a novel non-parametric statistical test between splice graphs to assess the significance of differential transcription, and extend it to group-wise comparison incorporating sample replicates.

Results: Using simulated RNA-seq data consisting of four technical replicates of two samples with varying transcription between genes, we show that (i) the FDM is highly correlated with JSD* (r = 0.82) when average RNA-seq coverage of the transcripts is sufficiently deep; and (ii) the FDM is able to identify 90% of genes with differential transcription when JSD* > 0.28 and coverage > 7. This represents higher sensitivity than Cufflinks (without annotations) and rDiff (MMD), which identified 69% and 49%, respectively, of the genes in this region as differentially transcribed. Using annotations identifying the transcripts, Cufflinks was able to identify 86% of the genes in this region as differentially transcribed. Using experimental data consisting of four replicates each for two cancer cell lines (MCF7 and SUM102), FDM identified 1425 genes as significantly different in transcription. Subsequent study of the samples using quantitative real-time polymerase chain reaction (qRT-PCR) of several differential transcription sites identified by FDM confirmed significant differences at these sites.
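The JSD* quantity used above as the reference measure of differential transcription can be illustrated with a short sketch. This is not the authors' implementation: the function names (`normalize`, `kl_divergence`, `jsd_star`) are hypothetical, and the use of base-2 logarithms (which bounds JSD* to the range 0 to 1) is an assumption, since the abstract does not state the base. The inputs are taken to be per-isoform abundance estimates for one gene in each of two samples.

```python
import math


def normalize(abundances):
    """Convert raw isoform abundances into relative abundances summing to 1."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("abundances must sum to a positive value")
    return [a / total for a in abundances]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd_star(abund_a, abund_b):
    """Square root of the Jensen-Shannon divergence between two abundance vectors.

    Assumption: base-2 logarithms, so the result lies in [0, 1].
    """
    if len(abund_a) != len(abund_b):
        raise ValueError("both samples must list the same isoforms")
    p = normalize(abund_a)
    q = normalize(abund_b)
    # Mixture distribution; m_i > 0 whenever p_i > 0 or q_i > 0.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # Guard against tiny negative values caused by floating-point rounding.
    return math.sqrt(max(jsd, 0.0))


if __name__ == "__main__":
    # Hypothetical abundances for a gene with three isoforms in two samples.
    sample_1 = [10.0, 30.0, 60.0]
    sample_2 = [50.0, 30.0, 20.0]
    print(f"JSD* = {jsd_star(sample_1, sample_2):.3f}")
```

Because the vectors are normalized first, a gene whose isoforms are all scaled by the same factor gives JSD* of 0: the measure responds to changes in relative isoform usage, not to changes in overall expression level.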
