BMC Bioinformatics

RACS: rapid analysis of ChIP-Seq data for contig based genomes


Abstract

Chromatin immunoprecipitation coupled to next generation sequencing (ChIP-Seq) is a widely used molecular method to investigate the function of chromatin-related proteins by identifying their associated DNA sequences on a genomic scale. ChIP-Seq generates large quantities of data that are difficult to process and analyze, particularly for organisms with contig-based sequenced genomes, whose genes typically have minimal annotation beyond coordinates predicted primarily by gene-finding programs. Poorly annotated genome sequences make comprehensive analysis of ChIP-Seq data difficult, and as a result standardized analysis pipelines are lacking.

We present a one-stop computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), that combines traditional High-Performance Computing (HPC) techniques with open source tools for processing and analyzing raw ChIP-Seq data. RACS is an open source computational pipeline available from either of the following repositories: https://bitbucket.org/mjponce/RACS or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS . RACS is particularly useful for ChIP-Seq in organisms with contig-based genomes that have poor gene annotation, where it can aid protein function discovery. To test the performance and efficiency of RACS, we analyzed previously published ChIP-Seq data from the model organism Tetrahymena thermophila, which has a contig-based genome. We assessed the generality of RACS by analyzing a previously published data set generated using the model organism Oxytricha trifallax, whose genome sequence is also contig-based with poor annotation.

The RACS computational pipeline presented in this report is an efficient and reliable tool to analyze genome-wide raw ChIP-Seq data generated in model organisms with poorly annotated contig-based genome sequences. Because RACS segregates the detected read accumulations into genic and intergenic regions, it is particularly efficient for rapid downstream analyses of proteins involved in gene expression.
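
The genic/intergenic segregation mentioned above can be illustrated with a minimal sketch. This is not the RACS implementation; it only shows the general idea of labelling a read-accumulation region by whether it overlaps a predicted gene on the same contig. The contig names, coordinates, the Interval type, and the helper functions are all hypothetical, and coordinates are assumed to be 0-based and half-open.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    contig: str  # contig/scaffold identifier (hypothetical names below)
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # 0-based, exclusive (assumed convention)


def index_genes(genes):
    """Group predicted gene intervals by contig, sorted by start coordinate."""
    by_contig = defaultdict(list)
    for gene in genes:
        by_contig[gene.contig].append(gene)
    for contig in by_contig:
        by_contig[contig].sort(key=lambda g: g.start)
    return by_contig


def classify(region, gene_index):
    """Return 'genic' if the region overlaps any gene on its contig, else 'intergenic'."""
    for gene in gene_index.get(region.contig, []):
        # Two half-open intervals overlap when each starts before the other ends.
        if gene.start < region.end and region.start < gene.end:
            return "genic"
    return "intergenic"


# Made-up gene coordinates and read-accumulation regions for illustration only.
genes = [
    Interval("scf_001", 100, 500),
    Interval("scf_001", 900, 1200),
    Interval("scf_002", 0, 300),
]
regions = [
    Interval("scf_001", 450, 600),  # overlaps the first gene
    Interval("scf_001", 600, 850),  # lies between two genes
    Interval("scf_003", 10, 50),    # contig with no predicted genes
]

index = index_genes(genes)
labels = [classify(region, index) for region in regions]
print(labels)
```

A linear scan per contig is enough for a sketch; for genomes with many genes per contig, a binary search over the sorted starts or an interval tree would be the usual replacement.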
