Published in: Genomics Data

Assessing quality standards for ChIP-seq and related massive parallel sequencing-generated datasets: When rating goes beyond avoiding the crisis

Abstract

Massive parallel DNA sequencing, combined with chromatin immunoprecipitation and a large variety of DNA/RNA-enrichment methodologies, is at the origin of data resources of major importance. Indeed, these resources, available for multiple genomes, represent the most comprehensive catalogue of (i) cell-, development- and signal transduction-specified patterns of binding sites for transcription factors (‘cistromes’) and for transcription and chromatin-modifying machineries and (ii) the patterns of specific local post-translational modifications of histones and DNA (‘epigenome’) or of regulatory chromatin-binding factors. In addition, (iii) resources specifying chromatin structure alterations are emerging. Importantly, these types of “omics” datasets increasingly populate public repositories and provide highly valuable resources for the exploration of general principles of cell function in a multi-dimensional genome–transcriptome–epigenome–chromatin structure context. However, data mining is critically dependent on data quality, an issue that, surprisingly, is still largely ignored by scientists, well-financed consortia, data repositories and scientific journals. So what determines the quality of ChIP-seq experiments and of the datasets generated from them, and what keeps scientists from associating quality criteria with their data? In this ‘opinion’ we trace the various parameters that influence the quality of this type of dataset, as well as the computational efforts made so far to assess them. Moreover, we describe a universal quality control (QC) certification approach that provides a quality rating for ChIP-seq and enrichment-related assays. The corresponding QC tool and a regularly updated database, from which the quality parameters of more than 8000 datasets can currently be retrieved, are freely accessible at .
