Assessing quality standards for ChIP-seq and related massive parallel sequencing-generated datasets: When rating goes beyond avoiding the crisis

Abstract

Massive parallel DNA sequencing, combined with chromatin immunoprecipitation and a large variety of DNA/RNA-enrichment methodologies, has given rise to data resources of major importance. These resources, available for multiple genomes, represent the most comprehensive catalogue of (i) cell-, development- and signal transduction-specified patterns of binding sites for transcription factors (‘cistromes’) and for transcription and chromatin modifying machineries, and (ii) the patterns of specific local post-translational modifications of histones and DNA (‘epigenome’) or of regulatory chromatin binding factors. In addition, (iii) resources specifying chromatin structure alterations are emerging. Importantly, these types of “omics” datasets increasingly populate public repositories and provide highly valuable resources for exploring general principles of cell function in a multi-dimensional genome–transcriptome–epigenome–chromatin structure context. However, data mining depends critically on data quality, an issue that, surprisingly, is still largely ignored by scientists, well-financed consortia, data repositories and scientific journals. So what determines the quality of ChIP-seq experiments and of the datasets generated from them, and what keeps scientists from associating quality criteria with their data? In this ‘opinion’ we trace the various parameters that influence the quality of this type of dataset, as well as the computational efforts made so far to qualify them. Moreover, we describe a universal quality control (QC) certification approach that provides a quality rating for ChIP-seq and enrichment-related assays. The corresponding QC tool and a regularly updated database, from which the quality parameters of more than 8000 datasets can currently be retrieved, are freely accessible at .

Bibliographic record

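Journal: Genomics Data
Authors:
Marco Antonio Mendoza-Parra; Hinrich Gronemeyer
Author affiliations:

Year (volume), issue: 2014(2),-1
Year: 2014
Pages: 268–273
Total pages: 6
Original format: PDF
Text language:
Chinese Library Classification: Biochemical genetics; Biochemical pharmacology
Keywords:
ChIP sequencing; Massive parallel sequencing; Quality control; Omics data mining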


