Genome-Wide Localization of Protein-DNA Binding and Histone Modification by a Bayesian Change-Point Method with ChIP-seq Data

Abstract

Next-generation sequencing (NGS) technologies have matured considerably since their introduction, and a focus has been placed on developing sophisticated analytical tools to deal with the growing volumes of data. Chromatin immunoprecipitation sequencing (ChIP-seq), a major application of NGS, is a widely adopted technique for examining protein-DNA interactions and is commonly used to investigate epigenetic signatures of diffuse histone marks. These datasets have notoriously high variance and subtle levels of enrichment across large expanses, making enriched regions exceedingly difficult to define. Window-based heuristic models and finite-state hidden Markov models (HMMs) have been used with some success in analyzing ChIP-seq data, but with lingering limitations. To improve the ability to detect broad regions of enrichment, we developed a stochastic Bayesian Change-Point (BCP) method, which addresses some of these unresolved issues. BCP makes use of recent advances in infinite-state HMMs by obtaining explicit formulas for posterior means of read densities. These posterior means can be used to categorize the genome into enriched and unenriched segments, as is customarily done, or examined for more detailed relationships, since the underlying subpeaks are preserved rather than simplified into a binary classification. BCP performs a near-exhaustive search of all possible change points between different posterior means at high resolution to minimize the subjectivity of window sizes, and it is computationally efficient owing to a speed-up algorithm and the explicit formulas it employs. In the absence of a well-established "gold standard" for diffuse histone mark enrichment, we corroborated BCP's island detection accuracy and reproducibility using various forms of empirical evidence. We show that BCP is especially suited for analysis of diffuse histone ChIP-seq data but is also effective in analyzing punctate transcription factor ChIP datasets, making it widely applicable to numerous experiment types.
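
The customary final step mentioned above, turning posterior mean read densities into enriched and unenriched segments, can be illustrated with a minimal sketch. This is not the BCP implementation: the function name `call_islands`, the fixed `threshold`, the `bin_size` of 200 bp, and the input values are all assumptions made for illustration, and the posterior means are taken as given rather than computed by the change-point model.

```python
from typing import List, Tuple


def call_islands(
    posterior_means: List[float],
    threshold: float,
    bin_size: int = 200,  # hypothetical bin width in base pairs
) -> List[Tuple[int, int]]:
    """Merge consecutive bins whose posterior mean exceeds `threshold`.

    Returns half-open (start, end) coordinates in base pairs, relative
    to the start of the first bin. The thresholding rule is an
    illustrative assumption, not the criterion used by BCP.
    """
    islands: List[Tuple[int, int]] = []
    start = None  # index of the first bin of the island currently open
    for i, value in enumerate(posterior_means):
        enriched = value > threshold
        if enriched and start is None:
            start = i  # open a new island
        elif not enriched and start is not None:
            islands.append((start * bin_size, i * bin_size))  # close it
            start = None
    if start is not None:  # island runs to the end of the input
        islands.append((start * bin_size, len(posterior_means) * bin_size))
    return islands


# Made-up posterior means for nine consecutive bins.
means = [0.5, 0.6, 3.1, 2.8, 3.4, 0.4, 0.5, 2.9, 3.0]
print(call_islands(means, threshold=1.5))
```

Because the per-bin values are left untouched, the variation inside each island (the subpeaks) remains available for closer inspection after the binary call.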