Published in: Nucleic Acids Research

Learning common and specific patterns from data of multiple interrelated biological scenarios with matrix factorization



Abstract

High-throughput biological technologies (e.g. ChIP-seq, RNA-seq and single-cell RNA-seq) rapidly accelerate the accumulation of genome-wide omics data in diverse interrelated biological scenarios (e.g. cells, tissues and conditions). Integration and differential analysis are two common paradigms for exploring and analyzing such data. However, current integrative methods usually ignore the differential part, and typical differential analysis methods either fail to identify combinatorial patterns of difference or require matched dimensions of the data. Here, we propose a flexible framework CSMF to combine them into one paradigm to simultaneously reveal Common and Specific patterns via Matrix Factorization from data generated under interrelated biological scenarios. We demonstrate the effectiveness of CSMF with four representative applications including pairwise ChIP-seq data describing the chromatin modification map between K562 and Huvec cell lines; pairwise RNA-seq data representing the expression profiles of two different cancers; RNA-seq data of three breast cancer subtypes; and single-cell RNA-seq data of human embryonic stem cell differentiation at six time points. Extensive analysis yields novel insights into hidden combinatorial patterns in these multi-modal data. Results demonstrate that CSMF is a powerful tool to uncover common and specific patterns with significant biological implications from data of interrelated biological scenarios.
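
The abstract does not spell out the factorization itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' algorithm. It assumes each dataset X_i (features by samples, sharing the same rows) is approximated as Wc @ Hc_i + Ws_i @ Hs_i with non-negative factors, where Wc is shared across datasets (common patterns) and Ws_i belongs to one dataset (specific patterns). The function name `common_specific_nmf`, its parameters, the squared-error objective and the multiplicative update rules are all assumptions made for illustration.

```python
import numpy as np


def common_specific_nmf(Xs, k_common, k_specific, n_iter=200, eps=1e-9, seed=0):
    """Hypothetical sketch: X_i ~ Wc @ Hc[i] + Ws[i] @ Hs[i], all factors >= 0.

    Xs is a list of non-negative arrays that share the same number of rows
    (features) but may have different numbers of columns (samples).
    """
    rng = np.random.default_rng(seed)
    m = Xs[0].shape[0]
    Wc = rng.random((m, k_common))                           # shared basis
    Ws = [rng.random((m, k_specific)) for _ in Xs]           # per-dataset bases
    Hc = [rng.random((k_common, X.shape[1])) for X in Xs]    # common coefficients
    Hs = [rng.random((k_specific, X.shape[1])) for X in Xs]  # specific coefficients

    def recon(i):
        # Current reconstruction of dataset i.
        return Wc @ Hc[i] + Ws[i] @ Hs[i]

    errors = []
    for _ in range(n_iter):
        for i, X in enumerate(Xs):
            # Lee-Seung style multiplicative updates keep every factor non-negative.
            Hc[i] *= (Wc.T @ X) / (Wc.T @ recon(i) + eps)
            Hs[i] *= (Ws[i].T @ X) / (Ws[i].T @ recon(i) + eps)
            Ws[i] *= (X @ Hs[i].T) / (recon(i) @ Hs[i].T + eps)
        # The shared basis pools evidence from every dataset.
        num = sum(X @ Hc[i].T for i, X in enumerate(Xs))
        den = sum(recon(i) @ Hc[i].T for i in range(len(Xs)))
        Wc *= num / (den + eps)
        # Total squared reconstruction error after this sweep.
        errors.append(sum(np.linalg.norm(X - recon(i)) ** 2 for i, X in enumerate(Xs)))
    return Wc, Ws, Hc, Hs, errors


# Synthetic demo: two datasets with the same 30 features but 20 and 25 samples.
rng = np.random.default_rng(1)
m = 30
Wc_true = rng.random((m, 2))  # patterns present in both datasets
X1 = Wc_true @ rng.random((2, 20)) + rng.random((m, 1)) @ rng.random((1, 20))
X2 = Wc_true @ rng.random((2, 25)) + rng.random((m, 1)) @ rng.random((1, 25))

Wc, Ws, Hc, Hs, errors = common_specific_nmf([X1, X2], k_common=2, k_specific=1)
print("error after first sweep:", errors[0])
print("error after last sweep: ", errors[-1])
```

Because only the row dimension is shared in this sketch, the two inputs can have different numbers of samples, which mirrors the abstract's point that matched dimensions should not be required. The ranks `k_common` and `k_specific` are chosen by hand here; how they should be selected in practice is not covered by the abstract.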
