
Learning common and specific patterns from data of multiple interrelated biological scenarios with matrix factorization



Abstract

High-throughput biological technologies (e.g. ChIP-seq, RNA-seq and single-cell RNA-seq) rapidly accelerate the accumulation of genome-wide omics data in diverse interrelated biological scenarios (e.g. cells, tissues and conditions). Integration and differential analysis are two common paradigms for exploring and analyzing such data. However, current integrative methods usually ignore the differential part, and typical differential analysis methods either fail to identify combinatorial patterns of difference or require matched dimensions of the data. Here, we propose a flexible framework CSMF to combine them into one paradigm to simultaneously reveal Common and Specific patterns via Matrix Factorization from data generated under interrelated biological scenarios. We demonstrate the effectiveness of CSMF with four representative applications including pairwise ChIP-seq data describing the chromatin modification map between K562 and Huvec cell lines; pairwise RNA-seq data representing the expression profiles of two different cancers; RNA-seq data of three breast cancer subtypes; and single-cell RNA-seq data of human embryonic stem cell differentiation at six time points. Extensive analysis yields novel insights into hidden combinatorial patterns in these multi-modal data. Results demonstrate that CSMF is a powerful tool to uncover common and specific patterns with significant biological implications from data of interrelated biological scenarios.
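To make the idea of separating common from specific patterns concrete, here is a minimal sketch rather than the authors' implementation. It assumes each non-negative data matrix X_i (features by samples, with the same features across scenarios) is approximated as W_c H_c,i + W_s,i H_s,i, where W_c is a basis shared by all scenarios and W_s,i is specific to scenario i. The function name `csmf_sketch`, the ranks, and the use of standard multiplicative updates for a squared-error objective are all illustrative assumptions; the abstract does not specify the optimization procedure.

```python
import numpy as np


def csmf_sketch(X_list, k_common, k_specific, n_iter=200, eps=1e-9, seed=0):
    """Hypothetical sketch: factor each X_i as W_c @ H_c[i] + W_s[i] @ H_s[i].

    X_list     : list of non-negative arrays sharing the same number of rows
    k_common   : number of shared (common) patterns
    k_specific : list with the number of specific patterns per matrix
    Returns the factors and the squared reconstruction error per iteration.
    """
    rng = np.random.default_rng(seed)
    m = X_list[0].shape[0]
    n_mats = len(X_list)

    # Random non-negative initialisation (an assumption, not from the paper).
    W_c = rng.random((m, k_common))
    W_s = [rng.random((m, k)) for k in k_specific]
    H_c = [rng.random((k_common, X.shape[1])) for X in X_list]
    H_s = [rng.random((k, X.shape[1])) for k, X in zip(k_specific, X_list)]

    def recon(i):
        # Current approximation of X_list[i]: common part plus specific part.
        return W_c @ H_c[i] + W_s[i] @ H_s[i]

    def total_error():
        return sum(np.linalg.norm(X - recon(i)) ** 2 for i, X in enumerate(X_list))

    errors = [total_error()]
    for _ in range(n_iter):
        # The shared basis collects evidence from every scenario.
        num = sum(X @ H_c[i].T for i, X in enumerate(X_list))
        den = sum(recon(i) @ H_c[i].T for i in range(n_mats))
        W_c *= num / (den + eps)

        for i, X in enumerate(X_list):
            # The specific basis and both coefficient blocks only see X_i.
            W_s[i] *= (X @ H_s[i].T) / (recon(i) @ H_s[i].T + eps)
            H_c[i] *= (W_c.T @ X) / (W_c.T @ recon(i) + eps)
            H_s[i] *= (W_s[i].T @ X) / (W_s[i].T @ recon(i) + eps)

        errors.append(total_error())

    return W_c, W_s, H_c, H_s, errors


if __name__ == "__main__":
    # Synthetic data only: two scenarios with different numbers of samples.
    rng = np.random.default_rng(1)
    X1 = rng.random((30, 20))
    X2 = rng.random((30, 25))
    W_c, W_s, H_c, H_s, errors = csmf_sketch([X1, X2], k_common=2, k_specific=[2, 3])
    print("initial error:", errors[0], "final error:", errors[-1])
```

In this sketch the matrices may have different numbers of columns, since only the row (feature) dimension is shared. Choosing the ranks and interpreting the resulting patterns are left open here.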

Bibliographic details

  • Source
    Nucleic Acids Research | 2019, Issue 13 | 12 pages
  • Authors
    Zhang Lihua; Zhang Shihua
  • Author affiliations
    Chinese Acad Sci Acad Math & Syst Sci NCMIS CEMS RCSDS Beijing 100190 Peoples R China;
    Chinese Acad Sci Acad Math & Syst Sci NCMIS CEMS RCSDS Beijing 100190 Peoples R China
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification: Biochemistry
