Integrative clustering of multiple genomic data types using a joint latent variable model with application to breast and lung cancer subtype analysis

Abstract

Motivation: The molecular complexity of a tumor manifests itself at the genomic, epigenomic, transcriptomic and proteomic levels. Genomic profiling at these multiple levels should allow an integrated characterization of tumor etiology. However, there is a shortage of effective statistical and bioinformatic tools for truly integrative data analysis. The standard approach to integrative clustering is to cluster each data type separately and then integrate the results manually. A more statistically powerful approach would incorporate all data types simultaneously and generate a single integrated cluster assignment.

Methods: We developed a joint latent variable model for integrative clustering. We call the resulting methodology iCluster. In a single framework, iCluster flexibly models both the associations between different data types and the variance–covariance structure within each data type, while simultaneously reducing the dimensionality of the datasets. Likelihood-based inference is obtained through the Expectation–Maximization algorithm.

Results: We demonstrate the iCluster algorithm using two examples of joint analysis of copy number and gene expression data, one from breast cancer and one from lung cancer. In both cases, we identified, in a completely automated fashion, subtypes characterized by concordant DNA copy number changes and gene expression, as well as profiles unique to one data type or the other. In addition, the algorithm discovers potentially novel subtypes by combining weak yet consistent alteration patterns across data types.

Availability: R code to implement iCluster can be downloaded at

Contact:

Supplementary information: Supplementary data are available at Bioinformatics online.
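To make the Methods paragraph concrete, the following is a minimal sketch of a Gaussian joint latent variable model fitted by Expectation–Maximization. It is not the authors' R implementation. The function names (`fit_joint_latent`, `kmeans`), the diagonal noise covariance, the choice of K-1 latent dimensions for K clusters, the k-means step on the posterior latent means, and the simulated data are all assumptions made for illustration. The sketch also omits any sparsity or regularization the published method may apply.

```python
import numpy as np


def fit_joint_latent(blocks, n_latent, n_iter=200, seed=0):
    """EM for x = W z + e, with z ~ N(0, I) and e ~ N(0, diag(psi)).

    blocks: list of (features x samples) arrays, one per data type,
    all sharing the same sample columns. Hypothetical helper, not iCluster.
    """
    X = np.vstack(blocks).astype(float)      # stack data types along features
    X -= X.mean(axis=1, keepdims=True)       # center each feature
    p, n = X.shape
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(p, n_latent))
    psi = X.var(axis=1) + 1e-6               # diagonal noise variances

    def posterior(W, psi):
        # Posterior covariance G and posterior means E[Z | X] (n_latent x n)
        Wt_psi_inv = (W / psi[:, None]).T    # W^T Psi^{-1}
        G = np.linalg.inv(np.eye(n_latent) + Wt_psi_inv @ W)
        return G, G @ Wt_psi_inv @ X

    for _ in range(n_iter):
        # E-step: posterior moments of the shared latent variable
        G, EZ = posterior(W, psi)
        EZZ = n * G + EZ @ EZ.T              # sum over samples of E[z z^T | x]
        # M-step: update loadings and noise variances
        W = X @ EZ.T @ np.linalg.inv(EZZ)
        psi = np.maximum(np.mean(X * (X - W @ EZ), axis=1), 1e-6)

    _, EZ = posterior(W, psi)
    return W, psi, EZ


def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(k - 1):
        d = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels


# Simulated example (made-up data): 40 samples in two subtypes, two data types.
rng = np.random.default_rng(1)
truth = np.repeat([0, 1], 20)
copy_number = rng.normal(size=(30, 40))
expression = rng.normal(size=(40, 40))
copy_number[:10, truth == 1] += 3.0          # concordant shift in both types
expression[:10, truth == 1] += 3.0

W, psi, EZ = fit_joint_latent([copy_number, expression], n_latent=1)
labels = kmeans(EZ.T, k=2)                   # one integrated cluster assignment
```

Because both data types load on the same latent variable, a single cluster assignment is produced for all samples rather than one per data type, which is the contrast with separate clustering drawn in the Motivation paragraph.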
