Inferring network structure in non-normal and mixed discrete-continuous genomic data

Abstract

Inferring dependence structure through undirected graphs is crucial for uncovering the major modes of multivariate interaction among high-dimensional genomic markers that are potentially associated with cancer. Traditionally, conditional independence has been studied using sparse Gaussian graphical models for continuous data and sparse Ising models for discrete data. However, there are two clear situations in which these approaches are inadequate. The first occurs when the data are continuous but display non-normal marginal behavior such as heavy tails or skewness, rendering an assumption of normality inappropriate. The second occurs when one part of the data is ordinal or discrete (e.g., presence or absence of a mutation) and the other part is continuous (e.g., expression levels of genes or proteins). In this case, the existing Bayesian approaches typically employ a latent variable framework for the discrete part, which precludes inferring conditional independence among the data that are actually observed. The current article overcomes these two challenges in a unified framework using Gaussian scale mixtures. Our framework is able to handle continuous data that are not normal and data that are of a mixed continuous and discrete nature, while still being able to infer a sparse conditional sign independence structure among the observed data. Extensive performance comparisons in simulations with alternative techniques and an analysis of a real cancer genomics data set demonstrate the effectiveness of the proposed approach.
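As a small illustration of why Gaussian scale mixtures can accommodate heavy tails, the sketch below uses a standard textbook fact rather than the article's own model: dividing a standard normal draw by the square root of a Gamma(nu/2, rate nu/2) precision yields a Student-t variable with nu degrees of freedom. The function names, the choice nu = 3, the sample size, and the threshold of 3 are all assumptions made for this example only.

```python
import math
import random


def sample_scale_mixture(n, nu, rng):
    """Draw n samples x = z / sqrt(tau), with z ~ N(0, 1) and
    tau ~ Gamma(shape=nu/2, rate=nu/2). Marginally, x is Student-t(nu)."""
    samples = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale); scale = 1 / rate = 2 / nu.
        tau = rng.gammavariate(nu / 2.0, 2.0 / nu)
        z = rng.gauss(0.0, 1.0)
        samples.append(z / math.sqrt(tau))
    return samples


def tail_fraction(xs, threshold=3.0):
    """Fraction of samples whose absolute value exceeds the threshold."""
    return sum(abs(x) > threshold for x in xs) / len(xs)


rng = random.Random(0)  # fixed seed so the run is reproducible
normal_draws = [rng.gauss(0.0, 1.0) for _ in range(20000)]
mixture_draws = sample_scale_mixture(20000, nu=3.0, rng=rng)

print("normal  tail fraction:", tail_fraction(normal_draws))
print("mixture tail fraction:", tail_fraction(mixture_draws))
```

Conditional on the mixing variable, each draw is still Gaussian, which is the property that lets scale-mixture models reuse Gaussian machinery while the marginal distribution has much heavier tails than a normal.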
