Journal: Statistical Applications in Genetics and Molecular Biology

A Bayesian semiparametric factor analysis model for subtype identification

Abstract

Disease subtype identification (clustering) is an important problem in biomedical research. Gene expression profiles are commonly used to infer disease subtypes, which often leads to biologically meaningful insights into disease. Despite many successes, existing clustering methods may not perform well when genes are highly correlated and when, because of the high dimensionality, many uninformative genes are included in the clustering. In this article, we introduce a novel Bayesian subtype identification method based on gene expression profiles. This method, called BCSub, adopts an innovative semiparametric Bayesian factor analysis model to reduce the dimension of the data to a few factor scores for clustering. Specifically, the factor scores are assumed to follow a Dirichlet process mixture model in order to induce clustering. Through extensive simulation studies, we show that BCSub has improved performance over commonly used clustering methods. When applied to two gene expression datasets, our model is able to identify subtypes that are more clinically relevant than those identified by existing methods.
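
The abstract does not state the model equations. As a rough sketch only, a generic factor analysis model with a Dirichlet process mixture on the factor scores, consistent with the description above, could be written as follows. All symbols here are illustrative assumptions and not necessarily the notation or exact specification used in BCSub: y_i is the p-dimensional expression profile of sample i, \Lambda is a p x k loading matrix with k much smaller than p, \eta_i is the k-dimensional vector of factor scores, and \alpha and G_0 are the concentration parameter and base measure of the Dirichlet process.

```latex
\begin{align*}
y_i &= \mu + \Lambda \eta_i + \varepsilon_i,
  & \varepsilon_i &\sim \mathcal{N}_p(0, \Sigma), \quad
  \Sigma = \operatorname{diag}(\sigma_1^2, \dots, \sigma_p^2), \\
\eta_i \mid (m_i, S_i) &\sim \mathcal{N}_k(m_i, S_i),
  & (m_i, S_i) \mid G &\sim G, \\
G &\sim \mathrm{DP}(\alpha, G_0).
\end{align*}
```

Under this sketch, draws from G are discrete with probability one, so several samples share the same pair (m_i, S_i). Samples that share a pair form one cluster, which is how a Dirichlet process mixture on the factor scores can induce subtypes without fixing their number in advance.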