Journal: Statistics in Medicine

Clustering high‐dimensional mixed data to uncover sub‐phenotypes: joint analysis of phenotypic and genotypic data

Abstract

The LIPGENE‐SU.VI.MAX study, like many others, recorded high‐dimensional continuous phenotypic data and categorical genotypic data. LIPGENE‐SU.VI.MAX focuses on the need to account for both phenotypic and genetic factors when studying the metabolic syndrome (MetS), a complex disorder that can lead to higher risk of type 2 diabetes and cardiovascular disease. Interest lies in clustering the LIPGENE‐SU.VI.MAX participants into homogeneous groups or sub‐phenotypes, by jointly considering their phenotypic and genotypic data, and in determining which variables are discriminatory. A novel latent variable model that elegantly accommodates high‐dimensional, mixed data is developed to cluster LIPGENE‐SU.VI.MAX participants using a Bayesian finite mixture model. A computationally efficient variable selection algorithm is incorporated, estimation is via a Gibbs sampling algorithm and an approximate BIC‐MCMC criterion is developed to select the optimal model. Two clusters or sub‐phenotypes (‘healthy’ and ‘at risk’) are uncovered. A small subset of variables is deemed discriminatory, which notably includes phenotypic and genotypic variables, highlighting the need to jointly consider both factors. Further, 7 years after the LIPGENE‐SU.VI.MAX data were collected, participants underwent further analysis to diagnose presence or absence of the MetS. The two uncovered sub‐phenotypes strongly correspond to the 7‐year follow‐up disease classification, highlighting the role of phenotypic and genotypic factors in the MetS and emphasising the potential utility of the clustering approach in early screening. Additionally, the ability of the proposed approach to define the uncertainty in sub‐phenotype membership at the participant level is synonymous with the concepts of precision medicine and nutrition. Copyright © 2017 John Wiley & Sons, Ltd.
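
The idea of estimating a finite mixture by Gibbs sampling, and of reading off participant-level membership uncertainty from the sampled labels, can be sketched in a few lines. This is only an illustration under strong simplifying assumptions and is not the authors' model: it uses one continuous variable, two Gaussian components with a known shared standard deviation, and no variable selection or BIC-MCMC step. The function name `gibbs_two_component` and all of its parameters are hypothetical.

```python
import numpy as np


def gibbs_two_component(x, n_iter=500, burn_in=100, sigma=1.0, prior_sd=10.0, seed=0):
    """Toy Gibbs sampler for a 2-component Gaussian mixture (known shared sigma).

    Returns an (n, 2) array: for each observation, the fraction of
    post-burn-in iterations in which it was assigned to each cluster.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = x.size
    mu = np.array([x.min(), x.max()])   # spread-out starting means
    pi = np.array([0.5, 0.5])           # starting mixing weights
    membership = np.zeros((n, 2))

    for it in range(n_iter):
        # Step 1: sample each label from its conditional distribution.
        log_w = np.log(pi) - 0.5 * ((x[:, None] - mu[None, :]) / sigma) ** 2
        log_w -= log_w.max(axis=1, keepdims=True)   # numerical stability
        w = np.exp(log_w)
        w /= w.sum(axis=1, keepdims=True)
        z = (rng.random(n) > w[:, 0]).astype(int)   # z = 1 with probability w[:, 1]

        # Step 2: sample mixing weights (Dirichlet(1, 1) prior, assumed).
        counts = np.bincount(z, minlength=2)
        pi = rng.dirichlet(1.0 + counts)

        # Step 3: sample component means (N(0, prior_sd^2) prior, assumed).
        for k in range(2):
            precision = counts[k] / sigma**2 + 1.0 / prior_sd**2
            post_mean = (x[z == k].sum() / sigma**2) / precision
            mu[k] = rng.normal(post_mean, 1.0 / np.sqrt(precision))

        # Accumulate label frequencies after burn-in.
        if it >= burn_in:
            membership[np.arange(n), z] += 1

    return membership / (n_iter - burn_in)


# Simulated data: two well-separated groups of 50 observations each.
data_rng = np.random.default_rng(1)
data = np.concatenate([data_rng.normal(-5, 1, 50), data_rng.normal(5, 1, 50)])
probs = gibbs_two_component(data)
```

Each row of `probs` approximates the posterior probability that an observation belongs to each cluster, which is the kind of participant-level uncertainty the abstract refers to. A mixed-data model like the one described would replace the Gaussian likelihood with one that also handles categorical genotypic variables.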