Biometrical Journal

Clustering multiply imputed multivariate high-dimensional longitudinal profiles


Abstract

In this paper, we propose a method to cluster multivariate functional data with missing observations. Analysis of functional data often relies on dimension reduction techniques such as principal component analysis (PCA), which require complete data matrices. Here, the data are completed by means of multiple imputation, and each imputed data set is then submitted to a clustering procedure. The final partition of the data, which summarizes the partitions obtained for the imputed data sets, is obtained by means of ensemble clustering. The uncertainty in cluster membership due to missing data is characterized by the agreement between the members of the ensemble and by the fuzziness of the consensus clustering. The potential of the method is illustrated on heart failure (HF) data. Daily measurements of four biomarkers (heart rate, diastolic blood pressure, systolic blood pressure, and weight) were used to cluster the patients. To normalize the distributions of the longitudinal outcomes, the data were transformed with the natural logarithm. A cubic spline basis with 69 basis functions was employed to smooth the profiles. The proposed algorithm indicates the existence of a latent structure and divides the HF patients into two clusters that show a different evolution in blood pressure and weight. In general, clustering results are sensitive to the choices made. The proposed approach is no exception: alternative choices for the distance measure, the procedure used to optimize the objective function, the scree-test threshold, or the number of principal components used in the approximation of the surrogate density could all influence the final partition. For the HF data set, the final partition depends on the number of principal components used in the procedure.
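The ensemble step described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: the abstract does not specify the consensus objective, so this example swaps in a simple co-association approach (link two subjects if they share a cluster in more than half of the imputed data sets, then take connected components). The function names `co_association`, `consensus_partition`, and `membership_agreement`, the 0.5 threshold, and the toy partitions are all assumptions made for illustration.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of subjects shares a cluster.

    `partitions` is a list of label lists, one per imputed data set.
    Label values need not match across partitions (label switching is fine).
    """
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                counts[i][j] += labels[i] == labels[j]
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_partition(co, threshold=0.5):
    """Assumed stand-in for ensemble clustering: connected components of the
    graph that links subjects whose co-association exceeds `threshold`."""
    n = len(co)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and co[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def membership_agreement(co, labels):
    """Mean co-association of each subject with the other members of its
    consensus cluster; lower values flag membership that is uncertain
    across imputations. Singleton clusters get 1.0 by convention."""
    scores = []
    for i, li in enumerate(labels):
        mates = [j for j, lj in enumerate(labels) if lj == li and j != i]
        if mates:
            scores.append(sum(co[i][j] for j in mates) / len(mates))
        else:
            scores.append(1.0)
    return scores


# Toy input: cluster labels for 5 subjects from 3 imputed data sets.
# In the second partition the labels are switched and subject 2 changes cluster.
partitions = [
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
]

co = co_association(partitions)
consensus = consensus_partition(co)
agreement = membership_agreement(co, consensus)

print("consensus labels:", consensus)
print("membership agreement:", [round(a, 3) for a in agreement])
```

In this toy input, the subject whose cluster changes in one imputation receives a lower agreement score than the others, which is the kind of imputation-driven membership uncertainty the abstract refers to. The fuzziness of the consensus clustering mentioned in the abstract is not modeled here.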
