Multiple Imputation based Clustering Validation (MIV) for Big Longitudinal Trial Data with Missing Values in eHealth

Abstract

Web-delivered trials are an important component of eHealth services. These trials, mostly behavior-based, generate big, heterogeneous data that are longitudinal and high-dimensional and that contain missing values. Unsupervised learning methods have been widely applied in this area; however, validating the optimal number of clusters has been challenging. Building on MIfuzzy, our multiple imputation (MI) based fuzzy clustering method, we propose a new multiple imputation based validation (MIV) framework and corresponding MIV algorithms for clustering big longitudinal eHealth data with missing values and, more generally, for fuzzy-logic based clustering methods. Specifically, we detect the optimal number of clusters by automatically searching and synthesizing a suite of MI-based validation methods and indices, including conventional (bootstrap or cross-validation based) and emerging (modularity-based) validation indices for general clustering methods, as well as the Xie and Beni index, which is specific to fuzzy clustering. MIV performance was demonstrated on a big longitudinal dataset from a real web-delivered trial and in simulation. The results indicate that the MI-based Xie and Beni index for fuzzy clustering is more appropriate for detecting the optimal number of clusters in such complex data. The MIV concept and algorithms could easily be adapted to other types of clustering that process big, incomplete longitudinal trial data in eHealth services.
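
As an illustration of the idea of an MI-based Xie and Beni index, the following is a minimal Python sketch and not the authors' MIV algorithm. It rests on several assumptions that the abstract does not state: the per-imputation index values are pooled by a simple mean, the fuzzy clustering is a plain fuzzy c-means, and `impute_once` is a deliberately crude stand-in for a proper multiple imputation procedure. All function names are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means. Returns (centers, U) with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=X.shape[0])  # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared distances from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        shift = np.abs(U_new - U).max()
        U = U_new
        if shift < tol:
            break
    W = U ** m
    centers = (W.T @ X) / W.sum(axis=0)[:, None]
    return centers, U

def xie_beni(X, centers, U, m=2.0):
    """Xie and Beni index: fuzzy compactness over separation (lower is better)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    compactness = ((U ** m) * d2).sum()
    cd2 = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(cd2, np.inf)  # ignore each center's distance to itself
    return compactness / (X.shape[0] * cd2.min())

def impute_once(X_missing, rng):
    """Assumption: crude stochastic imputation, one draw per missing entry
    from a normal fitted to the observed values of the same column."""
    X = X_missing.copy()
    for j in range(X.shape[1]):
        miss = np.isnan(X[:, j])
        obs = X[~miss, j]
        X[miss, j] = rng.normal(obs.mean(), obs.std(), size=miss.sum())
    return X

def select_n_clusters(X_missing, candidates=(2, 3, 4, 5), n_imputations=5, seed=0):
    """Assumption: pool the index over imputed datasets by a simple mean,
    then pick the candidate number of clusters with the smallest pooled value."""
    rng = np.random.default_rng(seed)
    imputed = [impute_once(X_missing, rng) for _ in range(n_imputations)]
    pooled = {}
    for c in candidates:
        scores = []
        for X in imputed:
            centers, U = fuzzy_c_means(X, c, seed=seed)
            scores.append(xie_beni(X, centers, U))
        pooled[c] = float(np.mean(scores))
    best = min(pooled, key=pooled.get)
    return best, pooled
```

Calling `select_n_clusters(X_missing)` on a NumPy array with `np.nan` marking the missing entries returns the selected number of clusters together with the pooled index for each candidate. In practice the imputation step would be replaced by a model-based MI method suited to longitudinal data.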