Source: Intelligent Systems, 2008 4th International IEEE Conference

Analysis of stopping criteria for the EM algorithm in the context of patient grouping according to length of stay


Abstract

The expectation maximisation (EM) algorithm is an iterative maximum likelihood procedure often used for estimating the parameters of a mixture model. Theoretically, increases in the likelihood function are guaranteed as the algorithm iteratively improves upon previously derived parameter estimates. The algorithm is considered to converge when all parameter estimates become stable and no further improvements can be made to the likelihood value. However, to reduce computational time, it is common practice for the algorithm to be stopped before complete convergence using heuristic approaches. In this paper, we consider various stopping criteria and evaluate their effect on fitting Gaussian mixture models (GMMs) to patient length of stay (LOS) data. Although the GMM can be successfully fitted to positively skewed data such as LOS, the fitting procedure often requires many iterations of the EM algorithm. To our knowledge, no previous study has evaluated the effect of different stopping criteria on fitting GMMs to skewed distributions. Hence, the aim of this paper is to evaluate the effect of various stopping criteria in order to select and justify their use within a patient spell classification methodology. Results illustrate that criteria based on the difference in the likelihood value and on the GMM parameters may not always be good indicators for stopping the algorithm. In fact, we show that the values of the difference in the variance parameters should be used instead, as these parameters are the last to stabilise. In addition, we also specify threshold values for the other stopping criteria.
