Healthcare Data Mining Using In-database Analytics to Predict Diagnosis of Inflammatory Bowel Disease.

Abstract

Inflammatory Bowel Disease is a life-changing affliction with few correlative and no known causal factors for its two major forms. With the widespread use of electronic medical record systems, and therefore the availability of large, high-dimensional, semi-structured data, the importance of finding effective and scalable data mining algorithms to handle such data has increased dramatically. With these algorithms, one can develop useful predictive and analytical tools for providers and researchers, ultimately improving the quality of patients' lives.

In recent years, there has been growing interest in the classical application of Incremental Gradient Descent techniques to convex programming problems because of their rapid convergence and tolerance of noise. The recent development of in-database analytics frameworks that leverage Incremental Gradient Descent algorithms and other user-defined aggregates, as in the Bismarck architecture, facilitates rapid analysis of large, high-dimensional data.

In this thesis, we describe the first application of the Bismarck in-database analytics framework in a healthcare setting. We applied logistic regression using Bismarck to a four-year set of patient demographic, encounter, and hospital account data and produced predictive risk factors for a cohort of Inflammatory Bowel Disease patients. We also developed a simple, automated model builder framework that supports other cohorts of interest, and we discuss its design. Finally, we outline future steps to extend the algorithms to include spatial data analysis and to provide data visualization tools that help providers and researchers gain insight into the correlative factors behind the disease.

The challenges of the clinical data set (large, high-dimensional, heterogeneous, and with statistically significant amounts of noise) highlight the advantages of the key-value structures that the Bismarck architecture leverages. The predictive models produced were better than random and were built on commodity hardware running an open source, distributable database engine. Since the Bismarck in-database analytics framework is scalable and parallelizable and facilitates straightforward extension and modification, the success of our application shows the viability of producing predictive models for other cohorts of interest in a similar healthcare setting.
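
To make the technique named in the abstract concrete, the following is a minimal sketch of logistic regression fitted by incremental gradient descent over sparse key-value feature rows. It is not the thesis's code and does not use Bismarck's API: the function names (`igd_logistic_regression`, `predict_proba`), the step size, the epoch count, and the toy data are all assumptions made for illustration. Per the abstract, Bismarck performs this kind of per-row update inside the database as a user-defined aggregate; here the same update is written as a plain in-memory loop.

```python
import math
import random


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def igd_logistic_regression(rows, n_features, epochs=20, step=0.1, seed=0):
    """Fit logistic regression weights with incremental gradient descent.

    rows: list of (features, label) pairs, where features is a dict
          {feature_index: value} (a sparse key-value layout) and
          label is 0 or 1. All names here are hypothetical.
    """
    rng = random.Random(seed)
    weights = [0.0] * n_features
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit rows in a fresh random order each epoch
        for i in order:
            features, label = rows[i]
            score = sum(weights[j] * v for j, v in features.items())
            error = sigmoid(score) - label  # gradient of the log loss w.r.t. score
            # Update only the weights this row touches, one row at a time.
            for j, v in features.items():
                weights[j] -= step * error * v
    return weights


def predict_proba(weights, features):
    """Predicted probability that the label is 1 for one sparse row."""
    return sigmoid(sum(weights[j] * v for j, v in features.items()))


# Toy data (invented): feature 0 is a constant bias term, feature 1 is a
# single signal whose sign determines the label.
rows = [({0: 1.0, 1: x}, 1 if x > 0 else 0)
        for x in (-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0)]
weights = igd_logistic_regression(rows, n_features=2, epochs=50, step=0.5)
print(weights)
print(predict_proba(weights, {0: 1.0, 1: 2.0}))
```

Because each update reads and writes only the weights for the features present in a row, the per-row cost scales with the number of non-missing values rather than with the full dimensionality, which is one reason a key-value layout suits sparse, high-dimensional records.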

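Bibliographic record

  • Author

    Johnson, Eric.

  • Author affiliation

    University of Washington.

  • Degree-granting institution: University of Washington.
  • Subjects: Biology Biostatistics; Statistics; Health Sciences Medicine and Surgery; Computer Science.
  • Degree: Master's
  • Year: 2012
  • Pages: 46 p.
  • Total pages: 46
  • Original format: PDF
  • Language: eng
  • Chinese Library Classification:
  • Keywords:

  • Date added to database: 2022-08-17 11:43:30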
