
Evaluating the portability of health care data to SQL like Big Data environment.


Abstract

Big Data deals with huge volumes of complex, exponentially growing data sets from multiple sources. With rapid growth in networking, we are now able to generate immense amounts of data in almost any field imaginable, including the physical, biological and biomedical sciences. While most industries have been far more successful at harnessing the value of large-scale integration and analysis of big data, the health care industry is just getting its feet wet. One impediment to the health care industry's adoption of Big Data analytics has been the dependence of many of its models on RDBMS technology. With the diversity and volume of data in the health care industry, there is an increasing need to evaluate components of big data frameworks and gauge their adaptability to analytics techniques. Recent developments in the Hadoop ecosystem, however, have led to breakthroughs enabling RDBMS-like tools in big data environments. In this paper we evaluate the portability of existing RDBMS solutions to such SQL-like big data tools. Our work focuses on benchmarking multiple SQL-like big data technologies over HDFS for the Study Data Tabulation Model (SDTM) used in clinical trial databases, and on examining their potential for improving the efficiency of research in big data clinical trials. Publicly available health care data from the National Institute on Drug Abuse (NIDA) is used as a test bed to measure key parameters such as usability, adaptability and modularity, robustness, and efficiency. Our intention is to demonstrate the portability of the ad-hoc, on-the-fly SQL queries that occur in current clinical trial functionality, and to evaluate whether they can be replicated in a big data SQL-like back-end system with relative ease and transparency.
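To make the idea of benchmarking an ad-hoc SQL query concrete, the following is a minimal sketch and not the thesis's actual benchmark. It assumes an in-memory SQLite database as a stand-in for a SQL-on-Hadoop engine, a hypothetical `dm` table loosely modeled on the SDTM demographics domain, and four made-up rows that are not NIDA data. The helper `run_timed` is likewise hypothetical.

```python
import sqlite3
import time


def run_timed(conn, sql, repeats=5):
    """Run `sql` `repeats` times; return (rows, best elapsed seconds)."""
    best = float("inf")
    rows = []
    for _ in range(repeats):
        start = time.perf_counter()
        rows = conn.execute(sql).fetchall()
        best = min(best, time.perf_counter() - start)
    return rows, best


# Assumption: SQLite stands in for the SQL-like big data engine.
conn = sqlite3.connect(":memory:")

# Hypothetical table loosely modeled on the SDTM demographics (DM) domain.
conn.execute(
    "CREATE TABLE dm (usubjid TEXT PRIMARY KEY, sex TEXT, age INTEGER, arm TEXT)"
)

# Made-up illustrative rows, not real clinical trial data.
conn.executemany(
    "INSERT INTO dm VALUES (?, ?, ?, ?)",
    [
        ("S-001", "M", 34, "PLACEBO"),
        ("S-002", "F", 41, "TREATMENT"),
        ("S-003", "F", 29, "TREATMENT"),
        ("S-004", "M", 52, "PLACEBO"),
    ],
)

# An ad-hoc aggregate query of the kind a clinical trial analyst might issue.
query = (
    "SELECT arm, COUNT(*) AS n, AVG(age) AS mean_age "
    "FROM dm GROUP BY arm ORDER BY arm"
)

rows, seconds = run_timed(conn, query)
for arm, n, mean_age in rows:
    print(arm, n, mean_age)
print(f"best of 5 runs: {seconds:.6f} s")
```

In a portability study along these lines, the same query text would presumably be submitted unchanged to each engine under test, so that differences in timing and in any required query rewrites can be compared.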

Bibliographic record

  • Author

    Grover, Akshay.

  • Author affiliation

    University of Maryland, Baltimore County.

  • Degree-granting institution: University of Maryland, Baltimore County.
  • Subject: Computer science; Information science.
  • Degree: M.S.
  • Year: 2015
  • Pages: 62 p.
  • Total pages: 62
  • Original format: PDF
  • Language of text: English (eng)

  • Date added to database: 2022-08-17 11:52:26
