Concept acquisition and improved in-database similarity analysis for medical data

Abstract

Efficient identification of cohorts of similar patients is a major precondition for personalized medicine. In order to train prediction models on a given medical data set, similarities have to be calculated for every pair of patients, which results in a roughly quadratic data blowup. In this paper we discuss the topic of in-database patient similarity analysis, ranging from data extraction to implementing and optimizing the similarity calculations in SQL. In particular, we introduce the notion of chunking, which uniformly distributes the workload among the individual similarity calculations. Our benchmark comprises the application of one similarity measure (cosine similarity) and one distance metric (Euclidean distance) on two real-world data sets; it compares the performance of a column store (MonetDB) and a row store (PostgreSQL) with two external data mining tools (ELKI and Apache Mahout).
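
To make the pairwise setting concrete, the following is a minimal sketch of an in-database similarity calculation, not the authors' implementation. It assumes a hypothetical table `patient_features` in a long layout (one row per patient and feature), toy values invented for illustration, and dense data in which every patient has a value for every feature. SQLite stands in for the database systems named in the abstract purely so that the example is self-contained; the chunking technique from the paper is not reproduced here.

```python
import math
import sqlite3

# Hypothetical toy data: (patient_id, feature, value), one row per patient and feature.
rows = [
    (1, "f1", 3.0), (1, "f2", 4.0),
    (2, "f1", 3.0), (2, "f2", 4.0),
    (3, "f1", 4.0), (3, "f2", -3.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patient_features (patient_id INTEGER, feature TEXT, value REAL)"
)
conn.executemany("INSERT INTO patient_features VALUES (?, ?, ?)", rows)

# Self-join on the feature name. The condition a.patient_id < b.patient_id keeps
# each unordered pair exactly once, so n patients yield n * (n - 1) / 2 result rows.
query = """
SELECT a.patient_id,
       b.patient_id,
       SUM(a.value * b.value)                     AS dot,
       SUM(a.value * a.value)                     AS norm_a_sq,
       SUM(b.value * b.value)                     AS norm_b_sq,
       SUM((a.value - b.value) * (a.value - b.value)) AS dist_sq
FROM patient_features AS a
JOIN patient_features AS b
  ON a.feature = b.feature AND a.patient_id < b.patient_id
GROUP BY a.patient_id, b.patient_id
ORDER BY a.patient_id, b.patient_id
"""

# The sums are computed inside the database; only the final square roots are
# taken in Python, because SQLite's sqrt() is not available in every build.
results = {}
for p, q, dot, norm_a_sq, norm_b_sq, dist_sq in conn.execute(query):
    cosine = dot / math.sqrt(norm_a_sq * norm_b_sq)  # assumes non-zero vectors
    euclidean = math.sqrt(dist_sq)
    results[(p, q)] = (cosine, euclidean)

for pair, (cosine, euclidean) in results.items():
    print(pair, round(cosine, 4), round(euclidean, 4))
```

The self-join is where the quadratic growth described above shows up: the number of result rows grows with the number of patient pairs, not with the number of patients. With sparse data the join would silently skip features missing for one of the two patients, so the norms would have to be computed separately per patient.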

Bibliographic details

  • Source
    Distributed and Parallel Databases | 2019, Issue 2 | pp. 297-321 | 25 pages
  • Author affiliations

    Univ Goettingen, Inst Comp Sci, Goldschmidtstr 7, D-37077 Gottingen, Germany;

    Univ Goettingen, Inst Comp Sci, Goldschmidtstr 7, D-37077 Gottingen, Germany;

    Univ Goettingen, Inst Comp Sci, Goldschmidtstr 7, D-37077 Gottingen, Germany;

    Univ Goettingen, Inst Comp Sci, Goldschmidtstr 7, D-37077 Gottingen, Germany|King Abdulaziz Univ, Fac Comp & Informat Technol, Jeddah 21589, Saudi Arabia;

    Univ Goettingen, Univ Med Ctr Goettingen, Dept Med Informat, Von Siebold Str 3, D-37075 Gottingen, Germany;

  • Original format: PDF
  • Language: English
  • Keywords

    Patient similarity; Row store; Column store; Cosine similarity; Euclidean distance;
