Annual International Conference of the IEEE Engineering in Medicine and Biology Society

Comparative study of probability distribution distances to define a metric for the stability of multi-source biomedical research data

Abstract

Research biobanks are often composed of data from multiple sources. In some cases, these subsets of data may differ in their probability density functions (PDFs) because of spatial shifts. This may lead to wrong hypotheses when the data are treated as a whole, and it diminishes the overall quality of the data. To develop a generic and comparable metric for assessing the stability of multi-source datasets, we studied the applicability and behaviour of several PDF distances under shifts in different conditions that may appear in real biomedical data (such as uni- and multivariate data, different types of variable, and multi-modality). Of the distances studied, we found the information-theoretic distances and the Earth Mover's Distance to be the most practical for most conditions. We discuss the properties and usefulness of each distance according to the possible requirements of a general stability metric.
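
As an illustration of the kind of comparison the abstract describes, the sketch below tracks how two PDF distances respond when one source is a shifted copy of another. It is a minimal example under stated assumptions, not the authors' method: the Jensen-Shannon distance is chosen here as one representative information-theoretic distance (the abstract does not say which ones were studied), both sources are modelled as univariate Gaussians discretised on a common grid, and every function name and parameter is hypothetical.

```python
import math


def gaussian_hist(mean, std, grid):
    """Discretise a Gaussian PDF on `grid` and normalise it to sum to 1."""
    weights = [math.exp(-0.5 * ((x - mean) / std) ** 2) for x in grid]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; bins with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon_distance(p, q):
    """Square root of the base-2 Jensen-Shannon divergence, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    return math.sqrt(max(jsd, 0.0))  # clamp tiny negative rounding errors


def earth_movers_distance(p, q, bin_width):
    """1-D Earth Mover's Distance: summed absolute gap between the two CDFs."""
    cdf_gap, total = 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        total += abs(cdf_gap) * bin_width
    return total


def shift_profile(shifts, std=1.0, lo=-10.0, hi=30.0, n_bins=2000):
    """Return (shift, JS distance, EMD) for a reference N(0, std) source
    compared against copies of itself shifted by each value in `shifts`.
    The grid bounds are an assumption sized for shifts up to about 20."""
    width = (hi - lo) / n_bins
    grid = [lo + (i + 0.5) * width for i in range(n_bins)]
    reference = gaussian_hist(0.0, std, grid)
    rows = []
    for s in shifts:
        shifted = gaussian_hist(s, std, grid)
        rows.append((s,
                     jensen_shannon_distance(reference, shifted),
                     earth_movers_distance(reference, shifted, width)))
    return rows


if __name__ == "__main__":
    for s, js, emd in shift_profile([0.0, 0.5, 1.0, 2.0, 5.0, 10.0, 20.0]):
        print(f"shift={s:5.1f}  JS distance={js:.4f}  EMD={emd:.4f}")
```

In this setup the Jensen-Shannon distance is bounded and saturates once the two sources stop overlapping, whereas the Earth Mover's Distance keeps growing with the size of the shift. That is one property a general stability metric might weigh, depending on whether a bounded, comparable score or sensitivity to the magnitude of the shift matters more.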
