Conference: IEEE International Symposium on Computer-Based Medical Systems

Discovering Data Source Stability Patterns in Biomedical Repositories Based on Simplicial Projections from Probability Distribution Distances

Abstract

The degree of homogeneity of statistical distributions among data sources is a critical issue when reusing data from Integrated Data Repositories (IDRs). Evaluating this data source stability is essential to reusing data with confidence. This work tackles the task of discovering and classifying patterns among the statistical distributions of multiple sources in IDRs, using a novel approach based on simplicial projections from probability distribution distances, combined with density-based spatial clustering of applications with noise (DBSCAN). Results on 20 evaluated public repositories support the existence of four main data source stability patterns in biomedical repositories: the global stability pattern (GSP), the local stability pattern (LSP), the sparse stability pattern (SSP), and the instability pattern (IP).
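
The pipeline named in the abstract (distribution distances, a simplicial projection, then DBSCAN) can be illustrated with a rough sketch. The abstract does not say which distance, projection method, or DBSCAN parameters the authors use, so everything below is an assumption: the Jensen-Shannon distance stands in for the probability distribution distance, classical multidimensional scaling into n-1 dimensions stands in for the simplicial projection, and `eps=0.1` and `min_samples=2` are arbitrary illustrative values. The six histograms are invented toy data, and the small `dbscan` function is a hand-written minimal version, not the authors' implementation.

```python
import numpy as np

def js_distance(p, q):
    """Jensen-Shannon distance (base 2) between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return np.sqrt(max(0.5 * kl(p, m) + 0.5 * kl(q, m), 0.0))

def classical_mds(D, dims):
    """Place points so that their Euclidean distances approximate D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dims]        # largest eigenvalues first
    vals = np.clip(vals[order], 0.0, None)       # drop tiny negative noise
    return vecs[:, order] * np.sqrt(vals)

def dbscan(X, eps, min_samples):
    """Minimal DBSCAN; returns one label per point, -1 meaning noise."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    neighbors = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    core = [len(nb) >= min_samples for nb in neighbors]
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:                             # grow cluster from core points
            j = stack.pop()
            if not core[j]:
                continue                         # border points do not expand
            for k in neighbors[j]:
                if labels[k] == -1:
                    labels[k] = cluster
                    stack.append(k)
        cluster += 1
    return labels

# Hypothetical toy data: one histogram (4 bins) per data source.
sources = np.array([
    [40, 30, 20, 10],
    [38, 32, 20, 10],
    [41, 29, 19, 11],
    [10, 20, 30, 40],
    [11, 19, 31, 39],
    [25, 25, 25, 25],
])

n = len(sources)
D = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        D[i, j] = D[j, i] = js_distance(sources[i], sources[j])

X = classical_mds(D, dims=n - 1)                 # n sources fit in n-1 dimensions
labels = dbscan(X, eps=0.1, min_samples=2)       # assumed parameter values
print(labels)
```

In this toy setup, sources that share a cluster label have similar distributions, and a source labeled -1 sits apart from every group. How such cluster structures map onto the four patterns (GSP, LSP, SSP, IP) is defined in the paper itself and is not reproduced here.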