Journal: Advances in Data Analysis and Classification

Basic statistics for distributional symbolic variables: a new metric-based approach


Abstract

In data mining, a group of measurements is commonly described by summary statistics or by an empirical distribution function. Symbolic data analysis (SDA) is designed for data of this kind: it supports the description and analysis of conceptual data and of macrodata that summarize classical data. Within the conceptual framework of SDA, this paper presents new basic statistics for distribution-valued variables, i.e., variables whose realizations are distributions. The proposed measures extend classical univariate statistics (mean, variance, standard deviation) and bivariate statistics (covariance and correlation) to distribution-valued variables, taking into account the nature and the variability of such data. The new statistics are based on a distance between distributions, the Wasserstein distance. A comparison with other univariate and bivariate statistics from the literature highlights some relevant properties of the proposed ones. An application to a clinical dataset shows the main differences in the interpretation of results.
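As an illustrative sketch only (not the paper's implementation), the idea of metric-based statistics can be shown under one assumption: that the distance is the 2-Wasserstein (L2) distance between one-dimensional distributions. In that case the squared distance is the integral over (0, 1) of the squared difference between the two quantile functions, and the mean distribution under this metric has the pointwise average of the quantile functions as its quantile function. The function names below (`quantile_grid`, `wasserstein2_sq`, `wasserstein_mean`, `wasserstein_variance`) are hypothetical, and each distribution is assumed to be given as a raw sample whose quantile function is approximated on a regular grid.

```python
import numpy as np


def quantile_grid(sample, n_points=1000):
    """Approximate the quantile function of a sample on a midpoint grid of (0, 1)."""
    t = (np.arange(n_points) + 0.5) / n_points
    return np.quantile(np.asarray(sample, dtype=float), t)


def wasserstein2_sq(q_a, q_b):
    """Squared 2-Wasserstein distance between two 1-D distributions.

    Both inputs are quantile functions evaluated on the same grid, so the
    integral over (0, 1) is approximated by a plain average.
    """
    return float(np.mean((q_a - q_b) ** 2))


def wasserstein_mean(quantiles):
    """Mean distribution under the assumed metric: pointwise average of quantile functions."""
    return np.mean(quantiles, axis=0)


def wasserstein_variance(quantiles):
    """Average squared 2-Wasserstein distance of each distribution from the mean distribution."""
    mean_q = wasserstein_mean(quantiles)
    return float(np.mean([wasserstein2_sq(q, mean_q) for q in quantiles]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three hypothetical units, each described by a sample of measurements.
    samples = [rng.normal(loc=mu, scale=sd, size=500)
               for mu, sd in [(0.0, 1.0), (1.0, 1.5), (2.0, 0.5)]]
    quantiles = np.stack([quantile_grid(s) for s in samples])
    var = wasserstein_variance(quantiles)
    print("variance:", var, "standard deviation:", np.sqrt(var))
```

A useful sanity check on this sketch: two samples that differ only by a constant shift c have squared distance c squared, so the variance of such a pair is (c/2) squared. A covariance and a correlation for pairs of distribution-valued variables would need the paper's own definitions and are not attempted here.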
