Journal: Multimedia Tools and Applications

How to measure similarity for multiple categorical data sets?


Abstract

How should similarity or distance be measured for multiple categorical data sets? Measuring the similarity or distance between objects appropriately is an important step in data mining and knowledge management. Measures for continuous data are well defined and relatively easy to compute. The notion of similarity for categorical data is less simple, however: categorical data usually cannot be translated directly into a numerical format, and they have their own priorities with respect to structure and data distribution. In this paper, we propose a new measure for multiple categorical data sets that uses the data distribution. Our new measure, MCSM (Multiple Categorical Similarity Measure), successfully addresses the conventional drawbacks of measures for multiple categorical data sets; we verify the measure with mathematical proofs and experiments. The experimental results show that our measure is powerful for multiple categorical data sets with proper data distributions.
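To make the contrast in the abstract concrete, the following is a minimal sketch of why data distribution can matter when comparing categorical records. It is not MCSM, whose definition is not given here, and it is not taken from the paper. The function names, the toy records, and the weighting `1 - p(value)` are all assumptions chosen purely for illustration: a plain overlap measure treats every match alike, while the weighted variant gives more credit to a match on a rare category.

```python
from collections import Counter


def overlap_similarity(a, b):
    """Fraction of attributes on which two records share the same category."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def value_frequencies(records):
    """Empirical distribution of categories, one dict per attribute."""
    n = len(records)
    columns = zip(*records)
    return [{v: c / n for v, c in Counter(col).items()} for col in columns]


def frequency_weighted_similarity(a, b, freqs):
    """Overlap in which a match on a rare category scores higher.

    The weight 1 - p(value) is a hypothetical choice for illustration,
    not the weighting used by MCSM.
    """
    score = 0.0
    for x, y, p in zip(a, b, freqs):
        if x == y:
            score += 1.0 - p.get(x, 0.0)
    return score / len(a)


# Toy data set (hypothetical): two categorical attributes per record.
records = [
    ("red", "small"),
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
]
freqs = value_frequencies(records)

plain = overlap_similarity(records[0], records[3])
weighted = frequency_weighted_similarity(records[0], records[3], freqs)
print(plain, weighted)
```

In this toy data the first and last records agree only on the common value "small", so the weighted score is lower than the plain overlap score; any measure that uses the data distribution would make a comparable adjustment in its own way.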
