Journal: International Journal of Applied Engineering Research

WDset: A Semantic Distance Measure for Imbalanced Datasets

Abstract

The concept of similarity/dissimilarity can be used to solve classification problems in imbalanced relational datasets, where each record is a set of values corresponding to different attributes. Each attribute in such a dataset carries its own semantics and therefore has a different effect on determining the category of a record. The distance measures commonly used to calculate the similarity of records compromise the semantics of the attributes rather than preserving them. In this paper, we propose a semantic distance measure named WDset, which is a set of weighted distance components computed from two records. Each distance component is calculated individually and assigned a weight based on its relevance. The individual distance components preserve the semantics, and the weights handle the imbalanced nature of the dataset. Experiments show that the proposed measure identifies similar records more appropriately than existing measures.
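The abstract does not give the WDset formulas, so the following is only a minimal sketch of the general idea of keeping one weighted distance component per attribute instead of collapsing them into a single number. The function names `component_distance` and `weighted_distance_set`, the range-scaled numeric difference, the 0/1 mismatch for categorical values, and the caller-supplied weights are all assumptions made for illustration, not the authors' definitions.

```python
from typing import Dict, Optional, Union

Value = Union[int, float, str]
Record = Dict[str, Value]


def component_distance(a: Value, b: Value, value_range: float = 1.0) -> float:
    """Distance for a single attribute (assumed form, not from the paper).

    Numeric values use the absolute difference scaled by value_range;
    anything else uses a 0/1 mismatch.
    """
    numeric = (int, float)
    if isinstance(a, numeric) and isinstance(b, numeric):
        return abs(a - b) / value_range
    return 0.0 if a == b else 1.0


def weighted_distance_set(
    r1: Record,
    r2: Record,
    weights: Dict[str, float],
    ranges: Optional[Dict[str, float]] = None,
) -> Dict[str, float]:
    """Return one weighted distance component per attribute.

    The result stays a mapping keyed by attribute name, so each
    component remains tied to the attribute it came from.
    Weights are hypothetical inputs chosen by the caller.
    """
    ranges = ranges or {}
    return {
        attr: weights[attr]
        * component_distance(r1[attr], r2[attr], ranges.get(attr, 1.0))
        for attr in weights
    }


# Hypothetical records and weights, for illustration only.
r1 = {"age": 30, "city": "Pune"}
r2 = {"age": 40, "city": "Delhi"}
d = weighted_distance_set(
    r1, r2, weights={"age": 0.5, "city": 2.0}, ranges={"age": 50.0}
)
```

Returning a mapping rather than a sum is the design choice that mirrors the "set of weighted distance components" wording; how the paper derives the weights from attribute relevance or class imbalance is not stated in the abstract and is left to the caller here.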