Pacific-Asia Conference on Knowledge Discovery and Data Mining

Mining Diversified Shared Decision Tree Sets for Discovering Cross Domain Similarities


Abstract

This paper studies the problem of mining diversified sets of shared decision trees (SDTs). Given two datasets representing two application domains, an SDT is a decision tree that can classify both datasets and that captures class-based population-structure similarity between them. Previous studies considered mining just one SDT. The present paper considers mining a small diversified set of SDTs with two properties: (1) each SDT in the set has high quality with regard to "shared" accuracy and population-structure similarity, and (2) the SDTs in the set differ substantially from one another. A diversified set of SDTs can serve as a concise representative of the huge space of possible cross-domain similarities, giving users an effective way to examine and select informative SDTs from that space. The diversity of an SDT set is measured by the difference in attribute usage among the SDTs. The paper provides algorithms for mining diversified sets of SDTs, and experimental results show that they are effective and can find diversified sets of high-quality SDTs.
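
The abstract does not give the exact diversity formula or the mining algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes each candidate SDT is summarized by a quality score and the set of attributes it splits on, uses Jaccard distance between attribute sets as a stand-in for "difference of attribute usage", and picks a small set greedily. The names `attribute_distance`, `set_diversity`, `greedy_diverse_subset`, and the sample candidates are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def attribute_distance(attrs_a, attrs_b):
    """Jaccard distance between two attribute sets (an assumed measure)."""
    union = attrs_a | attrs_b
    if not union:
        return 0.0
    return 1.0 - len(attrs_a & attrs_b) / len(union)


def set_diversity(attr_sets):
    """Average pairwise attribute distance; 0.0 for fewer than two trees."""
    pairs = list(combinations(attr_sets, 2))
    if not pairs:
        return 0.0
    return sum(attribute_distance(a, b) for a, b in pairs) / len(pairs)


def greedy_diverse_subset(candidates, k, min_quality):
    """Pick up to k candidates that pass a quality threshold and differ
    in attribute usage.

    candidates: list of (name, quality, attribute_set) tuples.
    Starts from the highest-quality candidate, then repeatedly adds the
    candidate whose minimum distance to the chosen ones is largest.
    """
    pool = sorted(
        (c for c in candidates if c[1] >= min_quality),
        key=lambda c: c[1],
        reverse=True,
    )
    chosen = pool[:1]
    remaining = pool[1:]
    while remaining and len(chosen) < k:
        best = max(
            remaining,
            key=lambda c: min(attribute_distance(c[2], s[2]) for s in chosen),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Hypothetical candidate trees: (name, quality score, attributes used).
candidates = [
    ("t1", 0.90, {"age", "income"}),
    ("t2", 0.88, {"age", "income", "zip"}),
    ("t3", 0.85, {"height", "weight"}),
    ("t4", 0.60, {"color"}),
]

picked = greedy_diverse_subset(candidates, k=2, min_quality=0.8)
print([name for name, _, _ in picked])
print(set_diversity([attrs for _, _, attrs in picked]))
```

In this toy input, `t2` is nearly as good as `t1` but uses almost the same attributes, so the sketch prefers `t3`, which passes the quality threshold and shares no attributes with `t1`. This illustrates the two properties the abstract names: quality for every tree in the set and difference between the trees.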
