Method and apparatus for classification of data by aggregating emerging patterns

Abstract

Emerging patterns (EPs) are itemsets having supports that change significantly from one dataset to another. A classifier, CAEP, is disclosed using the following main ideas based on EPs: (i) Each EP can sharply differentiate the class membership of a (possibly small) fraction of instances containing the EP, due to the big difference between the EP's supports in the opposing classes; the differentiating power of the EP is defined in terms of the EP's supports and ratio, on instances containing the EP. (ii) For each instance t, by aggregating (124) the differentiating power of a fixed, automatically selected set of EPs, a score is obtained for each class (126). The scores for all classes are normalized (144) and the largest score determines t's class (146). CAEP is suitable for many applications, even those with large volumes of high dimensional data. CAEP does not depend on dimension reduction on data and is usually equally accurate on all classes even if their populations are unbalanced.
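The scoring procedure in (ii) can be illustrated with a minimal sketch. The abstract says only that an EP's differentiating power depends on its supports and their ratio, so the exact formula below is an assumption: each EP contributes growth_rate / (growth_rate + 1) × support to its class, and each class score is normalized by dividing it by a per-class base score. The names `EmergingPattern`, `strength`, `classify`, and `base_scores` are hypothetical, the EPs and base scores are made-up illustration values, and mining the EPs is out of scope here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmergingPattern:
    items: frozenset      # the itemset that makes up the pattern
    target_class: str     # class in which the pattern's support is higher
    support: float        # support in the target class, between 0 and 1
    growth_rate: float    # target-class support / opposing-class support


def strength(ep):
    """Differentiating power of one EP (assumed formula, see lead-in)."""
    if ep.growth_rate == float("inf"):
        # The pattern never occurs in the opposing class, so the ratio term is 1.
        return ep.support
    return ep.growth_rate / (ep.growth_rate + 1.0) * ep.support


def classify(instance, patterns, base_scores):
    """Aggregate EP strengths per class, normalize, and pick the largest."""
    raw = {cls: 0.0 for cls in base_scores}
    for ep in patterns:
        if ep.items <= instance:          # the instance contains the EP
            raw[ep.target_class] += strength(ep)
    # Assumed normalization: divide each raw score by that class's base score.
    normalized = {cls: raw[cls] / base_scores[cls] for cls in raw}
    label = max(normalized, key=normalized.get)
    return label, normalized


# Made-up EPs and base scores, for illustration only.
patterns = [
    EmergingPattern(frozenset({"a", "b"}), "pos", support=0.4, growth_rate=4.0),
    EmergingPattern(frozenset({"c"}), "pos", support=0.2, growth_rate=float("inf")),
    EmergingPattern(frozenset({"a"}), "neg", support=0.5, growth_rate=3.0),
]
base_scores = {"pos": 0.5, "neg": 0.25}

label, scores = classify({"a", "b", "c"}, patterns, base_scores)
print(label, scores)
```

In this toy input the raw score for `pos` is higher than for `neg`, but `neg` wins after normalization because its base score is smaller. That is the kind of correction the normalization step is meant to provide when classes differ in how many EPs they have or in population size.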
