Journal: Pattern Recognition: The Journal of the Pattern Recognition Society

LODE: A distance-based classifier built on ensembles of positive and negative observations


Abstract

Current work on assembling a set of local patterns, such as rules and class association rules, into a global model for predicting a target usually focuses on identifying the minimal set of patterns that covers the training data. In this paper we present a different point of view: the model of a class is built to emphasize the typical features of the examples of that class. Typical features are modeled by frequent itemsets extracted from the examples, and they constitute a new representation space for the examples of the class. The target class of a test example is predicted by computing the distance between the vector that represents the example in the space of the itemsets of each class and the vector that represents that class. Interestingly, in the distance computation the critical contribution to the discrimination between classes comes not only from the itemsets of the class model that match the example but also from the itemsets that do not match it. First, these absent features carry information about the example that can be used for the prediction and should not be disregarded. Second, absent features are more abundant in the wrong classes than in the correct ones, and their number increases the distance between the example vector and the negative class vectors. Furthermore, since absent features are frequent features in their respective classes, they make the prediction more robust against over-fitting and noise. Using features that are absent from the test example is a novel aspect of classification: existing learners usually select the best local pattern that matches the example and do not consider the abundance of other patterns that do not match it.
We demonstrate the validity of our observations and the effectiveness of LODE, our learner, through extensive empirical experiments in which we compare the prediction accuracy of LODE with that of a consistent set of state-of-the-art classifiers. We also report the methodology that we adopted to determine automatically the configuration of the learner and its parameters.
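
The idea described in the abstract can be illustrated with a small sketch. This is not the authors' LODE implementation: the brute-force itemset mining, the choice of the itemset supports as the class vector, the root-mean-square normalization, and all function names (`frequent_itemsets`, `fit`, `distance`, `predict`) are assumptions made only for illustration. It shows how an itemset that is frequent in a class but absent from the example pushes the example away from that class.

```python
from itertools import combinations
from math import sqrt


def frequent_itemsets(examples, min_support, max_size=2):
    """Return {itemset: support} for itemsets with support >= min_support.

    Brute-force enumeration (assumption: toy data only, itemsets up to max_size).
    """
    items = sorted(set().union(*examples))
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Support = fraction of class examples that contain the itemset.
            support = sum(candidate <= ex for ex in examples) / len(examples)
            if support >= min_support:
                result[candidate] = support
    return result


def fit(training, min_support=0.6):
    """Build one itemset model per class: {label: {itemset: support}}."""
    return {
        label: frequent_itemsets(examples, min_support)
        for label, examples in training.items()
    }


def distance(example, class_model):
    """RMS distance between the example's match vector and the class vector.

    Assumption: the class vector holds the itemset supports, and the example
    vector holds 1 for a matching itemset and 0 for an absent one. An absent
    but frequent itemset therefore contributes support**2 to the distance.
    """
    if not class_model:
        return float("inf")
    total = 0.0
    for itemset, support in class_model.items():
        present = 1.0 if itemset <= example else 0.0
        total += (present - support) ** 2
    return sqrt(total / len(class_model))


def predict(example, models):
    """Return the label of the class whose vector is closest to the example."""
    return min(models, key=lambda label: distance(example, models[label]))


# Toy data (hypothetical): each example is a set of items.
training = {
    "spam": [{"offer", "free", "click"}, {"free", "click"}, {"offer", "free"}],
    "ham": [{"meeting", "report"}, {"meeting", "lunch"},
            {"report", "meeting", "lunch"}],
}
models = fit(training, min_support=0.6)
print(predict({"free", "click"}, models))
print(predict({"meeting"}, models))
```

In this sketch every itemset of the wrong class is absent from the test example, so each one adds to the distance, while the correct class loses distance on the itemsets that match. The paper's actual vector definitions and parameter selection may differ.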
