
Attribute scoring based on performance of a learning algorithm on samples of attribute space



Abstract

In machine learning and knowledge discovery in databases, attributes (features) play a central role, so it is reasonable to question their quality and importance for a given problem. Because this is a difficult problem in general, the thesis focuses on developing a new method for estimating attribute importance.

The new method is based on sampling the attribute space, evaluating the performance of machine learning algorithms, and reasoning about the importance of individual attributes from the obtained scores. More specifically, different combinations of attributes are first chosen and smaller data sets containing them are prepared; on each of these, a testing procedure with sampling estimates the performance of an arbitrarily chosen learning algorithm. The performance estimates obtained this way are statistically processed for each attribute according to its presence and combined by a given formula into final scores for the individual attributes.

To determine how well different variants of the new method work, an appropriate experimental methodology and many diverse data sets were prepared. Some successful methods were also tested in more detail to reinforce the conclusion that certain variants of the new method are statistically significantly better than conventional, widely used methods for this problem; unfortunately, an improved version of the best of them still appears to be better. The thesis concludes with a discussion of the results and various ideas for further work, improvements and applications of the method.
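The procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not name the learning algorithm, the testing procedure, or the combining formula, so everything concrete here is an assumption made for illustration: a 1-nearest-neighbour learner, a random two-thirds/one-third holdout split, and a score defined as the mean accuracy over sampled subsets that contain an attribute minus the mean accuracy over those that do not. The function names `one_nn_accuracy`, `score_attributes` and `make_toy_data` are hypothetical and are not taken from the thesis.

```python
import random
from statistics import mean


def one_nn_accuracy(train, test, attrs):
    """Holdout accuracy of a 1-nearest-neighbour learner that sees only `attrs`."""
    def dist(a, b):
        return sum((a[i] - b[i]) ** 2 for i in attrs)

    correct = 0
    for x, y in test:
        # Label of the closest training example in the restricted attribute space.
        _, pred = min(((dist(x, tx), ty) for tx, ty in train), key=lambda p: p[0])
        correct += pred == y
    return correct / len(test)


def score_attributes(data, n_attrs, n_subsets=200, subset_size=2, seed=0):
    """Score attributes from learner performance on sampled attribute subsets."""
    rng = random.Random(seed)
    present = [[] for _ in range(n_attrs)]  # accuracies of subsets containing a
    absent = [[] for _ in range(n_attrs)]   # accuracies of subsets without a
    for _ in range(n_subsets):
        attrs = rng.sample(range(n_attrs), subset_size)
        rows = data[:]
        rng.shuffle(rows)                   # assumed testing procedure: random holdout
        cut = len(rows) * 2 // 3
        acc = one_nn_accuracy(rows[:cut], rows[cut:], attrs)
        for a in range(n_attrs):
            (present if a in attrs else absent)[a].append(acc)
    # Assumed combining formula: difference of mean accuracies.
    # An attribute that was never (or always) sampled gets a neutral 0.0.
    return [
        mean(present[a]) - mean(absent[a]) if present[a] and absent[a] else 0.0
        for a in range(n_attrs)
    ]


def make_toy_data(n=120, seed=1):
    """Synthetic data: attribute 0 carries the class signal, attributes 1-3 are noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [2 * label + rng.gauss(0, 0.3)] + [rng.gauss(0, 1) for _ in range(3)]
        data.append((x, label))
    return data


scores = score_attributes(make_toy_data(), n_attrs=4)
print(scores)
```

On this toy data the informative attribute should receive the highest score, because subsets that contain it let the learner do much better than chance. Other choices of learner, resampling scheme, or formula would fit the abstract's description equally well.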

Bibliographic record

  • Author: Weiss Gregor
  • Year: 2011
  • Original format: PDF
  • Language of text: Slovene
