Extended Bayes and skewing: On two improvements to standard induction-based learning algorithms.

Abstract

We address improvements to Naive Bayes (NB) and Decision Trees, two standard induction-based methods for solving classification problems. The goal of these improvements is to extract more information from the training examples, in order to more accurately classify new examples.

The first part of this thesis presents a new learning algorithm, Extended Bayes (EB), which is an extension of NB. NB classifies new examples using conditional probabilities computed from the training data. It is simple, fast, and widely applicable. EB retains these positive properties of NB, while equaling or surpassing the predictive power of NB as measured on a wide variety of benchmark UC-Irvine datasets. EB is based on two ideas, which interact. The first is to find sets of seemingly dependent attributes and to add them as new attributes. The second is to exploit "zeroes", i.e., the negative evidence provided by attribute values that do not occur in particular classes in the training data. Zeroes are handled in Naive Bayes by smoothing (substituting a small positive value). In contrast, EB uses them as evidence that a potential class labeling may be wrong.

The second part of the thesis presents a theoretical analysis of skewing, a recent technique for improving the performance of standard decision tree algorithms [42]. Decision tree algorithms use the training data to build a decision tree that computes a function mapping examples to class labels. Standard decision tree algorithms perform poorly in learning certain "difficult" functions, such as parity, when irrelevant attributes are present, because of an inability to distinguish between relevant and irrelevant attributes. While experimental evidence indicates that skewing can remedy this problem, prior to the work in this thesis, there was almost no analysis of when and why skewing worked. We prove that, in an idealized setting, skewing can always identify relevant attributes. We also present an analysis of a variant of skewing called sequential skewing, and prove results concerning properties of the class of "difficult" functions.
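As an illustration of the zero-handling contrast described above, here is a minimal Python sketch. The function names, the smoothing constant alpha, and the interfaces are assumptions made for illustration, not the thesis's implementation: standard NB smooths a zero count into a small positive probability, while an EB-style check counts zeroes as negative evidence against a candidate class.

```python
from collections import Counter, defaultdict

def train(examples, labels):
    # Count class frequencies and, per (class, attribute index), value frequencies.
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)
    for x, y in zip(examples, labels):
        for i, v in enumerate(x):
            value_counts[(y, i)][v] += 1
    return class_counts, value_counts

def nb_score(x, c, class_counts, value_counts, alpha=1.0):
    # Standard NB with Laplace smoothing: a zero count becomes a small
    # positive probability instead of vetoing the class outright.
    n = sum(class_counts.values())
    score = class_counts[c] / n
    for i, v in enumerate(x):
        counts = value_counts[(c, i)]
        score *= (counts[v] + alpha) / (sum(counts.values()) + alpha * (len(counts) + 1))
    return score

def eb_zero_evidence(x, c, value_counts):
    # EB-style use of zeroes: the number of attribute values in x that never
    # co-occurred with class c in training -- negative evidence that labeling
    # x with class c may be wrong.
    return sum(1 for i, v in enumerate(x) if value_counts[(c, i)][v] == 0)

# Tiny demo: "hot" never appears with class "yes" in training, so the smoothed
# NB score stays positive while the EB-style check flags one zero.
X = [("sunny", "hot"), ("rainy", "cool"), ("sunny", "cool")]
y = ["no", "no", "yes"]
cc, vc = train(X, y)
print(nb_score(("sunny", "hot"), "yes", cc, vc))      # small positive value
print(eb_zero_evidence(("sunny", "hot"), "yes", vc))  # 1
```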
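The skewing part can likewise be made concrete. Under the uniform distribution, every attribute of a parity target has zero information gain, so relevant and irrelevant attributes are indistinguishable to a standard gain-based splitter. The sketch below follows the general idea of skewing rather than the thesis's exact procedure; the weighting scheme and the constant p are assumptions. It reweights each example toward randomly chosen favored attribute values and computes gain under those weights.

```python
import random
from math import log2

def skew_weights(examples, favored, p=0.75):
    # Weight each example by how well it matches the randomly chosen favored
    # values: each matching attribute contributes p, each mismatch 1 - p.
    weights = []
    for x in examples:
        w = 1.0
        for v, f in zip(x, favored):
            w *= p if v == f else 1.0 - p
        weights.append(w)
    return weights

def weighted_gain(examples, labels, weights, attr):
    # Information gain of splitting on `attr`, with each example counted
    # according to its skew weight rather than uniformly.
    def entropy(idx):
        total = sum(weights[i] for i in idx)
        if total == 0.0:
            return 0.0
        h = 0.0
        for y in set(labels[i] for i in idx):
            p_y = sum(weights[i] for i in idx if labels[i] == y) / total
            h -= p_y * log2(p_y)
        return h

    all_idx = list(range(len(examples)))
    gain = entropy(all_idx)
    total = sum(weights)
    for v in set(x[attr] for x in examples):
        idx = [i for i in all_idx if examples[i][attr] == v]
        gain -= (sum(weights[i] for i in idx) / total) * entropy(idx)
    return gain

# Tiny demo on 2-bit parity with one irrelevant attribute: under uniform
# weights every attribute has zero gain; under a random skew the relevant
# attributes (0 and 1) show nonzero gain while attribute 2 stays at zero.
examples = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
labels = [a ^ b for a, b, c in examples]
favored = [random.choice((0, 1)) for _ in range(3)]
weights = skew_weights(examples, favored)
print([round(weighted_gain(examples, labels, weights, j), 3) for j in range(3)])
```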