
Machine learning with incomplete information.



Abstract

Machine learning algorithms detect patterns, regularities, and rules in training data and adjust program actions accordingly. For example, when a learner (a computer program) sees a set of patient cases (patient records) with corresponding diagnoses, it can predict the presence of a disease in future patients. A somewhat unrealistic assumption in typical machine learning applications is that data is freely available. In my dissertation, I will present our research efforts to relax this assumption in the areas of active machine learning and budgeted machine learning.

In active machine learning, where the labels of instances have to be purchased, it is often assumed that a perfect labeler labels the chosen instances. However, the labeler may be imperfect, or there may be multiple noisy labelers with different known costs and different unknown accuracies, as on Amazon Mechanical Turk. I will present our algorithms and experimental results for active learning from multiple noisy labelers with varied costs; the algorithms are based on ranking the labelers according to their estimated accuracies and costs. The experimental results show that our algorithms outperform those in the literature.

In budgeted machine learning, where the class label of every instance is known but the feature values of the instances have to be purchased at a cost, subject to an overall budget, the challenge for the learner is to decide which attributes of which instances to purchase in order to learn the best model. I will present our budgeted learning algorithms for naive Bayes. Most of our algorithms perform well compared to existing algorithms in the literature. I will also present our algorithms for budgeted learning of Bayesian networks, which generalize naive Bayes. Experimental results show that some of our algorithms outperform those in the literature.
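
The abstract mentions ranking labelers by their estimated accuracies and costs but does not give the ranking rule. The sketch below is one possible illustration and is not the dissertation's algorithm. It assumes that accuracy is estimated from how often a labeler agreed with a majority vote, and that labelers are scored by estimated accuracy above chance per unit cost. The `Labeler` class, its fields, and the scoring function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Labeler:
    """Hypothetical record for one noisy labeler (not from the dissertation)."""
    name: str
    cost: float          # known price per label
    agreements: int = 0  # times this labeler matched the majority vote
    queries: int = 0     # times this labeler was queried

    def estimated_accuracy(self) -> float:
        # Laplace-smoothed agreement rate; equals 0.5 before any query.
        return (self.agreements + 1) / (self.queries + 2)


def rank_labelers(labelers):
    """Sort labelers by an assumed score: accuracy above chance per unit cost."""
    def score(labeler):
        return (labeler.estimated_accuracy() - 0.5) / labeler.cost
    return sorted(labelers, key=score, reverse=True)


# Illustrative numbers only; they are not experimental data.
labelers = [
    Labeler("A", cost=1.0, agreements=9, queries=10),
    Labeler("B", cost=0.2, agreements=7, queries=10),
    Labeler("C", cost=0.5, agreements=5, queries=10),
]
ranking = [labeler.name for labeler in rank_labelers(labelers)]
```

With these made-up numbers, the cheap and moderately accurate labeler B ranks ahead of the accurate but expensive labeler A, which shows why cost and accuracy have to be weighed together rather than separately.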

Bibliographic record

  • Author

    Zheng, Yaling

  • Author affiliation

    The University of Nebraska - Lincoln

  • Degree-granting institution: The University of Nebraska - Lincoln
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2011
  • Pagination: 215 p.
  • Total pages: 215
  • Original format: PDF
  • Language of text: English
