Exploitation of unlabeled data and related tasks in semi-supervised learning.

Abstract

Supervised learning has proven an effective technique for learning a classifier when there is enough labeled data. Unfortunately, in many applications a generous supply of labeled data is not available because labeling each datum is costly. Supervised algorithms are known to generalize poorly when the number of labeled data is limited. There has been much recent work on semi-supervised learning and multitask learning; both try to improve the generalization of classifiers by using information sources beyond the labeled data.

In this thesis, we design two semi-supervised algorithms, termed parameterized neighborhood-based classification (PNBC) and label iteration, that efficiently exploit the data manifold information provided by both the labeled and the unlabeled data to improve generalization. PNBC represents the probability of the label at a given data point by mixing over all data points in a neighborhood, which is formed via a Markov random walk over the entire data manifold. Label iteration is a very simple algorithm that has a closed-form solution in the limit. Experimental results demonstrate the effectiveness of both algorithms. Based on PNBC, we further propose an efficient active learning procedure for the unexploded ordnance (UXO) detection problem, employing the mutual-information criterion.

With PNBC as a building block, we make the first attempt to integrate the benefits offered by both semi-supervised learning and multitask learning (MTL), by proposing semi-supervised multitask learning. In the semi-supervised MTL setting, we have M partially labeled data manifolds, each defining a classification task and involving the design of a PNBC classifier. The M PNBC classifiers are designed simultaneously within a unified sharing structure. The superior performance of semi-supervised MTL on real sensing applications demonstrates that both manifold information and the information from related tasks can play positive and complementary roles, suggesting that one can find significant benefits in practice by performing semi-supervised MTL.
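The neighborhood mixing described for PNBC can be illustrated with a small sketch. This is not the thesis's implementation: the Gaussian-kernel graph, the logistic base classifier, the fixed walk length `t`, the kernel width `sigma`, and the function names `random_walk_matrix` and `neighborhood_posterior` are all assumptions chosen for illustration, and `theta` is set by hand rather than learned from labels.

```python
import numpy as np

def random_walk_matrix(X, sigma=1.0, t=2):
    """t-step transition matrix of a Markov random walk on a
    Gaussian-kernel graph over all points (assumed graph construction)."""
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))   # pairwise affinities
    P = W / W.sum(axis=1, keepdims=True)         # one-step transition probabilities
    return np.linalg.matrix_power(P, t)          # t-step transition probabilities

def neighborhood_posterior(X, theta, sigma=1.0, t=2):
    """P(y=1 | x_i) as a random-walk-weighted mixture of a base
    classifier's outputs over all points (logistic base is an assumption)."""
    A = random_walk_matrix(X, sigma, t)
    base = 1.0 / (1.0 + np.exp(-(X @ theta)))    # per-point logistic output
    return A @ base                              # mix over each point's neighborhood

# Toy data: two well-separated clusters; labeled and unlabeled points
# would both contribute rows to X in a semi-supervised setting.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (5, 2)), rng.normal(2, 0.5, (5, 2))])
theta = np.array([1.0, 1.0])                     # hand-picked, not learned
A = random_walk_matrix(X)
probs = neighborhood_posterior(X, theta)
```

Because each row of the transition matrix sums to one, every mixed output stays a valid probability. In a full method, `theta` would be fit by maximizing the likelihood of the labeled points under these mixed probabilities.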


Bibliographic record

Author: Liu, Qiuhua
Author affiliation: Duke University
Degree-granting institution: Duke University
Subject: Engineering, Electronics and Electrical
Degree: Ph.D.
Year: 2007
Pages: 98 p.
Total pages: 98
Original format: PDF
Language: English (eng)

