Learning Classification with Both Labeled and Unlabeled Data

Abstract

A key difficulty in applying machine learning classification algorithms to many applications is that they require a large number of hand-labeled examples. Labeling large amounts of data is a costly process that in many cases is prohibitive. In this paper we show how a small number of labeled examples, used together with a large number of unlabeled examples, can produce high-accuracy classifiers. Our approach does not rely on any parametric assumptions about the data, as is usually the case with the generative methods widely used in semi-supervised learning. We propose new discriminant algorithms that handle both labeled and unlabeled data for training classification models, and we analyze their performance on different information access problems, ranging from text span classification for text summarization to e-mail spam detection and text classification.

Bibliographic information

Source
13th European Conference on Machine Learning, Aug 19-23, 2002, Helsinki, Finland | 2002 | pp. 468-479 | 12 pages
Conference location: Helsinki (FI)
Authors
Jean-Noël Vittaut; Massih-Reza Amini; Patrick Gallinari
Author affiliation

Computer Science Laboratory of Paris 6 (LIP6), University of Pierre et Marie Curie, 8 rue du Capitaine Scott, 75015 Paris, France

Original format: PDF
Text language: English
Chinese Library Classification: Automation technology, computer technology

Similar literature

1. Learning model order from labeled and unlabeled data for partially supervised classification, with application to word sense disambiguation [J]. Zheng-Yu Niu, Dong-Hong Ji, Chew Lim Tan. Computer Speech and Language. 2007, No. 4
2. Target Tracking and Classification from Labeled and Unlabeled Data in Wireless Sensor Networks [J]. Hyoun Jin Kim, Jaehyun Yoo. Sensors. 2014, No. 12
3. Automatic Web Query Classification Using Labeled and Unlabeled Training Data [J]. Steven M. Beitzel, Eric C. Jensen, Ophir Frieder. ACM SIGIR Forum. 2005, No. Spe
4. The Use of Unlabeled Data Versus Labeled Data for Stopping Active Learning for Text Classification [C]. Garrett Beatty, Ethan Kochis, Michael Bloodgood. IEEE International Conference on Semantic Computing. 2019
5. Logic Knowledge Base Refinement Using Unlabeled or Limited Labeled Data [D]. Chan, Ki Cecia. 2010
6. Clinical Document Classification Using Labeled and Unlabeled Data Across Hospitals [O]. Hamed Hassanzadeh, Mahnoosh Kholghi, Anthony Nguyen. 2018
7. The Use of Unlabeled Data Versus Labeled Data for Stopping Active Learning for Text Classification [O]. Garrett Beatty, Ethan Kochis, Michael Bloodgood. 2019
8. Cognitive Study of Learning with Labeled and Unlabeled Data [R]. Zhu, X., Rogers, T. T. 2012
