A Comparative Study on the Use of Labeled and Unlabeled Data for Large Margin Classifiers


Abstract

We propose using both labeled and unlabeled data with the Expectation-Maximization (EM) algorithm to estimate a generative model, and then using this model to construct a Fisher kernel. A Naive Bayes generative probability is used to model a document. Through text categorization experiments, we empirically show that (a) the Fisher kernel with labeled and unlabeled data outperforms Naive Bayes classifiers with EM and other methods given a sufficient amount of labeled data, (b) the value of additional unlabeled data diminishes when the labeled data size is large enough to estimate a reliable model, (c) the use of categories as latent variables is effective, and (d) larger unlabeled training datasets yield better results.
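The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a multinomial Naive Bayes model with Laplace smoothing, takes the Fisher score as the unconstrained gradient of the document log-likelihood with respect to p(c) and p(w|c), and approximates the Fisher information matrix by the identity. All function names (`posteriors`, `m_step`, `em_naive_bayes`, `fisher_score`, `fisher_kernel`) and the toy documents are hypothetical.

```python
import math

def posteriors(doc, prior, cond):
    """Return p(c | doc); words missing from the model's vocabulary are skipped."""
    logs = [math.log(prior[c]) +
            sum(n * math.log(cond[c][w]) for w, n in doc.items() if w in cond[c])
            for c in range(len(prior))]
    m = max(logs)                       # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logs]
    z = sum(exps)
    return [e / z for e in exps]

def m_step(docs, resp, vocab, n_classes, alpha=1.0):
    """Re-estimate p(c) and p(w|c) from (soft) class assignments, Laplace-smoothed."""
    prior, cond = [], []
    for c in range(n_classes):
        prior.append((alpha + sum(r[c] for r in resp)) /
                     (alpha * n_classes + len(docs)))
        counts = {w: alpha for w in vocab}
        for doc, r in zip(docs, resp):
            for w, n in doc.items():
                counts[w] += r[c] * n
        total = sum(counts.values())
        cond.append({w: v / total for w, v in counts.items()})
    return prior, cond

def em_naive_bayes(labeled, labels, unlabeled, n_classes, iters=10):
    """Fit Naive Bayes on labeled documents, then refine it with EM on all documents."""
    docs = labeled + unlabeled
    vocab = sorted({w for d in docs for w in d})
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labels]
    prior, cond = m_step(labeled, hard, vocab, n_classes)   # initial model
    for _ in range(iters):
        soft = [posteriors(d, prior, cond) for d in unlabeled]     # E-step
        prior, cond = m_step(docs, hard + soft, vocab, n_classes)  # M-step
    return prior, cond, vocab

def fisher_score(doc, prior, cond, vocab):
    """Gradient of log p(doc) w.r.t. p(c) and p(w|c), ignoring simplex constraints."""
    post = posteriors(doc, prior, cond)
    score = [post[c] / prior[c] for c in range(len(prior))]
    for c in range(len(prior)):
        score += [post[c] * doc.get(w, 0) / cond[c][w] for w in vocab]
    return score

def fisher_kernel(d1, d2, prior, cond, vocab):
    """Inner product of Fisher scores (identity approximation of Fisher information)."""
    u1 = fisher_score(d1, prior, cond, vocab)
    u2 = fisher_score(d2, prior, cond, vocab)
    return sum(a * b for a, b in zip(u1, u2))

# Toy bag-of-words documents (hypothetical data): class 0 = sports, class 1 = politics.
labeled = [{"ball": 2, "goal": 1}, {"vote": 2, "law": 1}]
labels = [0, 1]
unlabeled = [{"ball": 1, "goal": 2}, {"vote": 1, "law": 2}, {"goal": 1}]
prior, cond, vocab = em_naive_bayes(labeled, labels, unlabeled, n_classes=2)
k_same = fisher_kernel({"ball": 1, "goal": 1}, {"goal": 2}, prior, cond, vocab)
k_cross = fisher_kernel({"ball": 1, "goal": 1}, {"vote": 1, "law": 1}, prior, cond, vocab)
```

In a setup like this, the kernel values would be passed to a large margin classifier such as a support vector machine as a precomputed Gram matrix; the document classes serve as the latent variables of the generative model.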


Bibliographic details

Source
International Joint Conference on Natural Language Processing | 2004 | 6 pages
Authors
Hiroya Takamura; Manabu Okumura; Association for Computational Linguistics (ACL); Association for Computational Linguistics and Chinese Language Processing (ACLCLP); Association of Natural Language Processing (ANLP)
Original format
PDF
Chinese Library Classification
Programming languages, algorithmic languages

