Combining labelled and unlabelled data in the design of pattern classification systems

Bogdan Gabrys; Lina Petrakieva

Abstract

There has been much interest in applying techniques that incorporate knowledge from unlabelled data into a supervised learning system, but less effort has been made to compare the effectiveness of different approaches and to analyse the behaviour of the learning system when using different ratios of labelled to unlabelled data. In this paper, various methods for learning from labelled and unlabelled data are first discussed and categorised into one of three major groups: pre-labelling, post-labelling and semi-supervised approaches. A generalised formal description of these approaches and an extensive experimental analysis are then provided. The experimental results show that, when supported by unlabelled samples, much less labelled data is generally required to build a classifier without compromising classification performance. If only a very limited amount of labelled data is available, the results based on random selection of labelled samples show high variability, and the performance of the final classifier depends more on how reliable the labelled samples are than on the use of additional unlabelled data. In response to this finding, three types of static (one-step) selection methods guided by clustering information, together with various options for allocating a number of samples within clusters and their distributions, have been proposed and analysed. A significant improvement over random selection of the labelled samples has been observed when using these selective sampling techniques.
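The idea of choosing which samples to label by consulting cluster structure, rather than drawing them at random, can be pictured with the minimal sketch below. It is not the authors' procedure: the abstract does not say which clustering algorithm or allocation rules were used, so plain k-means, a budget split in proportion to cluster size, and picking the points nearest each cluster centre are all assumptions made for illustration. The names `kmeans` and `select_for_labelling` are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples (assumed clustering step)."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initial centres are k distinct samples
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centre.
        assign = [min(range(k), key=lambda c: math.dist(p, centres[c])) for p in points]
        # Update step: move each centre to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centres[c] = tuple(sum(x) / len(members) for x in zip(*members))
    # Final assignment against the last set of centres.
    assign = [min(range(k), key=lambda c: math.dist(p, centres[c])) for p in points]
    return centres, assign


def select_for_labelling(points, k, budget, seed=0):
    """Return indices of the samples to send for labelling (one-step selection).

    Assumed allocation rule: each non-empty cluster gets a share of the budget
    proportional to its size (at least one sample), and the samples closest to
    the cluster centre are chosen. Rounding means the total can differ slightly
    from the requested budget.
    """
    centres, assign = kmeans(points, k, seed=seed)
    chosen = []
    for c in range(k):
        members = [i for i, a in enumerate(assign) if a == c]
        if not members:
            continue
        quota = max(1, round(budget * len(members) / len(points)))
        members.sort(key=lambda i: math.dist(points[i], centres[c]))
        chosen.extend(members[:quota])
    return chosen


# Usage on a synthetic pool: two well-separated groups of 20 unlabelled points each.
rng = random.Random(1)
pool = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(20)]
pool += [(rng.gauss(10, 0.3), rng.gauss(10, 0.3)) for _ in range(20)]
to_label = select_for_labelling(pool, k=2, budget=4)
print(sorted(to_label))
```

With a very small budget, a purely random draw can miss a group entirely, which is one plausible source of the high variability the abstract reports. A cluster-guided rule like this one gives every non-empty cluster at least one labelled representative.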

Bibliographic details

Source: International Journal of Approximate Reasoning, 2004, Issue 3, pp. 251-273 (23 pages)
Authors: Bogdan Gabrys; Lina Petrakieva
Affiliation: Computational Intelligence Research Group, School of Design, Engineering and Computing, Bournemouth University, Poole House, Talbot Campus, Fern Barrow, Poole, BH12 5BB, UK
Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language: English
Chinese Library Classification: Advanced mathematics
Keywords: combined learning methods; supervised learning; unsupervised learning; semi-supervised clustering; pattern classification; random selection; preliminary selection
