Multi-label classification via incremental clustering on an evolving data stream

Tien Thanh Nguyen; Manh Truong Dang; Anh Vu Luong; Liew Alan Wee-Chung; Liang Tiancai; McCall John

Abstract

With the advancement of storage and processing technology, an enormous amount of data is collected on a daily basis in many applications. Advanced data analytics are now used to mine the collected data for useful information and to make predictions, contributing to the competitive advantages of companies. The increasing data volume, however, has posed many problems for classical batch learning systems, such as the need to retrain the model completely when new samples arrive and the impracticality of storing and accessing a large volume of data. This has prompted interest in incremental learning that operates on data streams. In this study, we develop an incremental online multi-label classification (OMLC) method based on a weighted clustering model. The model adapts to changes in the data via a decay mechanism in which each sample's weight dwindles over time, so the clustering model always focuses more on newly arrived samples. In the classification process, only clusters whose weights are greater than a threshold (called mature clusters) are employed to assign labels to the samples. In our method, not only is the clustering model incrementally maintained with the revealed ground truth labels of the arrived samples, but the number of predicted labels for a sample is also adjusted based on the Hoeffding inequality and the label cardinality. The experimental results show that our method is competitive with several well-known benchmark algorithms on six performance measures in both the stationary and the concept drift settings. (C) 2019 Elsevier Ltd. All rights reserved.
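
The ideas named in the abstract (decaying sample weights, mature clusters, and the Hoeffding inequality) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract gives no formulas, so the decay form `2 ** (-lam * elapsed)`, the nearest-cluster prediction rule, the `label_ratio` cut-off, and every name below (`DecayingCluster`, `predict`, `hoeffding_bound`) are assumptions made for illustration. The Hoeffding bound is shown in its standard form only; how the paper combines it with label cardinality is not reproduced here.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DecayingCluster:
    """A cluster whose weight fades over time (hypothetical design)."""
    center: List[float]
    weight: float = 0.0
    label_weights: Dict[str, float] = field(default_factory=dict)
    last_update: int = 0

    def decay_to(self, t: int, lam: float) -> None:
        # Assumed decay form: weights shrink by 2^(-lam * elapsed time).
        factor = 2.0 ** (-lam * (t - self.last_update))
        self.weight *= factor
        for label in self.label_weights:
            self.label_weights[label] *= factor
        self.last_update = t

    def absorb(self, x: List[float], labels: Set[str], t: int, lam: float) -> None:
        # Decay old evidence first, then add the new sample with weight 1.
        self.decay_to(t, lam)
        total = self.weight + 1.0
        self.center = [(c * self.weight + xi) / total
                       for c, xi in zip(self.center, x)]
        self.weight = total
        for label in labels:
            self.label_weights[label] = self.label_weights.get(label, 0.0) + 1.0


def predict(clusters: List[DecayingCluster], x: List[float], t: int, lam: float,
            maturity_threshold: float, label_ratio: float = 0.5) -> Set[str]:
    """Predict labels from the nearest mature cluster (assumed rule)."""
    mature = []
    for cluster in clusters:
        cluster.decay_to(t, lam)
        if cluster.weight > maturity_threshold:  # only mature clusters vote
            mature.append(cluster)
    if not mature:
        return set()
    nearest = min(mature, key=lambda c: math.dist(c.center, x))
    return {label for label, w in nearest.label_weights.items()
            if w / nearest.weight >= label_ratio}


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Standard Hoeffding bound: with probability 1 - delta, the mean of n
    observations with range value_range is within this distance of its
    expectation."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


cluster = DecayingCluster(center=[0.0, 0.0])
cluster.absorb([0.0, 0.0], {"a"}, t=0, lam=1.0)
cluster.absorb([2.0, 0.0], {"b"}, t=1, lam=1.0)
print(predict([cluster], [1.0, 0.0], t=1, lam=1.0, maturity_threshold=1.0))  # prints {'b'}
```

In this toy run the older sample's weight has halved by the time the second sample arrives, so label "b" dominates the cluster while "a" falls below the cut-off. Raising `maturity_threshold` above the cluster's weight of 1.5 would leave no mature cluster and return an empty label set.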

Bibliographic details

Source
Pattern Recognition: The Journal of the Pattern Recognition Society | 2019, Issue 2019 | 18 pages
Authors
Tien Thanh Nguyen; Manh Truong Dang; Anh Vu Luong; Liew Alan Wee-Chung; Liang Tiancai; McCall John
Author affiliations

Robert Gordon Univ, Sch Comp Sci & Digital Media, Aberdeen, Scotland;
Robert Gordon Univ, Sch Comp Sci & Digital Media, Aberdeen, Scotland;
Griffith Univ, Sch Informat & Commun Technol, Nathan, Qld, Australia;
Griffith Univ, Sch Informat & Commun Technol, Nathan, Qld, Australia;
GRGBanking Technol Co Ltd, Guangzhou, Guangdong, Peoples R China;
Robert Gordon Univ, Sch Comp Sci & Digital Media, Aberdeen, Scotland;

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Multi-label classification; Incremental learning; Online learning; Clustering; Data stream; Concept drift;

Similar literature

1. Multi-label classification via incremental clustering on an evolving data stream [J]. Tien Thanh Nguyen, Manh Truong Dang, Anh Vu Luong. Pattern Recognition: The Journal of the Pattern Recognition Society. 2019
2. Incremental density-based ensemble clustering over evolving data streams [J]. Khan Imran, Huang Joshua Z., Ivanov Kamen. Neurocomputing. 2016, Issue maya26
3. Data Stream Classification by Dynamic Incremental Semi-Supervised Fuzzy Clustering [J]. Casalino Gabriella, Castellano Giovanna, Mencar Corrado. International Journal of Artificial Intelligence Tools: Architectures, Languages, Algorithms. 2019, Issue 8
4. Efficient class incremental learning for multi-label classification of evolving data streams [C]. Shi Zhongwei, Xue Yun, Wen Yimin. International Joint Conference on Neural Networks. 2014
5. An effective evolving data stream classification [D]. Al-khateeb, Tahseen M. 2012
6. Streaming chunk incremental learning for class-wise data stream classification with fast learning speed and low structural complexity [O]. Prem Junsawang, Suphakant Phimoltares, Chidchanok Lursinsap. 2012
7. Multi-label classification via incremental clustering on an evolving data stream [O]. Tien Thanh Nguyen, Manh Truong Dang, Anh Vu Luong. 2019