Clustering Non-Stationary Data Streams with Online Deep Learning

Abstract

With more devices connected, more sensor data logged, and more people active on social networks, the trend towards working with dynamic data is clear. The number of applications in which real-time analysis of data streams is essential grows accordingly, each with its own challenges. Within this area of data stream analysis, we benchmark the performance of the current state-of-the-art clustering algorithms CluStream, DenStream and ClusTree. We also adapt a Variational Autoencoder to operate on non-stationary data streams and assess its generative capabilities for dimensionality reduction. In this limited lab experiment, we show that although dimensionality reduction with a Variational Autoencoder significantly improves the clustering accuracy on high-dimensional datasets, not all clustering algorithms benefit from it to the same extent. Additionally, we show that, regardless of the clustering algorithm, the dimensionality reduction yielded no relevant improvement in the purity of the clusters.
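
The abstract reports cluster purity without defining it. As a minimal sketch only, the following assumes the common definition in which each cluster is credited with its most frequent ground-truth class and purity is the fraction of all points covered that way. The paper may use a different variant (for example, a per-cluster average), and the function name `cluster_purity` and the toy data are illustrative, not taken from the source.

```python
from collections import Counter


def cluster_purity(cluster_ids, class_labels):
    """Fraction of points that fall in the majority class of their cluster.

    Assumed definition (not confirmed by the source):
    purity = (1 / N) * sum over clusters of the largest class count.
    """
    if len(cluster_ids) != len(class_labels):
        raise ValueError("cluster_ids and class_labels must have the same length")
    if len(cluster_ids) == 0:
        raise ValueError("at least one point is required")

    # Count how often each ground-truth class occurs inside each cluster.
    per_cluster = {}
    for cluster_id, label in zip(cluster_ids, class_labels):
        per_cluster.setdefault(cluster_id, Counter())[label] += 1

    # Add up the size of the dominant class in every cluster.
    majority_total = sum(max(counts.values()) for counts in per_cluster.values())
    return majority_total / len(cluster_ids)


if __name__ == "__main__":
    # Toy assignment: cluster 0 mixes two classes, cluster 1 is pure.
    clusters = [0, 0, 0, 1, 1, 1]
    labels = ["a", "a", "b", "b", "b", "b"]
    print(cluster_purity(clusters, labels))
```

Under this definition a purity of 1.0 means every cluster contains a single class. Because the score tends to rise as the number of clusters grows, it is usually read alongside other measures rather than on its own.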