
Complex data analytics via sparse, low-rank matrix approximation



Abstract

Today, digital data accumulates faster than ever in science, engineering, biomedicine, and real-world sensing. Data mining provides an effective way to explore and analyze hidden patterns in these data for a broad spectrum of applications. These datasets usually share one prominent characteristic: they are tremendous in size, with tens of thousands of objects and features. In addition, data is not only collected over a period of time, but the relationships between data points can also change over that period. Moreover, knowledge is very sparsely encoded because patterns are usually active only in a local area. The ubiquity of massive, dynamic, and sparse data poses considerable challenges for data mining research. Recently, techniques that expand the human ability to comprehend large-scale data have attracted significant attention in the research community. In this dissertation, we present approaches to solving problems of complex data analysis in various applications.
Specifically, we make the following contributions: 1) we develop Exemplar-based low-rank sparse Matrix Decomposition (EMD), a novel method for fast clustering of large-scale data that incorporates low-rank approximations into matrix decomposition-based clustering; 2) we propose ECKF, a general model for large-scale Evolutionary Clustering based on low-rank Kernel matrix Factorization; by monitoring the low-rank approximation errors at every time step, ECKF can determine whether the underlying structure of the data or the nature of the relationships between data points has changed across time steps, and on that basis it decides either to carry over the previous clustering or to perform a new one; 3) we propose a Multi-level Low-rank Approximation (MLA) framework for fast spectral clustering, which is empirically shown to cluster large-scale data very efficiently; 4) we extend the MLA framework with a non-linear kernel and apply it to HD image segmentation; with sufficient data samples selected by a fast sampling strategy, our method shows superior performance compared with other leading approximate spectral clustering methods; 5) we develop a fast algorithm to detect abnormal crowd behavior in surveillance videos by employing low-rank matrix approximations to model crowd behavior patterns; through experiments on simulated crowd videos, we demonstrate the effectiveness of our method.
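
The change-detection idea in contribution 2 can be illustrated with a minimal sketch. This is not the ECKF algorithm: ECKF factorizes kernel matrices, whereas the sketch below swaps in a plain truncated SVD on raw data to show only the general principle of monitoring a low-rank approximation error across time steps. The function names, the synthetic data, and the threshold of 0.2 are all assumptions made for illustration.

```python
import numpy as np

def top_subspace(X, rank):
    """Orthonormal basis for the best rank-`rank` column subspace of X (truncated SVD)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :rank]

def relative_error(X, basis):
    """Relative Frobenius error after projecting the columns of X onto span(basis)."""
    residual = X - basis @ (basis.T @ X)
    return np.linalg.norm(residual) / np.linalg.norm(X)

def needs_reclustering(X_new, basis_prev, threshold=0.2):
    """Hypothetical decision rule: re-cluster when the old basis no longer fits the new data."""
    return relative_error(X_new, basis_prev) > threshold

rng = np.random.default_rng(0)
n_features, n_points, rank = 50, 200, 3

def sample(mixing):
    """Synthetic data lying near the column space of `mixing`, plus small noise."""
    coeffs = rng.standard_normal((mixing.shape[1], n_points))
    noise = 0.01 * rng.standard_normal((n_features, n_points))
    return mixing @ coeffs + noise

A = rng.standard_normal((n_features, rank))  # structure at time steps 1 and 2
B = rng.standard_normal((n_features, rank))  # changed structure at time step 3

X1, X2, X3 = sample(A), sample(A), sample(B)
basis = top_subspace(X1, rank)               # low-rank model fitted at time step 1

same_structure = needs_reclustering(X2, basis)     # new points, same subspace
changed_structure = needs_reclustering(X3, basis)  # points from a different subspace
print(same_structure, changed_structure)
```

In this toy setting the basis fitted at the first time step still explains data drawn from the same subspace, so the previous clustering could be kept, while data from a different subspace produces a large error and triggers a new clustering. A real system would need a principled way to choose the threshold.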

Record details

  • Author

    Wang Lijun

  • Year: 2012
  • Original format: PDF
