Conference: IEEE International Conference on High Performance Computing and Communications

Elicitation of Candidate Subspaces in High-Dimensional Data



Abstract

Anomaly detection is an important research area in data mining and has been studied intensively in recent years. The growing number of features, together with non-informative noise and other irrelevant features, makes it challenging to detect anomalies in high-dimensional data. When analyzing such data, anomalies are difficult to identify because of the sparsity caused by the curse of dimensionality. One of the most commonly used algorithms for reducing dimensionality is Principal Component Analysis (PCA); however, it is known to be sensitive to anomalies. Furthermore, anomalies are rare and can be found only in low-dimensional subspaces. To find anomalies effectively in the high-dimensional space, we propose a technique that explores locally relevant, low-dimensional subspaces in which anomalies may be hidden because of this sparsity, and we call these subspaces candidate subspaces for anomalies. In particular, the proposed technique integrates the Pearson Correlation Coefficient (PCC) with PCA, thereby combining the highest variances. Our experimental results show that the technique gives good results when anomalies are synthetically introduced.
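The abstract does not spell out how PCC and PCA are combined, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes that features are grouped greedily by absolute Pearson correlation and that PCA is then run inside each group, keeping the highest-variance components as a candidate subspace. The function name `candidate_subspaces` and the parameters `corr_threshold` and `var_ratio` are hypothetical.

```python
import numpy as np


def candidate_subspaces(X, corr_threshold=0.7, var_ratio=0.9):
    """Illustrative only: group correlated features, then run PCA per group.

    Returns a list of (feature_indices, basis) pairs, where basis holds the
    leading principal directions of that feature group as columns.
    Both thresholds are assumptions, not values taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]

    # Absolute Pearson correlation between every pair of features.
    corr = np.abs(np.corrcoef(X, rowvar=False))

    unassigned = set(range(n_features))
    subspaces = []
    while unassigned:
        seed = min(unassigned)
        # Greedy grouping: the seed plus every unassigned feature that is
        # strongly correlated with it.
        group = [seed] + sorted(
            j for j in unassigned
            if j != seed and corr[seed, j] >= corr_threshold
        )
        unassigned -= set(group)
        if len(group) < 2:
            continue  # an isolated feature forms no multi-feature subspace

        # PCA on the group via eigendecomposition of its covariance matrix.
        cov = np.cov(X[:, group], rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]  # highest variance first
        eigvals, eigvecs = eigvals[order], eigvecs[:, order]

        # Keep the fewest components that reach the requested variance share.
        explained = np.cumsum(eigvals) / eigvals.sum()
        k = min(int(np.searchsorted(explained, var_ratio)) + 1, len(group))
        subspaces.append((group, eigvecs[:, :k]))
    return subspaces
```

Projecting the centered columns of a group onto its basis gives low-dimensional coordinates on which any anomaly scorer could then be run; that scoring step is outside what the abstract describes and is left out here.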
