Incremental Sparse-PCA Feature Extraction For Data Streams

Abstract

Intruders attempt to penetrate commercial systems daily and cause considerable financial losses for individuals and organizations. Intrusion detection systems monitor network events to detect computer security threats. An extensive amount of network data is devoted to detecting malicious activities.

Storing, processing, and analyzing this massive volume of data is costly, which indicates the need for efficient network data reduction methods that do not require the data to be captured and stored first. A better approach extracts useful variables from data streams in real time and in a single pass. Removing irrelevant attributes reduces the data fed to the intrusion detection system (IDS) and shortens the analysis time while improving classification accuracy. This dissertation introduces an online, real-time data processing method for knowledge extraction.

This incremental feature extraction is based on two approaches. First, Chunk Incremental Principal Component Analysis (CIPCA) detects intrusion in data streams. Then, two novel incremental feature extraction methods, Incremental Structured Sparse PCA (ISSPCA) and Incremental Generalized Power Method Sparse PCA (IGSPCA), find malicious elements. Metrics helped compare the performance of all methods.

IGSPCA was found to perform as well as or better than CIPCA overall in terms of dimensionality reduction, classification accuracy, and learning time. ISSPCA yielded better results for higher chunk values and greater accumulation ratio thresholds. CIPCA and IGSPCA reduced the IDS dataset to 10 principal components, as opposed to 14 eigenvectors for ISSPCA. ISSPCA is more expensive in terms of learning time than the other techniques.

This dissertation presents new methods that perform feature extraction from continuous data streams to find the small number of features necessary to express the most data variance. Data subsets derived from a few important variables are easier to interpret.

Another goal of this dissertation was to propose incremental sparse PCA algorithms capable of processing data with concept drift and concept shift. Experiments using the WaveForm and WaveFormNoise datasets confirmed this ability. Similar to CIPCA, ISSPCA and IGSPCA updated eigen-axes as a function of the accumulation ratio value, forming an informative eigenspace with few eigenvectors.
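The abstract does not give the algorithms themselves, so the following is only a rough sketch of the general idea, not the dissertation's CIPCA, ISSPCA, or IGSPCA. It assumes that the "accumulation ratio" means the fraction of total variance captured by the retained eigen-axes, and it stands in a plain soft-thresholded power iteration for the sparse step. All function names, the synthetic six-variable stream, the chunk size of 50, the 0.95 threshold, and the `gamma` value are illustrative assumptions.

```python
import math
import random

def update_stats(n, mean, scatter, chunk):
    """Fold one chunk into the running count, mean, and scatter matrix (single pass)."""
    d = len(mean)
    for x in chunk:
        n += 1
        old = [x[i] - mean[i] for i in range(d)]   # deviation from the previous mean
        mean = [mean[i] + old[i] / n for i in range(d)]
        new = [x[i] - mean[i] for i in range(d)]   # deviation from the updated mean
        for i in range(d):
            for j in range(d):
                scatter[i][j] += old[i] * new[j]
    return n, mean, scatter

def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def normalize(v):
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v] if norm > 0 else v

def power_iteration(m, gamma=0.0, iters=200):
    """Leading direction of m. With gamma > 0 each iterate is soft-thresholded,
    which pushes small loadings to exactly zero (assumed stand-in for a sparse step)."""
    v = normalize([1.0] * len(m))   # fixed start keeps the sketch deterministic
    for _ in range(iters):
        w = [math.copysign(max(abs(a) - gamma, 0.0), a) for a in matvec(m, v)]
        v = normalize(w)
    value = sum(a * b for a, b in zip(v, matvec(m, v)))   # variance along v
    return value, v

def select_components(cov, threshold):
    """Add eigen-axes until the assumed accumulation ratio reaches the threshold."""
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))   # trace equals total variance
    residual = [row[:] for row in cov]
    axes, captured = [], 0.0
    while len(axes) < d and captured / total < threshold:
        value, v = power_iteration(residual)
        axes.append(v)
        captured += value
        for i in range(d):                     # deflate the axis just found
            for j in range(d):
                residual[i][j] -= value * v[i] * v[j]
    return axes, captured / total

# Synthetic stream: six variables driven by two hidden factors plus small noise.
rng = random.Random(0)

def make_chunk(rows):
    chunk = []
    for _ in range(rows):
        a, b = rng.gauss(0, 3), rng.gauss(0, 2)
        chunk.append([(a if i < 3 else b) / math.sqrt(3) + rng.gauss(0, 0.1)
                      for i in range(6)])
    return chunk

n, mean, scatter = 0, [0.0] * 6, [[0.0] * 6 for _ in range(6)]
for _ in range(20):                            # each chunk is seen once, then discarded
    n, mean, scatter = update_stats(n, mean, scatter, make_chunk(50))

cov = [[s / (n - 1) for s in row] for row in scatter]
axes, ratio = select_components(cov, threshold=0.95)
sparse_value, sparse_axis = power_iteration(cov, gamma=1.0)
print(len(axes), round(ratio, 3))
print([round(a, 3) for a in sparse_axis])
```

Only the running count, mean, and scatter matrix are kept between chunks, which is what makes the pass over the stream single and storage-free. Raising the threshold retains more axes, and raising `gamma` zeroes more loadings at the cost of explained variance.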

Bibliographic record

Author: Nziga, Jean-Pierre
Year: 2015
Original format: PDF

Similar literature

1. Semi-supervised incremental feature extraction algorithm for large-scale data stream [J]. Chao Tan, Genlin Ji. Concurrency and computation: practice and experience. 2017, No. 6
2. HMM with improved feature extraction-based feature parameters for identity recognition of gesture command operators by using a sensed Kinect-data stream [J]. Ding Ing-Jr, Chang Yu-Jui. Neurocomputing. 2017, No. nova1
3. Online Feature Selection (OFS) with Accelerated Bat Algorithm (ABA) and Ensemble Incremental Deep Multiple Layer Perceptron (EIDMLP) for big data streams [J]. D. Renuka Devi, S. Sasikala. Journal of Big Data. 2019, No. 1
4. Feature extraction and Incremental Learning to Improve Activity Recognition on Streaming Data [C]. Nawel Yala, Belkacem Fergani, Anthony Fleury. IEEE International Conference on Evolving and Adaptive Intelligent Systems. 2015
5. Incremental Sparse-PCA Feature Extraction For Data Streams [D]. Nziga, Jean-Pierre. 2015
6. Streaming chunk incremental learning for class-wise data stream classification with fast learning speed and low structural complexity [O]. Prem Junsawang, Suphakant Phimoltares, Chidchanok Lursinsap. 2012
7. On the utility of incremental feature selection for the classification of textual data streams [O]. Ioannis Katakis, Grigorios Tsoumakas, Ioannis Vlahavas. 2005
8. Distributed Computing for Signal Processing: Modeling of Asynchronous Parallel Computation. Appendix D. Analysis of MIMD (Multiple Instruction Streams, Multiple Data Streams) Algorithms: Features, Measurements, and Results [R]. Smith, K. D. 1984
