Incremental Sparse-PCA Feature Extraction For Data Streams.

Abstract

Intruders attempt to penetrate commercial systems daily and cause considerable financial losses for individuals and organizations. Intrusion detection systems monitor network events to detect computer security threats. An extensive amount of network data is devoted to detecting malicious activities.

Storing, processing, and analyzing this massive volume of data is costly, which indicates the need for efficient network data reduction methods that do not require the data to be captured and stored first. A better approach extracts useful variables from data streams in real time and in a single pass. Removing irrelevant attributes reduces the data fed to the intrusion detection system (IDS) and shortens the analysis time while improving classification accuracy. This dissertation introduces an online, real-time data processing method for knowledge extraction.

This incremental feature extraction is based on two approaches. First, Chunk Incremental Principal Component Analysis (CIPCA) detects intrusions in data streams. Then, two novel incremental feature extraction methods, Incremental Structured Sparse PCA (ISSPCA) and Incremental Generalized Power Method Sparse PCA (IGSPCA), find malicious elements. Metrics were used to compare the performance of all methods.

IGSPCA was found to perform as well as or better than CIPCA overall in terms of dimensionality reduction, classification accuracy, and learning time. ISSPCA yielded better results for higher chunk values and greater accumulation ratio thresholds. CIPCA and IGSPCA reduced the IDS dataset to 10 principal components, as opposed to 14 eigenvectors for ISSPCA. ISSPCA is more expensive in terms of learning time than the other techniques.

This dissertation presents new methods that perform feature extraction from continuous data streams to find the small number of features necessary to express most of the data variance. Data subsets derived from a few important variables are easier to interpret.

Another goal of this dissertation was to propose incremental sparse PCA algorithms capable of processing data with concept drift and concept shift. Experiments using the WaveForm and WaveFormNoise datasets confirmed this ability. Similar to CIPCA, ISSPCA and IGSPCA updated the eigen-axes as a function of the accumulation ratio value, forming an informative eigenspace with few eigenvectors.
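
The chunk-wise, single-pass idea described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's CIPCA, ISSPCA, or IGSPCA algorithm: it is a plain running mean and scatter-matrix update followed by an eigendecomposition, and it assumes that "accumulation ratio" means the cumulative fraction of variance captured by the leading eigenvectors. The class name `ChunkPCA`, its methods, the chunk size of 100, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


class ChunkPCA:
    """Illustrative only: running mean and scatter matrix updated per chunk."""

    def __init__(self, n_features):
        self.n = 0
        self.mean = np.zeros(n_features)
        self.scatter = np.zeros((n_features, n_features))

    def partial_fit(self, chunk):
        """Fold one chunk into the running statistics without storing it."""
        chunk = np.asarray(chunk, dtype=float)
        m = chunk.shape[0]
        chunk_mean = chunk.mean(axis=0)
        centered = chunk - chunk_mean
        delta = chunk_mean - self.mean
        total = self.n + m
        # Pairwise combination of two scatter matrices (old data + new chunk).
        self.scatter += centered.T @ centered
        self.scatter += np.outer(delta, delta) * self.n * m / total
        self.mean += delta * m / total
        self.n = total
        return self

    def components(self, threshold=0.9):
        """Return the fewest eigen-axes whose accumulation ratio reaches threshold."""
        cov = self.scatter / max(self.n - 1, 1)
        eigvals, eigvecs = np.linalg.eigh(cov)
        order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
        eigvals = np.clip(eigvals[order], 0.0, None)
        eigvecs = eigvecs[:, order]
        ratio = np.cumsum(eigvals) / eigvals.sum()
        k = min(int(np.searchsorted(ratio, threshold)) + 1, len(eigvals))
        return eigvecs[:, :k], ratio[k - 1]


# Synthetic stream (assumption): 8 features with very different variances.
rng = np.random.default_rng(0)
scales = np.array([5.0, 3.0, 1.0, 0.5, 0.5, 0.2, 0.2, 0.1])
data = rng.normal(size=(600, 8)) * scales

model = ChunkPCA(n_features=8)
for start in range(0, len(data), 100):             # chunks of 100 rows
    model.partial_fit(data[start:start + 100])

axes, ratio = model.components(threshold=0.9)
print(axes.shape[1], "eigen-axes kept; accumulation ratio", round(float(ratio), 3))
```

Each chunk is discarded after `partial_fit`, so memory use depends on the number of features rather than on the length of the stream. Raising `threshold` keeps more eigen-axes, which is the general trade-off the abstract refers to when it reports results for different accumulation ratio thresholds.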


Bibliographic record

Author: Nziga, Jean-Pierre
Author affiliation: Nova Southeastern University
Degree-granting institution: Nova Southeastern University
Subject: Computer science; Information science
Degree: Ph.D.
Year: 2015
Pages: 127 p.
Total pages: 127
Original format: PDF
Language: English

Similar literature

1. Semi-supervised incremental feature extraction algorithm for large-scale data stream [J]. Chao Tan, Genlin Ji. Concurrency and Computation: Practice and Experience. 2017, No. 6
2. An optimized feature selection technique based on incremental feature analysis for bio-metric gait data classification [J]. Semwal Vijay Bhaskar, Singha Joyeeta, Sharma Pinki Kumari. Multimedia Tools and Applications. 2017, No. 22
3. An Incremental Version of L-MVU for the Feature Extraction of MI-EEG [J]. Mingai Li, Hongwei Xi, Xiaoqing Zhu. Computational Intelligence and Neuroscience. 2019, No. 4
4. Aspect based feature extraction and sentiment classification of review data sets using Incremental machine learning algorithm [C]. Rajalaxmi Hegde, Seema S. International Conference on Advances in Electrical, Electronics, Information, Communication and Bio-Informatics. 2017
5. INTEGRATION OF SOLID MODELING AND DATABASE MANAGEMENT FOR CAD/CAM (QUERY LANGUAGE, GEOMETRIC DATABASE, FEATURE EXTRACTION) [D]. LEE, YUNG-CHIA. 1984
6. An Incremental Version of L-MVU for the Feature Extraction of MI-EEG [O]. Mingai Li, Hongwei Xi, Xiaoqing Zhu. 2019
7. Incremental Sparse-PCA Feature Extraction For Data Streams [O]. Nziga Jean-Pierre. 2015
8. Unsupervised Feature Selection on Data Streams [R]. H., H., S., Y., Kasiviswanathan, S. 2015
