
Data mining techniques for network scan detection.



Abstract

Many attacks on networks are preceded by a reconnaissance operation, more commonly referred to as a scan. Despite the considerable attention given to scan detection, state-of-the-art methods suffer from a high rate of false alarms and a low rate of scan detection.

In this thesis, we formalize scan detection as a data mining problem. We show how a network traffic data set can be converted into a data set suitable for off-the-shelf classifiers. Our method demonstrates that data mining models can encapsulate expert knowledge to create an adaptable algorithm that substantially outperforms state-of-the-art methods for scan detection in both coverage and precision. Specifically, we show that our method is capable of very early detection (in many cases, as early as the first connection attempt on the specific port) without significantly compromising precision, and that it can distinguish P2P and backscatter traffic from scanners.

Using off-the-shelf classifiers as scan detectors is very effective, but it requires a training data set whose instances are labeled with the correct class assignment. In rapidly changing fields such as computer network traffic analysis, up-to-date labeled data sets are scarce, primarily because of the excessively high cost of having an expert label these large data sets manually. We therefore also propose a method in which the data set is labeled in a semi-supervised manner with user-specified guarantees about the quality of the labeling.

Finally, we propose a method for estimating the performance of the classifier (scan detector) when labeled data is unavailable.
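The abstract does not list the features the thesis derives from raw traffic. As a rough illustration of the general idea of turning connection records into rows that an off-the-shelf classifier could consume, the sketch below aggregates flows per (source IP, destination port). The `Flow` record layout, the function name `build_feature_rows`, and the three features are assumptions made for this example, not the author's method.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Flow(NamedTuple):
    """One connection attempt (hypothetical record layout)."""
    src_ip: str
    dst_ip: str
    dst_port: int
    established: bool  # True if the connection completed


def build_feature_rows(flows: Iterable[Flow]) -> dict[tuple[str, int], dict[str, float]]:
    """Aggregate flows into one feature row per (source IP, destination port)."""
    groups: dict[tuple[str, int], list[Flow]] = defaultdict(list)
    for flow in flows:
        groups[(flow.src_ip, flow.dst_port)].append(flow)

    rows = {}
    for key, members in groups.items():
        attempts = len(members)
        failed = sum(1 for f in members if not f.established)
        rows[key] = {
            # Illustrative features only; the thesis may use different ones.
            "attempts": float(attempts),
            "distinct_dsts": float(len({f.dst_ip for f in members})),
            "failed_ratio": failed / attempts,
        }
    return rows


# Made-up sample traffic: one source probing port 22 on several hosts,
# and one source making a single successful web connection.
flows = [
    Flow("10.0.0.5", "192.168.1.1", 22, False),
    Flow("10.0.0.5", "192.168.1.2", 22, False),
    Flow("10.0.0.5", "192.168.1.3", 22, True),
    Flow("10.0.0.9", "192.168.1.1", 80, True),
]
rows = build_feature_rows(flows)
```

Each resulting row could then be paired with a label (scanner or not) and passed to any standard classifier; how those labels are obtained is the subject of the semi-supervised labeling method the abstract describes.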

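Bibliographic record

  • Author

    Simon, Gyorgy J.

  • Author affiliation

    University of Minnesota

  • Degree-granting institution: University of Minnesota
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 157 p.
  • Total pages: 157
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology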