Conference: ACM Symposium on Applied Computing

Semi-supervised co-training and active learning based approach for multi-view intrusion detection

Abstract

Although an immense amount of data is available from networks and hosts, only a very small proportion of it is labeled, because expert labels are costly to obtain. This is a significant bottleneck for developing supervised intrusion detection systems, which rely solely on labeled data. Such systems cannot exploit the remaining unlabeled data, even though it is collected from real network environments and may therefore hold valuable information for intrusion detection. In this work, we leverage both labeled and unlabeled data. Intrusion detection tasks also lend themselves naturally to a multi-view setting and can benefit significantly when the multiple views are combined meaningfully. We propose a co-training framework for intrusion detection, a semi-supervised learning method that can both utilize unlabeled data and combine multi-view data. We also employ an active learning framework in which statistically ambiguous parts of the unlabeled data are identified and then labeled by an expert. This keeps expert labeling to a minimum while ensuring that the labels obtained from the expert are the most informative ones. In our experiments, we demonstrate that leveraging the unlabeled data with our proposed method significantly reduces the error rate compared to using the labeled data alone. In addition, our proposed multi-view method has a lower error rate than a single-view method.
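The abstract does not specify the classifiers, features, or selection rules used, so the following is only a minimal sketch of the general idea under stated assumptions: a nearest-centroid classifier per view, synthetic two-class data, a distance margin as the confidence score, and view disagreement or a small margin as the "ambiguity" signal for expert queries. All function names, parameters, and data here are hypothetical and are not taken from the paper.

```python
import random

def centroid_fit(X, y):
    """Fit a nearest-centroid classifier: return {label: mean vector}."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        s = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict_with_margin(centroids, x):
    """Return (label, margin). Needs >= 2 classes; small margin = ambiguous."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, label)
        for label, c in centroids.items()
    )
    return dists[0][1], dists[1][0] - dists[0][0]

def fit_views(views, labels):
    """Train one classifier per view on the currently labeled indices."""
    idx = sorted(labels)
    return [centroid_fit([v[i] for i in idx], [labels[i] for i in idx])
            for v in views]

def query_ambiguous(views, labels, k=3):
    """Active learning step: pick the k most ambiguous unlabeled indices."""
    models = fit_views(views, labels)
    scored = []
    for i in range(len(views[0])):
        if i in labels:
            continue
        preds = [predict_with_margin(m, v[i]) for m, v in zip(models, views)]
        agree = len({label for label, _ in preds}) == 1
        # Assumption: disagreement between views counts as maximal ambiguity.
        score = min(m for _, m in preds) if agree else 0.0
        scored.append((score, i))
    return [i for _, i in sorted(scored)[:k]]

def co_train(views, labels, rounds=3, per_round=4):
    """Co-training step: each view pseudo-labels its most confident points."""
    labels = dict(labels)
    for _ in range(rounds):
        models = fit_views(views, labels)
        for model, view in zip(models, views):
            scored = sorted(
                ((*reversed(predict_with_margin(model, view[i])), i)
                 for i in range(len(view)) if i not in labels),
                reverse=True,
            )
            for _margin, label, i in scored[:per_round]:
                labels[i] = label  # shared pool, visible to the other view
    return labels

# Synthetic demo: two views of the same 40 samples (hypothetical data).
rng = random.Random(0)
truth = [i % 2 for i in range(40)]
view_a = [[rng.gauss(5 * t, 1), rng.gauss(5 * t, 1)] for t in truth]
view_b = [[rng.gauss(-5 * t, 1), rng.gauss(2 * t, 1)] for t in truth]
views = [view_a, view_b]

seed = {i: truth[i] for i in range(6)}    # small expert-labeled set
queried = query_ambiguous(views, seed, k=3)
for i in queried:                          # the "expert" is simulated by truth
    seed[i] = truth[i]

final_labels = co_train(views, seed)
accuracy = sum(final_labels[i] == truth[i] for i in final_labels) / len(final_labels)
print(f"labeled {len(final_labels)} of {len(truth)} samples, accuracy {accuracy:.2f}")
```

In this sketch the expert is asked only about the points the two views find hardest, while the confident pseudo-labels from each view enlarge the training pool for the other. A real system would substitute actual network and host feature views and a stronger base classifier.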
