Conference: ACM Symposium on Applied Computing

Semi-supervised co-training and active learning based approach for multi-view intrusion detection

Abstract

Although networks and hosts produce immense amounts of data, only a very small proportion of it is labeled, because expert labels are costly to obtain. This is a significant bottleneck for supervised intrusion detection systems, which rely solely on labeled data. Such systems cannot exploit the remaining unlabeled data, even though it is collected from real network environments and may therefore hold valuable information for intrusion detection. In this work, we leverage both labeled and unlabeled data. Intrusion detection tasks also lend themselves naturally to a multi-view setting and can benefit significantly when the multiple views are combined meaningfully. We propose a co-training framework for intrusion detection; co-training is a semi-supervised learning method that can both utilize unlabeled data and combine multi-view data. We also employ an active learning framework in which statistically ambiguous parts of the unlabeled data are identified and then labeled by an expert. This keeps expert labeling to a minimum while ensuring that the labels obtained from the expert are the most informative. In our experiments, we demonstrate that leveraging the unlabeled data with our proposed method significantly reduces the error rate compared with using the labeled data alone. In addition, our proposed multi-view method has a lower error rate than a single-view approach.
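
The interplay of co-training and active learning described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the nearest-centroid classifier, the margin-based ambiguity score, the synthetic two-view data, and all names (`fit_centroids`, `predict`, `co_train`, `net_view`, `host_view`) are assumptions chosen only to keep the example self-contained.

```python
import random

def fit_centroids(X, y):
    """Nearest-centroid 'training': the mean feature vector of each class."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}

def predict(centroids, x):
    """Return (label, margin). A small margin means the point is ambiguous."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, mu)) ** 0.5, c)
        for c, mu in centroids.items()
    )
    return dists[0][1], dists[1][0] - dists[0][0]

def co_train(view_a, view_b, seed_labels, oracle, rounds=5, per_view=3, queries=1):
    """Grow a shared label pool from two views plus a few expert queries."""
    pool = dict(seed_labels)   # index -> label (seed, expert, or pseudo-label)
    asked = []                 # indices that were sent to the expert
    for _ in range(rounds):
        unlabeled = [i for i in range(len(view_a)) if i not in pool]
        if not unlabeled:
            break
        idx = list(pool)
        y = [pool[i] for i in idx]
        model_a = fit_centroids([view_a[i] for i in idx], y)
        model_b = fit_centroids([view_b[i] for i in idx], y)
        preds = {i: (predict(model_a, view_a[i]), predict(model_b, view_b[i]))
                 for i in unlabeled}

        # Active learning step (assumed criterion): points where the views
        # disagree, or agree with a small combined margin, go to the expert.
        def ambiguity(i):
            (la, ma), (lb, mb) = preds[i]
            return (la == lb, ma + mb)
        for i in sorted(unlabeled, key=ambiguity)[:queries]:
            pool[i] = oracle(i)
            asked.append(i)

        # Co-training step: each view pseudo-labels the points it is most
        # confident about and adds them to the pool both views train on.
        for v in (0, 1):
            rest = [i for i in unlabeled if i not in pool]
            rest.sort(key=lambda i: preds[i][v][1], reverse=True)
            for i in rest[:per_view]:
                pool[i] = preds[i][v][0]
    return pool, asked

# Synthetic data (an assumption): two well-separated classes seen through
# two hypothetical views, e.g. network-flow features and host-log features.
rng = random.Random(0)
truth = ["normal"] * 20 + ["attack"] * 20

def sample(center):
    return [center + rng.uniform(-1, 1), center + rng.uniform(-1, 1)]

net_view = [sample(0.0 if t == "normal" else 5.0) for t in truth]
host_view = [sample(0.0 if t == "normal" else 5.0) for t in truth]
seed = {0: "normal", 20: "attack"}   # only two labeled points to start with
pool, asked = co_train(net_view, host_view, seed, oracle=lambda i: truth[i])
print(f"labeled pool: {len(pool)} points, expert queries: {len(asked)}")
```

Starting from two labeled points, the pool grows mostly through pseudo-labels, and the expert is consulted only once per round. On real intrusion data the classes overlap, so pseudo-labels can be wrong; the classifier, the confidence thresholds, and the query budget would all need to be chosen and validated for the data at hand.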
