Conference: International Joint Conference on Neural Networks

Collecting Indicators of Compromise from Unstructured Text of Cybersecurity Articles using Neural-Based Sequence Labelling

Abstract

Indicators of Compromise (IOCs) are artifacts observed on a network or in an operating system that can indicate a computer intrusion and help detect cyber-attacks at an early stage. They therefore play an important role in cybersecurity. However, state-of-the-art IOC detection systems rely heavily on hand-crafted features that require expert knowledge of cybersecurity, and they need large-scale manually annotated corpora to train an IOC classifier. In this paper, we propose an end-to-end neural sequence labelling model that identifies IOCs automatically from cybersecurity articles without expert knowledge of cybersecurity. By using a multi-head self-attention module and contextual features, we find that the proposed model is able to gather contextual information from the text of cybersecurity articles and performs better on the task of IOC identification. Experiments show that the proposed model outperforms other sequence labelling models, achieving an average F1-score of 89.0% on the English cybersecurity article test set and an average F1-score of approximately 81.8% on the Chinese test set.
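To make the task framing concrete, the following is a minimal sketch of what a sequence labelling output for IOCs might look like and how a span-level F1-score could be computed from it. It is an illustration only, not the paper's implementation: the BIO tagging scheme, the label names `IP` and `FILE`, the example sentence, and the micro-averaged exact-match scoring are all assumptions, since the abstract does not specify the tag set or how the reported average is computed.

```python
from typing import List, Tuple

# (label, start token index, end token index exclusive)
Span = Tuple[str, int, int]


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Collapse a BIO tag sequence into labelled token spans.

    Assumed scheme: "B-X" opens a span of type X, "I-X" continues it,
    "O" is outside any span. A stray "I-X" is leniently treated as "B-X".
    """
    spans: List[Span] = []
    label, start = None, 0
    for i, tag in enumerate(tags + ["O"]):  # sentinel "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if label is not None and not continues:
            spans.append((label, start, i))
            label = None
        if tag.startswith("B-") or (tag.startswith("I-") and label is None):
            label, start = tag[2:], i
    return spans


def span_f1(gold: List[List[str]], pred: List[List[str]]) -> float:
    """Micro-averaged F1 over exact-match spans (an assumed metric variant)."""
    tp = n_gold = n_pred = 0
    for g, p in zip(gold, pred):
        gold_spans, pred_spans = set(bio_to_spans(g)), set(bio_to_spans(p))
        tp += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if tp == 0:
        return 0.0
    precision, recall = tp / n_pred, tp / n_gold
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentence; the IP address is from a documentation-only range.
tokens = ["The", "malware", "contacts", "203.0.113.7", "and", "drops", "evil.exe", "."]
gold_tags = ["O", "O", "O", "B-IP", "O", "O", "B-FILE", "O"]
pred_tags = ["O", "O", "O", "B-IP", "O", "O", "O", "O"]  # model misses the file name

iocs = [(label, " ".join(tokens[s:e])) for label, s, e in bio_to_spans(pred_tags)]
score = span_f1([gold_tags], [pred_tags])
```

In this toy case the prediction recovers one of the two gold IOCs with no false positives, so precision is 1.0 and recall is 0.5. In a neural tagger of the kind the abstract describes, the per-token tags would come from the model's output layer rather than being written by hand.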
