Journal: International Journal of Machine Learning and Cybernetics

Automatic extraction of named entities of cyber threats using a deep Bi-LSTM-CRF network

Abstract

Companies around the world rely on countless cyber threat intelligence (CTI) reports every day for security purposes. To protect critical cybersecurity information, analysts and individuals need to analyze the threat and vulnerability information these reports contain. However, analyzing such an overwhelming volume of reports requires considerable time and effort. In this study, we propose a novel approach that automatically extracts core information from CTI reports using a named entity recognition (NER) system. In constructing the proposed NER system, we defined meaningful keywords in the security domain as entities, including malware, domain/URL, IP address, hash, and Common Vulnerabilities and Exposures (CVE). We then linked these keywords with the words extracted from the text of the reports. To achieve higher performance, we used character-level feature vectors as input to a bidirectional long short-term memory (Bi-LSTM) network with a conditional random field (CRF) layer. The system achieved an average F1-score of 75.05%. We release the 498,000-tag dataset created during our research.
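
To make the entity types and the F1 metric in the abstract concrete, here is a minimal Python sketch. It is not the paper's Bi-LSTM-CRF model: it swaps in a plain regular-expression tagger for the three pattern-shaped entity types (IP address, hash, CVE) and computes a token-level F1-score. The label names (`B-IP`, `B-HASH`, `B-CVE`, `B-MALWARE`), the helper names, and the sample tokens are assumptions made for illustration and may not match the released dataset.

```python
import re

# Hypothetical label names; the paper's actual tag set may differ.
# Only pattern-shaped entities are covered. Malware names and domains/URLs
# are deliberately left out, since fixed patterns handle them poorly.
PATTERNS = {
    "CVE": re.compile(r"CVE-\d{4}-\d{4,}"),
    "IP": re.compile(r"(?:\d{1,3}\.){3}\d{1,3}"),
    # MD5 (32), SHA-1 (40), or SHA-256 (64) hex digests.
    "HASH": re.compile(r"[a-fA-F0-9]{64}|[a-fA-F0-9]{40}|[a-fA-F0-9]{32}"),
}


def tag_tokens(tokens):
    """Assign one label per token; 'O' means the token is not an entity."""
    labels = []
    for tok in tokens:
        stripped = tok.strip(".,;()")  # drop surrounding punctuation
        label = "O"
        for name, pattern in PATTERNS.items():
            if pattern.fullmatch(stripped):
                label = "B-" + name
                break
        labels.append(label)
    return labels


def f1_score(gold, pred):
    """Token-level F1 over entity labels (everything except 'O')."""
    true_pos = sum(1 for g, p in zip(gold, pred) if g == p and g != "O")
    pred_pos = sum(1 for p in pred if p != "O")
    gold_pos = sum(1 for g in gold if g != "O")
    if true_pos == 0:
        return 0.0
    precision = true_pos / pred_pos
    recall = true_pos / gold_pos
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    tokens = ["WannaCry", "used", "10.0.0.1"]
    gold = ["B-MALWARE", "O", "B-IP"]  # hand-written gold labels
    pred = tag_tokens(tokens)
    print(pred, round(f1_score(gold, pred), 4))
```

In this sketch the regex tagger finds the IP address but misses the malware name, which lowers recall and hence F1. Entities without a fixed surface pattern are the kind of case a learned sequence model such as the one described above is meant to handle.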