Automatic extraction of named entities of cyber threats using a deep Bi-LSTM-CRF network

Kim Gyeongmin; Lee Chanhee; Jo Jaechoon; Lim Heuiseok

Abstract

Companies around the world use countless cyber threat intelligence (CTI) reports every day for security purposes. To secure critical cybersecurity information, analysts and individuals should analyze information on threats and vulnerabilities. However, analyzing such an overwhelming volume of reports requires considerable time and effort. In this study, we propose a novel approach that automatically extracts core information from CTI reports using a named entity recognition (NER) system. In constructing the NER system, we defined meaningful keywords in the security domain as entities, including malware, domain/URL, IP address, hash, and Common Vulnerabilities and Exposures (CVE). Furthermore, we linked these keywords with the words extracted from the text of the reports. To achieve higher performance, we used character-level feature vectors as input to a bidirectional long short-term memory network with a conditional random field (Bi-LSTM-CRF). The system achieved an average F1-score of 75.05%. We release 498,000 tag datasets created during our research.
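To illustrate what the conditional random field layer contributes on top of per-token scores, the following is a minimal sketch of Viterbi decoding over BIO tags. It is not the authors' implementation. The tag names (`B-MAL`, `I-MAL`), the example tokens, and all emission and transition scores are hypothetical; in a real Bi-LSTM-CRF the emission scores would come from the Bi-LSTM and the transition scores would be learned. Start and stop transitions are omitted for brevity.

```python
from typing import Dict, List, Tuple

# Hypothetical BIO tag set for a single entity type (malware).
TAGS = ["O", "B-MAL", "I-MAL"]

# Hypothetical tokens and per-token emission scores (stand-ins for Bi-LSTM outputs).
TOKENS = ["new", "Emotet", "variant"]
EMISSIONS: List[Dict[str, float]] = [
    {"O": 2.0, "B-MAL": 0.0, "I-MAL": 0.0},
    {"O": 1.0, "B-MAL": 0.9, "I-MAL": 0.2},
    {"O": 0.5, "B-MAL": 0.1, "I-MAL": 1.5},
]

# Hypothetical transition scores. O -> I-MAL is heavily penalized because an
# "inside" tag cannot directly follow an "outside" tag in the BIO scheme.
TRANSITIONS: Dict[Tuple[str, str], float] = {(p, c): 0.0 for p in TAGS for c in TAGS}
TRANSITIONS[("O", "I-MAL")] = -10.0
TRANSITIONS[("B-MAL", "I-MAL")] = 0.5
TRANSITIONS[("I-MAL", "I-MAL")] = 0.5


def greedy(emissions: List[Dict[str, float]]) -> List[str]:
    """Pick the best tag for each token independently (no CRF)."""
    return [max(TAGS, key=lambda tag: scores[tag]) for scores in emissions]


def viterbi(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
) -> List[str]:
    """Return the highest-scoring tag sequence under emission + transition scores."""
    # best[t][tag] is the best score of any path that ends in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in TAGS}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in TAGS:
            prev = max(TAGS, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(TAGS, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    print(greedy(EMISSIONS))                # ['O', 'O', 'I-MAL']
    print(viterbi(EMISSIONS, TRANSITIONS))  # ['O', 'B-MAL', 'I-MAL']
```

With these made-up scores, independent per-token decoding yields the invalid sequence `O, O, I-MAL`, while sequence-level decoding recovers the well-formed span `B-MAL, I-MAL`. This is the general motivation for placing a CRF after a Bi-LSTM in sequence labeling.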

Bibliographic details

Source
International Journal of Machine Learning and Cybernetics | 2020, No. 10 | pp. 2341-2355 | 15 pages
Authors
Kim Gyeongmin; Lee Chanhee; Jo Jaechoon; Lim Heuiseok;
Author affiliations

Korea Univ Seoul 02841 South Korea;

Korea Univ Seoul 02841 South Korea;

Hanshin Univ 137 Hanshindae Gil Osan Si 18101 South Korea;

Korea Univ Seoul 02841 South Korea;

Original format: PDF
Language: English
Keywords
Cybersecurity; Vulnerability; Cyber threat intelligence; Named entity recognition; Bidirectional long-short-term memory conditional random field;
