Conference: IEEE International Conference on Trust, Security and Privacy in Computing and Communications

Joint Learning for Document-Level Threat Intelligence Relation Extraction and Coreference Resolution Based on GCN


Abstract

Document-level relation extraction plays an important role in threat intelligence text analysis and processing, because it helps researchers quickly understand how new threat events connect to previous ones. Because no public document-level threat intelligence dataset exists, we create APTERC-DOC, a dataset of APT intelligence entities, relations, and coreference. We treat relation extraction as a multi-class classification task. By treating coreference as one of the predefined relation types, we develop TIRECO, a joint learning framework that performs threat intelligence relation extraction and coreference resolution simultaneously. Because document-level text is too long for effective feature extraction, we propose the concept of a sentence set, which transforms document-level relation extraction into inter-sentence relation extraction. To incorporate relevant information while removing as much irrelevant content from the sentence set as possible, we further apply a novel pruning strategy (SDP-VP-SET) to the input trees, motivated by the observation that verbs are crucial in determining the relation between entities in a sentence set. We retain the shortest path together with the nodes that are K hops away from it, and we assign the edges connected to verb nodes w times the weight. Experimental results show that our model performs well not only on inter-sentence relations but also on intra-sentence relations, and that the F1 value increases by 15.694%.
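The pruning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' SDP-VP-SET implementation; it only shows the general idea under several assumptions: the dependency tree is treated as an undirected graph, "K hops away" is read as "within K hops of the shortest path", every edge starts with weight 1.0 so that verb edges end up with weight `w`, and verbs are identified by a `"VERB"` part-of-speech tag. The function names, the toy sentence, and its parse are hypothetical, and the example uses a single sentence because the abstract does not say how the trees of a sentence set are joined.

```python
from collections import deque


def shortest_path(adj, src, dst):
    """Breadth-first shortest path on an undirected graph {node: set(neighbors)}."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nb in sorted(adj.get(node, ())):
            if nb not in prev:
                prev[nb] = node
                queue.append(nb)
    if dst not in prev:
        return []
    path = []
    node = dst
    while node is not None:
        path.append(node)
        node = prev[node]
    return path[::-1]


def prune_and_weight(edges, pos_tags, head, tail, k=1, w=2.0):
    """Keep the shortest path between two entity tokens plus nodes within k hops.

    Assumption (not from the paper): base edge weight is 1.0, and an edge
    touching a token tagged "VERB" gets weight w instead.
    Returns (path, kept_nodes, edge_weights).
    """
    adj = {head: set(), tail: set()}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    path = shortest_path(adj, head, tail)

    # Grow the kept set outward from the path, one hop per iteration.
    kept = set(path)
    frontier = set(path)
    for _ in range(k):
        frontier = {nb for n in frontier for nb in adj[n]} - kept
        kept |= frontier

    # Keep only edges whose endpoints both survived, and up-weight verb edges.
    weights = {}
    for u, v in edges:
        if u in kept and v in kept:
            touches_verb = pos_tags[u] == "VERB" or pos_tags[v] == "VERB"
            weights[(u, v)] = w if touches_verb else 1.0
    return path, kept, weights


# Hypothetical parse of "APT28 used the malware to attack the bank yesterday".
# Token indices: 0 APT28, 1 used, 2 the, 3 malware, 4 to, 5 attack,
#                6 the, 7 bank, 8 yesterday. Edges are (head, dependent).
EDGES = [(1, 0), (1, 3), (3, 2), (1, 5), (5, 4), (5, 7), (7, 6), (5, 8)]
POS = {0: "PROPN", 1: "VERB", 2: "DET", 3: "NOUN", 4: "PART",
       5: "VERB", 6: "DET", 7: "NOUN", 8: "ADV"}

if __name__ == "__main__":
    path, kept, weights = prune_and_weight(EDGES, POS, head=0, tail=7, k=1, w=2.0)
    print("shortest path:", path)
    print("kept nodes:", sorted(kept))
    print("edge weights:", weights)
```

In this toy parse, the path between "APT28" and "bank" runs through the two verbs, and the determiner attached to "malware" is pruned because it lies two hops from the path. The pruned, weighted edges could then serve as a weighted adjacency matrix for a GCN layer, although the abstract does not specify how the weights enter the model.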