
Improving software bug-specific named entity recognition with deep neural network

Abstract

Bug repositories hold a large volume of bug data that contains rich bug information. Existing studies on bug data mining mainly rely on information retrieval (IR) techniques to search for relevant historical bug reports. These studies essentially treat a bug report as a closed unit, ignoring the semantic and structural information within it. Named-entity recognition (NER) is an important task in information extraction (IE). With NER, fine-grained factual information can be extracted comprehensively and organized into structured data, which provides a new way to improve the accessibility of bug information. However, bug NER differs from general NER tasks: bug reports are free-form text written in a mixed language studded with code, abbreviations, and software-specific vocabulary. In this paper, we propose DBNER, a deep neural network approach for bug-specific entity recognition that uses a bidirectional long short-term memory (LSTM) network with a Conditional Random Field (CRF) decoding model. DBNER extracts multiple features from the massive bug data and uses an attention mechanism to improve the consistency of entity tags in bug reports. Experimental results show that the F1-score reaches an average of 91.19%. In addition, in cross-project experiments, DBNER's F1-score reaches an average of 84%.
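The abstract names CRF decoding on top of a bidirectional LSTM but gives no implementation details. As a rough illustration only, the sketch below shows what a CRF decoding step commonly does: Viterbi search over per-token emission scores (which a BiLSTM would produce) and tag-to-tag transition scores. The tag set (`O`, `B-ENT`, `I-ENT`), the function name `viterbi_decode`, and all scores are assumptions made up for this example; none of them come from the paper, and this is not the authors' DBNER implementation.

```python
from typing import Dict, List, Tuple


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[Tuple[str, str], float],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence.

    emissions[t][tag] is the score of `tag` at token t (hypothetical
    stand-in for BiLSTM outputs). transitions[(prev, cur)] is the score
    of moving from `prev` to `cur`; missing pairs default to 0.0.
    """
    if not emissions:
        return []

    # best[t][tag]: best score of any path that ends in `tag` at token t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []

    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the predecessor that maximizes path score into `cur`.
            prev = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, cur), 0.0),
            )
            scores[cur] = (
                best[-1][prev]
                + transitions.get((prev, cur), 0.0)
                + emissions[t][cur]
            )
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)

    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Made-up scores for a three-token sentence (assumption, not paper data).
TAGS = ["O", "B-ENT", "I-ENT"]
TRANSITIONS = {("O", "I-ENT"): -5.0}  # discourage I-ENT directly after O
EMISSIONS = [
    {"O": 2.0, "B-ENT": 0.5, "I-ENT": 0.0},
    {"O": 0.2, "B-ENT": 1.0, "I-ENT": 1.5},
    {"O": 0.1, "B-ENT": 0.0, "I-ENT": 2.0},
]

print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))
```

With these made-up scores, choosing the best tag for each token independently would give `O, I-ENT, I-ENT`, which starts an entity with an inside tag. The transition penalty steers the joint decoding to `O, B-ENT, I-ENT` instead, which is the kind of tag consistency a CRF layer is typically added to enforce.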

Bibliographic information

  • Source
    The Journal of Systems and Software | 2020, Issue 7 | 110572.1-110572.16 | 16 pages
  • Authors

    Cheng Zhou; Bin Li; Xiaobing Sun;

  • Affiliations

    School of Information Engineering, Yangzhou University, Yangzhou, China; Taizhou University, Taizhou, China;

    School of Information Engineering, Yangzhou University, Yangzhou, China; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China;

    School of Information Engineering, Yangzhou University, Yangzhou, China; Key Laboratory of Safety-Critical Software (Nanjing University of Aeronautics and Astronautics), Ministry of Industry and Information Technology, Nanjing, China;

  • Original format: PDF
  • Language: English
  • Keywords

    Software bug analysis; Named entity recognition; Software bug corpus; LSTM-CRF;
