IEEE International Conference on Software Quality, Reliability, and Security

Identification of Security Related Bug Reports via Text Mining Using Supervised and Unsupervised Classification

Abstract

While many prior works have used text mining to automate tasks related to software bug reports, few have considered security aspects. This paper focuses on automated classification of software bug reports as security related or non-security related, using both supervised and unsupervised approaches. For both approaches, three types of feature vectors are used. For supervised learning, we experiment with multiple classifiers and with training sets of different sizes. Furthermore, we propose a novel unsupervised approach based on anomaly detection. The evaluation is based on three NASA datasets. The results show that supervised classification is affected more by the learning algorithm than by the feature vectors, and that training on only 25% of the data gives results as good as training on 90% of the data. Supervised learning slightly outperforms unsupervised learning, at the expense of having to label the training set. In general, datasets with more security information lead to better performance.
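To make the supervised setting concrete, here is a minimal sketch of classifying bug report text as security related (1) or not (0). The abstract does not name the feature vectors or the classifiers that were evaluated, so the choices below (raw term-frequency counts and a multinomial Naive Bayes classifier with Laplace smoothing) are assumptions for illustration only. The toy reports and the names `TRAIN`, `tokenize`, `train_naive_bayes`, and `predict` are hypothetical and are not taken from the paper or its NASA datasets.

```python
import math
from collections import Counter

# Hypothetical toy bug reports (1 = security related, 0 = not security related).
# These are illustrative only; the paper's NASA datasets are not reproduced here.
TRAIN = [
    ("buffer overflow allows unauthorized memory access", 1),
    ("password stored in plain text in config file", 1),
    ("missing authentication check on admin command", 1),
    ("button label misaligned on settings screen", 0),
    ("telemetry plot shows wrong time units", 0),
    ("crash when log file is empty", 0),
]


def tokenize(text):
    """Lowercase the text and split on whitespace (assumed preprocessing)."""
    return text.lower().split()


def train_naive_bayes(samples):
    """Collect per-class document counts and term frequencies."""
    class_docs = Counter()
    term_counts = {}
    vocab = set()
    for text, label in samples:
        tokens = tokenize(text)
        class_docs[label] += 1
        term_counts.setdefault(label, Counter()).update(tokens)
        vocab.update(tokens)
    return class_docs, term_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior for the given text."""
    class_docs, term_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior from class frequencies in the training set.
        score = math.log(class_docs[label] / total_docs)
        total_terms = sum(term_counts[label].values())
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore terms never seen during training
            # Laplace (add-one) smoothed term likelihood.
            score += math.log(
                (term_counts[label][token] + 1) / (total_terms + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TRAIN)
print(predict(model, "unauthorized access to admin password"))
print(predict(model, "plot label shows wrong units"))
```

Under these assumptions, swapping the term counts for another text representation or the Naive Bayes model for another classifier only changes `train_naive_bayes` and `predict`, which is the kind of comparison across feature vectors and learning algorithms the abstract describes.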
