Journal: International Journal of Software Engineering and Knowledge Engineering

PRST: A PageRank-Based Summarization Technique for Summarizing Bug Reports with Duplicates

Abstract

During software maintenance, bug reports are widely used to improve the quality of a software project. A developer often refers to bug reports stored in a repository when resolving a bug. However, this requires the developer to peruse a substantial amount of textual information in the bug reports, which is time-consuming and tedious. Automatic summarization of bug reports is one way to overcome this problem. Both supervised and unsupervised methods have been proposed for automatically generating summaries of bug reports. However, existing methods disregard the significance of duplicate bug reports when summarizing. In this study, we propose a PageRank-based Summarization Technique (PRST), which utilizes the textual information contained in a bug report together with the additional information in its associated duplicate bug reports. PRST uses three variants of PageRank, based on the Vector Space Model (VSM), Jaccard, and WordNet similarity metrics. These variants are used to calculate the textual similarity between sentences of the master bug report and sentences of its duplicates. PRST further trains a regression model that predicts the probability of each sentence belonging to the summary. Finally, we combine the PageRank values and the regression model scores to rank the sentences and produce the summary for the master bug report. In addition, we construct two corpora of bug reports and their duplicates, namely MBRC and OSCAR. Empirical results suggest that PRST outperforms the state-of-the-art method BRC in terms of Precision, Recall, F-score, and Pyramid Precision. Moreover, PRST with WordNet achieves better results than PRST with VSM or Jaccard.

Keywords: duplicate bug reports; PageRank; summarization; supervised learning
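The abstract only outlines the pipeline, so the following is a minimal Python sketch of the general idea and not the authors' implementation. It uses the Jaccard variant only, builds a similarity graph over all sentences of the master report and its duplicates, runs a plain power-iteration PageRank, and mixes the result with a per-sentence regression score. The function names, the weighted-sum combination, and the parameters `alpha` and `k` are assumptions made for illustration; the regression scores are taken as given inputs because the abstract does not describe the model's features.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def pagerank(sim: List[List[float]], damping: float = 0.85,
             iters: int = 50) -> List[float]:
    """Power-iteration PageRank over a weighted similarity graph.

    Nodes with no edges pass on no rank and keep only the baseline
    (1 - damping) / n, which is enough for this sketch.
    """
    n = len(sim)
    out = [sum(row) for row in sim]  # total outgoing weight per node
    rank = [1.0 / n] * n
    for _ in range(iters):
        new_rank = []
        for i in range(n):
            incoming = sum(rank[j] * sim[j][i] / out[j]
                           for j in range(n) if out[j] > 0)
            new_rank.append((1 - damping) / n + damping * incoming)
        rank = new_rank
    return rank


def summarize(master: List[str], duplicates: List[str],
              regression_scores: List[float],
              alpha: float = 0.5, k: int = 2) -> List[str]:
    """Pick k master sentences by a weighted sum of PageRank and regression.

    ASSUMPTION: the weighted sum and `alpha` are illustrative; the abstract
    only says the two scores are combined. `regression_scores` holds one
    hypothetical probability per master sentence.
    """
    sentences = master + duplicates
    n = len(sentences)
    sim = [[0.0 if i == j else jaccard(sentences[i], sentences[j])
            for j in range(n)] for i in range(n)]
    pr = pagerank(sim)
    # Only master sentences are summary candidates; rescale their ranks to [0, 1].
    top = max(pr[:len(master)]) or 1.0
    combined = [alpha * pr[i] / top + (1 - alpha) * regression_scores[i]
                for i in range(len(master))]
    chosen = sorted(range(len(master)), key=lambda i: combined[i],
                    reverse=True)[:k]
    return [master[i] for i in sorted(chosen)]  # keep original order


if __name__ == "__main__":
    master = ["app crashes when saving file",
              "thanks for looking into this",
              "crash happens after clicking save button"]
    duplicates = ["saving file crashes the app",
                  "crash after clicking save"]
    print(summarize(master, duplicates, [0.5, 0.5, 0.5]))
```

In this toy input the second master sentence shares no words with any other sentence, so it receives only the baseline rank and is left out of the two-sentence summary. Swapping `jaccard` for a VSM cosine or a WordNet-based measure would give the other two variants named in the abstract.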

Bibliographic details

  • Author affiliations

    Key Laboratory for Ubiquitous Network, Service Software of Liaoning Province, School of Software, Dalian University of Technology, Dalian, China;

    Faculty of Information Technology, Monash University, Australia;

    Key Laboratory for Ubiquitous Network, Service Software of Liaoning Province, School of Software, Dalian University of Technology, Dalian, China;

    College of Computer Science and Technology, Harbin Engineering University, Harbin, China;

    Key Laboratory for Ubiquitous Network, Service Software of Liaoning Province, School of Software, Dalian University of Technology, Dalian, China;

  • Original format: PDF
  • Language: English
