Conference: International Conference on Web Research

New Methodology for Contextual Features Usage in Duplicate Bug Reports Detection: Dimension Expansion based on Manhattan Distance Similarity of Topics


Abstract

Duplicate bug report detection is one of the major problems that software triage systems such as Bugzilla face when handling end-user requests. A user request contains categorical and, especially, textual fields that require feature extraction for duplicate detection. Contextual and topical features are obtained by calculating the cosine similarity between the term frequency, inverse document frequency, or BM25F scores of a pair of bug reports against a set of topics. This research proposes computing an individual Manhattan distance similarity for every topic in the contextual features, instead of a cosine similarity, to expand the feature dimension, which can increase the accuracy of duplicate bug report detection. Four well-known bug report datasets, Android, Eclipse, Mozilla, and Open Office, are used to evaluate the proposed method, and the experimental results indicate a performance improvement for four contextual features: general, cryptography, network, and Java topics.
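The difference between the two feature constructions can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: each report is represented by a vector of per-topic scores (for example, BM25F scores against each topic's word list), the score values are made up, and the mapping from distance to similarity, `1 / (1 + |a - b|)`, is a hypothetical choice rather than the authors' definition. The function names are likewise illustrative.

```python
import math


def cosine_similarity(a, b):
    """Baseline: one similarity value for the whole pair of topic-score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention for an all-zero vector
    return dot / (norm_a * norm_b)


def per_topic_manhattan_similarity(a, b):
    """Dimension expansion: one similarity value per topic.

    The Manhattan distance restricted to a single topic is |a_i - b_i|.
    Turning it into a similarity with 1 / (1 + d) is an assumption of
    this sketch, not a formula taken from the paper.
    """
    return [1.0 / (1.0 + abs(x - y)) for x, y in zip(a, b)]


# Hypothetical topic scores of two bug reports against three topics.
report_a = [0.5, 0.0, 2.0]
report_b = [0.5, 1.0, 0.0]

baseline_feature = cosine_similarity(report_a, report_b)
expanded_features = per_topic_manhattan_similarity(report_a, report_b)

print(len(expanded_features))  # one feature per topic instead of a single value
```

With the cosine baseline, a pair of reports contributes a single number per contextual feature, whereas the per-topic variant contributes as many numbers as there are topics, which is the dimension expansion the abstract refers to.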
