Automated Duplicate Bug Report Classification Using Subsequence Matching


Abstract

The use of open bug tracking repositories like Bugzilla is common in many software applications. They allow developers, testers, and users to report problems associated with the system and to track resolution status. Open and democratic reporting tools, however, face one major challenge: users can, and often do, submit reports describing the same problem. Research in duplicate report detection has primarily focused on word-frequency-based similarity measures, paying little regard to the context or structure of the reporting language. Thus, in large repositories, reports describing different issues may be marked as duplicates due to the frequent use of common words. In this paper, we present Factor LCS, a methodology that uses common sequence matching for duplicate report detection. We demonstrate the approach by analyzing the complete Firefox bug repository up to March 2012, as well as a smaller subset of the Eclipse dataset from January 1, 2008 to December 31, 2008. We achieve a duplicate recall rate above 70% with Firefox, which exceeds the results reported on smaller subsets of the same repository.
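To illustrate the general idea of sequence matching that the abstract contrasts with word-frequency measures, the following is a minimal sketch of a word-level longest common subsequence (LCS) similarity between two report texts. It is not the authors' Factor LCS method, whose details are not given in the abstract. The function names `lcs_length` and `lcs_similarity`, the whitespace tokenization, and the normalization by the longer report are all assumptions made for illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)          # DP row for the previous token of a
    for x in a:
        curr = [0]                     # column 0: empty prefix of b
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(report_a, report_b):
    """Hypothetical score: LCS length divided by the longer report's length.

    Tokenization is plain lowercase whitespace splitting (an assumption).
    """
    a = report_a.lower().split()
    b = report_b.lower().split()
    if not a or not b:
        return 0.0
    return lcs_length(a, b) / max(len(a), len(b))


if __name__ == "__main__":
    r1 = "Browser crashes when opening a new tab"
    r2 = "Crashes when opening new tab in browser"
    print(lcs_similarity(r1, r2))
```

Unlike a bag-of-words measure, this score rewards only words that appear in the same relative order in both reports, so two reports that merely share common vocabulary score lower than two that share phrasing.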

Bibliographic details

Source: IEEE 14th International Symposium on High-Assurance Systems Engineering, 2012, pp. 74-81 (8 pages)
Conference location: Omaha, NE (USA)
Authors: Banerjee Sean; Cukic Bojan; Adjeroh Donald
Original format: PDF
Language: English
Chinese Library Classification: Automation systems
