Journal: Software, practice & experience
Using Stack Overflow content to assist in code review


Abstract

An essential goal for programmers is to minimize the cost of identifying and correcting defects in source code. Code review is commonly used for identifying programming defects. However, manual code review has two shortcomings: (1) it is time-consuming, and (2) its outcomes are subjective and depend on the skills of the reviewers. An automated approach for assisting in code reviews is thus highly desirable. We present a tool that assists in code review, along with results from experiments evaluating the tool in different scenarios. The tool leverages content available from professional programmer support forums (e.g., StackOverflow.com) to determine the potential defectiveness of a given piece of source code. The defectiveness is expressed on a three-level scale: {likely defective, neutral, unlikely to be defective}. The basic idea employed in the tool is (1) to identify a set P of discussion posts on Stack Overflow such that each p ∈ P contains source code fragment(s) that sufficiently resemble the input code C being reviewed, and (2) to determine the likelihood of C being defective by considering all p ∈ P. A novel aspect of our approach is the use of document fingerprinting for comparing two pieces of source code. Our choice of document fingerprinting technique is inspired by source code plagiarism detection tools, where it has proven to be very successful. In the experiments that we performed to verify the effectiveness of our approach, source code samples from more than 300 GitHub open-source repositories were taken as input. The approach achieved an F1 score of 0.94 in identifying correct/relevant results.
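To make step (1) more concrete, the following is a minimal sketch of fingerprint-based code comparison. The abstract does not name the specific fingerprinting algorithm, so this sketch substitutes winnowing over character k-grams, a technique known from plagiarism detectors, purely as an assumption. The function names, the parameters `k` and `window`, the use of Jaccard similarity, and the `threshold` value are all hypothetical and not taken from the paper.

```python
import hashlib
import re


def normalize(code: str) -> str:
    """Lowercase and strip all whitespace so layout changes do not affect matching."""
    return re.sub(r"\s+", "", code.lower())


def kgram_hashes(text: str, k: int) -> list[int]:
    """Hash every k-character substring with a stable (non-randomized) hash."""
    return [
        int.from_bytes(hashlib.sha1(text[i:i + k].encode("utf-8")).digest()[:8], "big")
        for i in range(max(len(text) - k + 1, 0))
    ]


def fingerprint(code: str, k: int = 5, window: int = 4) -> set[int]:
    """Winnowing (assumed technique): keep the minimum hash of each sliding window."""
    hashes = kgram_hashes(normalize(code), k)
    if not hashes:
        return set()
    if len(hashes) < window:
        return {min(hashes)}
    return {min(hashes[i:i + window]) for i in range(len(hashes) - window + 1)}


def similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two fingerprint sets, in [0, 1]."""
    fa, fb = fingerprint(a), fingerprint(b)
    if not fa and not fb:
        return 0.0
    return len(fa & fb) / len(fa | fb)


def matching_posts(input_code: str, snippets: dict[str, str],
                   threshold: float = 0.5) -> list[str]:
    """Return ids of posts whose snippet resembles input_code (a stand-in for the set P).

    `snippets` maps a hypothetical post id to the code fragment found in that post.
    """
    return [pid for pid, s in snippets.items()
            if similarity(input_code, s) >= threshold]
```

Step (2), deciding how the posts in P translate into one of the three defectiveness labels, is not described in the abstract and is deliberately left out of this sketch.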
