Proceedings of the Sixth Symposium on Operating Systems Design and Implementation (OSDI '04)

CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code


Abstract

Copy-pasted code is very common in large software because programmers prefer reusing code via copy-paste to reduce programming effort. Recent studies show that copy-paste is prone to introducing bugs and that a significant portion of operating system bugs are concentrated in copy-pasted code. Unfortunately, it is challenging to efficiently identify copy-pasted code in large software. Existing copy-paste detection tools either do not scale to large software or cannot handle small modifications in copy-pasted code. Furthermore, few tools are available to detect copy-paste related bugs.

In this paper we propose a tool, CP-Miner, that uses data mining techniques to efficiently identify copy-pasted code in large software, including operating systems, and detects copy-paste related bugs. Specifically, it takes less than 20 minutes for CP-Miner to identify 190,000 copy-pasted segments in Linux and 150,000 in FreeBSD. Moreover, CP-Miner has detected 28 copy-paste related bugs in the latest version of Linux and 23 in FreeBSD. In addition, we analyze some interesting characteristics of copy-paste in Linux and FreeBSD, including the distribution of copy-pasted code across different lengths, granularities, modules, degrees of modification, and software versions.
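To make concrete what "small modifications in copy-pasted code" can mean, here is a minimal Python sketch of one generic way to spot copied segments even after identifiers have been renamed. It is not CP-Miner's algorithm, which the abstract describes only as using data mining techniques. The function names `normalize` and `find_clones`, the `KEYWORDS` set, and the fixed window size are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical keyword list for illustration; a real tool would use a full C lexer.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "char", "void"}


def normalize(statement):
    """Map identifiers to ID and integer literals to NUM, keeping keywords and punctuation."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", statement)
    out = []
    for tok in tokens:
        if tok in KEYWORDS:
            out.append(tok)
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            out.append("ID")
        elif tok.isdigit():
            out.append("NUM")
        else:
            out.append(tok)
    return " ".join(out)


def find_clones(lines, window=3):
    """Return groups of start indices (counted over non-blank lines) whose
    `window` consecutive normalized statements are identical."""
    normalized = [normalize(line) for line in lines if line.strip()]
    groups = defaultdict(list)
    for i in range(len(normalized) - window + 1):
        key = tuple(normalized[i:i + window])
        groups[key].append(i)
    return [starts for starts in groups.values() if len(starts) > 1]


if __name__ == "__main__":
    code = [
        "for (i = 0; i < n; i++) {",
        "total += a[i];",
        "}",
        "count = 0;",
        "for (j = 0; j < m; j++) {",
        "sum += b[j];",
        "}",
    ]
    print(find_clones(code))
```

In this sketch the two loops are reported as a matching pair even though every variable name differs, because normalization erases the names before comparison. Exact matching over fixed windows does not tolerate inserted or deleted statements, so it illustrates only the renaming case.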
