IEEE Symposium on Visual Languages and Human-Centric Computing

Semantic Clone Detection: Can Source Code Comments Help?

Abstract

Programmers reuse code to increase their productivity, which leads to large fragments of duplicate or near-duplicate code in the code base. Current techniques for detecting semantic clones rely on Program Dependency Graphs (PDGs), which are expensive and resource-intensive to build. PDG-based and other clone detection techniques analyze only the code and completely ignore comments because of the ambiguity of the English language, yet comments carry important domain knowledge for program comprehension. We empirically evaluated the accuracy of detecting clones with both code and comments on a JHotDraw package. Detecting clones from comments with Latent Dirichlet Allocation (LDA) gave 84% precision and 94% recall, while detecting them from a PDG with GRAPLE gave 55% precision and 29% recall. These results indicate that comments can be used to find semantic clones. We recommend using comments with LDA to find clones at the file level and code with a PDG to find clones at the function level. These findings call for reexamining the assumptions behind semantic clone detection techniques.
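
The precision and recall figures quoted in the abstract are the standard ratios of correctly detected clone pairs to all detected pairs and to all reference pairs, respectively. The sketch below shows one way such figures could be computed; the function name `precision_recall` and the file pairs are hypothetical and are not taken from the study, and the abstract does not say how the authors scored their results.

```python
def precision_recall(detected, reference):
    """Return (precision, recall) for detected clone pairs against a reference set."""
    # Treat each clone pair as unordered, so (a, b) and (b, a) count once.
    detected = {frozenset(pair) for pair in detected}
    reference = {frozenset(pair) for pair in reference}
    true_positives = len(detected & reference)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical file-level clone pairs, not data from the study.
reference_pairs = [("A.java", "B.java"), ("C.java", "D.java"), ("E.java", "F.java")]
detected_pairs = [
    ("B.java", "A.java"),
    ("C.java", "D.java"),
    ("A.java", "F.java"),
    ("C.java", "E.java"),
]

precision, recall = precision_recall(detected_pairs, reference_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the four detected pairs appear in the reference set of three, so precision is 2/4 and recall is 2/3.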
