Semantic Clone Detection: Can Source Code Comments Help?

Abstract

Programmers reuse code to increase their productivity, which leads to large fragments of duplicate or near-duplicate code in the code base. Current techniques for detecting semantic clones rely on Program Dependency Graphs (PDGs), which are expensive and resource-intensive. PDG-based and other clone detection techniques analyze code only and have completely ignored comments because of the ambiguity of the English language, yet for program comprehension comments carry important domain knowledge. We empirically evaluated the accuracy of detecting clones with both code and comments on a JHotDraw package. Detecting clones from comments using Latent Dirichlet Allocation (LDA) gave 84% precision and 94% recall, while detecting clones from a PDG using GRAPLE gave 55% precision and 29% recall. These results indicate that comments can be used to find semantic clones. We recommend using comments with LDA to find clones at the file level and code with a PDG to find clones at the function level. These findings call for reexamining the assumptions behind semantic clone detection techniques.
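The precision and recall figures quoted in the abstract are ratios over clone pairs. As a minimal sketch of how such figures are typically computed (not the authors' evaluation code), the snippet below compares a set of detected clone pairs with a reference set. The function name `precision_recall` and all file names are hypothetical and invented for illustration; they are not data from the study.

```python
def precision_recall(detected, reference):
    """Return (precision, recall) for two collections of unordered clone pairs."""
    # Normalise each pair so that (a, b) and (b, a) count as the same clone.
    detected_set = {frozenset(pair) for pair in detected}
    reference_set = {frozenset(pair) for pair in reference}

    # A true positive is a detected pair that also appears in the reference set.
    true_positives = len(detected_set & reference_set)

    # Precision: share of detected pairs that are real clones.
    precision = true_positives / len(detected_set) if detected_set else 0.0
    # Recall: share of real clones that were detected.
    recall = true_positives / len(reference_set) if reference_set else 0.0
    return precision, recall


# Hypothetical file-level clone pairs, invented for illustration only.
reference_pairs = [
    ("A.java", "B.java"),
    ("C.java", "D.java"),
    ("E.java", "F.java"),
    ("G.java", "H.java"),
]
detected_pairs = [
    ("B.java", "A.java"),  # same clone as (A, B), order swapped
    ("C.java", "D.java"),
    ("A.java", "F.java"),  # false positive
]

precision, recall = precision_recall(detected_pairs, reference_pairs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up case two of the three detected pairs are correct and two of the four reference clones are found, so precision is about 0.67 and recall is 0.50. Read the same way, the reported 94% recall for comments with LDA means most reference clones were found, while the 29% recall for the PDG approach means most were missed.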

Bibliographic details

Source: IEEE Symposium on Visual Languages and Human-Centric Computing, 2018, pp. 315-317 (3 pages)
Authors: Akash Ghosh; Sandeep Kaur Kuttal
Original format: PDF
Keywords: Cloning; Semantics; Software maintenance; Software systems; Reverse engineering; Visualization

