SemanticCloneBench: A Semantic Code Clone Benchmark using Crowd-Source Knowledge

Abstract

Not only newly proposed code clone detection techniques but also existing techniques and tools need to be evaluated and compared. This evaluation can be done by manually assessing the reported clones or by using benchmarks. The main limitations of available benchmarks are that they are restricted to one programming language; they have a limited number of clone pairs, confined to the selected system(s); they require manual validation; and they do not support all types of code clones. To overcome these limitations, we propose a methodology to generate a wide range of semantic clone benchmarks for different programming languages with minimal human validation. Our technique is based on the knowledge provided by developers who participate in the crowd-sourced information website Stack Overflow. We apply automatic filtering, selection, and validation to the source code in Stack Overflow answers. Finally, we build a semantic code clone benchmark of 4000 clone pairs for the languages Java, C, C#, and Python.

Bibliographic details

Source: International Workshop on Software Clones, 2020, pp. 57-63 (7 pages)
Authors: Farouq Al-Omari; Chanchal K. Roy; Tonghao Chen
Original format: PDF
Keywords: crowdsourcing; Java; software engineering; software maintenance; software tools; source code (software)
