International Workshop on Software Clones

Clone Detection on Large Scala Codebases

Abstract

Code clones are identical or similar code segments. Their widespread presence can increase maintenance costs and jeopardise software quality. The research community has developed many techniques to detect code clones, but there is little evidence of how these techniques perform in industrial use cases. In this paper, we aim to uncover the differences that arise when such techniques are applied in industrial use cases. We conducted large-scale experimental research on the performance of two state-of-the-art code clone detection techniques, SourcererCC and AutoenCODE, on both open source projects and an industrial project written in the Scala language. Our results reveal that both algorithms perform differently on the industrial project than on the open source projects, with the largest drop in precision being 30.7% and the largest increase in recall being 32.4%. From samples of the industrial project manually labelled by its developers, we discovered that there are substantially fewer Type-3 clones in the industrial project than in the open source projects.
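
The abstract reports its findings as changes in precision and recall. As a minimal sketch of how these two measures are commonly computed for clone detection, the example below compares a set of detected clone pairs against a set of manually labelled pairs. The function names, the representation of a clone as an unordered pair of fragment IDs, and the sample data are all assumptions made for illustration; this is not the evaluation procedure used in the paper.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Pair = FrozenSet[str]


def normalise(pairs: Iterable[Tuple[str, str]]) -> Set[Pair]:
    """Treat (a, b) and (b, a) as the same clone pair; drop self-pairs."""
    return {frozenset(p) for p in pairs if p[0] != p[1]}


def precision_recall(
    detected: Iterable[Tuple[str, str]],
    labelled: Iterable[Tuple[str, str]],
) -> Tuple[float, float]:
    """Return (precision, recall) of detected pairs against labelled pairs."""
    det = normalise(detected)
    ref = normalise(labelled)
    true_positives = len(det & ref)
    # Precision: share of reported pairs that are real clones.
    precision = true_positives / len(det) if det else 0.0
    # Recall: share of real clones that were reported.
    recall = true_positives / len(ref) if ref else 0.0
    return precision, recall


# Hypothetical fragment IDs, invented for this example.
detected = [("A:10", "B:42"), ("C:7", "A:10"), ("D:1", "E:5")]
labelled = [("B:42", "A:10"), ("A:10", "C:7"), ("F:3", "G:9"), ("H:2", "I:8")]
p, r = precision_recall(detected, labelled)
print(f"precision={p:.3f} recall={r:.3f}")
```

Here two of the three detected pairs appear among the four labelled pairs, so precision is 2/3 and recall is 1/2. On large codebases a complete set of labelled pairs is rarely available, so both values are usually estimated from labelled samples.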
