Conference: International Workshop on Software Clones

Clone detection using rolling hashing, suffix trees and dagification: A case study


Abstract

Microsoft Dynamics NAV is a widely used enterprise resource planning system for small and medium-sized enterprises that, by design, encourages rapid customization by copy-paste programming. We report the results of analyzing clone detection for NAV using two previously published methods and one new algorithmic method: character-based sliding window sampling using Rabin-Karp hashing (MOSS), line-based sequence matching using suffix trees (CodeDup), and abstract-syntax-tree based graph sharing analysis (XMLClone). The latter is piggybacked on XMLStore, which stores XML trees as directed acyclic graphs (dags) where all isomorphic subtrees are identified and coalesced into single nodes, which can be done in linear time using multiset discrimination. This dagification discovers all well-formed Type-1 and, with suitable input normalization, Type-2 clones. We find that the subsequent dag analysis to discover Type-3 clones performs well on NAV source code, both in terms of computational complexity and precision. This suggests that efficient dagification and independently configurable dag interpretation may be valuable ingredients for modular clone detection.
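
The dagification step described in the abstract can be illustrated with a minimal sketch. It is not the XMLStore or XMLClone implementation: it swaps multiset discrimination for plain hash-consing with a Python dictionary, which gives expected rather than worst-case linear time. The tree encoding as `(label, children)` pairs and the names `dagify` and `shared_subtrees` are assumptions made for this example only.

```python
def dagify(tree, table, counts):
    """Return a canonical node id for `tree`, a (label, children) pair.

    `table` maps (label, child-id tuple) to a node id, so isomorphic
    subtrees collapse to the same id. `counts` records how many times
    each id occurs in the original tree.
    Hash-consing stand-in for multiset discrimination (an assumption).
    """
    label, children = tree
    key = (label, tuple(dagify(child, table, counts) for child in children))
    node_id = table.setdefault(key, len(table))
    counts[node_id] = counts.get(node_id, 0) + 1
    return node_id


def shared_subtrees(table, counts):
    """List the keys of non-leaf dag nodes that occur more than once."""
    return [key for key, node_id in table.items()
            if counts[node_id] > 1 and key[1]]


# Hypothetical syntax tree in which the same call appears twice.
call = ("call", [("id", []), ("lit", [])])
root = ("block", [call, ("if", [call])])

table, counts = {}, {}
dagify(root, table, counts)
clones = shared_subtrees(table, counts)
```

In this sketch the eight tree nodes collapse to five dag nodes, and the only shared non-leaf node is the repeated `call` subtree, which corresponds to a well-formed Type-1 clone candidate. Normalizing labels before calling `dagify` (for example, replacing every identifier with a placeholder) would make the same sharing expose Type-2 candidates. The function is recursive, so very deep trees would need an iterative traversal.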
