Journal: The Computer Journal

Using Decision Tree Classifiers in Source Code Analysis to Recognize Algorithms: An Experiment with Sorting Algorithms


Abstract

We discuss algorithm recognition (AR) and present a method for recognizing algorithms automatically from Java source code. The method consists of two phases. In the first phase, the algorithms to be recognized are converted into vectors of characteristics, which are computed by static analysis of the program code and include various statistics of language constructs and an analysis of the Roles of Variables in the target program. In the second phase, the algorithms are classified based on these vectors using the C4.5 decision tree classifier. We demonstrate the performance of the method by applying it to sorting algorithms. In an experimental evaluation using leave-one-out cross-validation on a data set of five different types of sorting algorithms, the average classification accuracy was 98.1%. The results show the applicability and usefulness of Roles of Variables in AR and illustrate that the C4.5 algorithm is a suitable decision tree classifier for our purpose. The limitations of the method are also discussed.
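The second phase and the evaluation protocol described above can be illustrated with a minimal sketch. It is not the authors' implementation: it grows a small unpruned decision tree using gain ratio (the split criterion C4.5 is known for) with binary threshold splits on numeric features, and it omits C4.5's pruning and missing-value handling. The feature names, the toy vectors, the class labels, and all function names are hypothetical and chosen only for illustration; they are not the paper's data set.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def best_split(rows, labels):
    """Return (gain_ratio, feature_index, threshold) for the best binary split, or None."""
    base, best = entropy(labels), None
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values,
        # so both sides of every candidate split are non-empty.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            w = len(left) / len(labels)
            gain = base - w * entropy(left) - (1 - w) * entropy(right)
            split_info = -(w * math.log2(w) + (1 - w) * math.log2(1 - w))
            ratio = gain / split_info
            if gain > 1e-12 and (best is None or ratio > best[0]):
                best = (ratio, f, t)
    return best

def build_tree(rows, labels):
    """Grow an unpruned tree. A leaf is a label; a node is (feature, threshold, left, right)."""
    split = best_split(rows, labels) if len(set(labels)) > 1 else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # majority-class leaf
    _, f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t, build_tree(*zip(*left)), build_tree(*zip(*right)))

def classify(tree, row):
    """Walk from the root to a leaf and return the leaf's class label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def leave_one_out_accuracy(rows, labels):
    """Train on all samples but one, test on the held-out sample, repeat for every sample."""
    hits = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        hits += classify(build_tree(train_rows, train_labels), rows[i]) == labels[i]
    return hits / len(rows)

# Hypothetical characteristic vectors (assumed feature set, not from the paper):
# (number of loops, maximum loop nesting depth, recursive calls, "temporary"-role variables)
ROWS = [
    (2, 2, 0, 1), (2, 2, 0, 0),   # labeled "insertion"
    (3, 2, 2, 1), (2, 2, 2, 1),   # labeled "quicksort"
    (3, 1, 2, 0), (4, 1, 2, 0),   # labeled "mergesort"
]
LABELS = ["insertion", "insertion", "quicksort", "quicksort", "mergesort", "mergesort"]

if __name__ == "__main__":
    print("tree:", build_tree(ROWS, LABELS))
    print("leave-one-out accuracy:", leave_one_out_accuracy(ROWS, LABELS))
```

With so few toy samples the leave-one-out figure is not meaningful in itself; the point is only to show how each vector is held out once, classified by a tree trained on the rest, and how the hits are averaged into a single accuracy value.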
