Conference: International Conference on Software Analysis, Evolution, and Reengineering

Beyond support and confidence: Exploring interestingness measures for rule-based specification mining


Abstract

Numerous rule-based specification mining approaches have been proposed in the literature. Many of them analyze a set of execution traces to discover interesting usage rules, e.g., whenever lock() is invoked, unlock() is eventually invoked. These techniques typically enumerate a set of candidate rules, compute an interestingness score for each, and output the rules whose scores exceed a given threshold. Past studies have mostly computed these scores with two well-known measures, support and confidence. Many other interestingness measures have been proposed, however, so it is unclear whether support and confidence are the best choice for specification mining. In this work, we perform an empirical study that investigates the utility of 38 interestingness measures in recovering correct specifications of classes from Java libraries. We use a ground truth dataset of 683 rules, together with execution traces recorded while running the DaCapo test suite, and apply each of the 38 measures to identify the correct rules in a pool of candidate rules. Our study shows that many measures are on par with support and confidence. Some measures are better than support or confidence, and at least one is statistically significantly better than both. We also find that compositions of support with several other measures statistically significantly outperform the composition of support and confidence. Our findings highlight the need to look beyond standard support and confidence to find interesting rules.
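The generate, score, and threshold pipeline described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. The trace-level definitions of support and confidence, the function names (`satisfies`, `score`, `mine`), and the toy traces are all assumptions introduced here. The abstract does not say how the paper computes either measure.

```python
from itertools import permutations

def satisfies(trace, pre, post):
    """True if `pre` occurs and every occurrence is eventually followed by `post`.

    Checking the last occurrence of `pre` is enough: a `post` after the
    last `pre` also comes after every earlier `pre`.
    """
    last_pre = max((i for i, e in enumerate(trace) if e == pre), default=None)
    if last_pre is None:
        return False
    return post in trace[last_pre + 1:]

def score(traces, pre, post):
    """Assumed definitions (hypothetical, for illustration only):
    support    = fraction of all traces that satisfy the rule
    confidence = fraction of traces containing `pre` that satisfy the rule
    """
    with_pre = [t for t in traces if pre in t]
    satisfied = sum(satisfies(t, pre, post) for t in with_pre)
    support = satisfied / len(traces)
    confidence = satisfied / len(with_pre) if with_pre else 0.0
    return support, confidence

def mine(traces, min_support, min_confidence):
    """Enumerate every ordered pair of events as a candidate rule and keep
    the candidates whose scores reach both thresholds."""
    events = sorted({e for t in traces for e in t})
    rules = {}
    for pre, post in permutations(events, 2):
        s, c = score(traces, pre, post)
        if s >= min_support and c >= min_confidence:
            rules[(pre, post)] = (s, c)
    return rules

# Toy execution traces (made up for this example).
traces = [
    ["lock", "read", "unlock"],
    ["lock", "write", "unlock", "lock", "unlock"],
    ["read"],
    ["lock", "read"],  # lock() is never released in this trace
]

print(score(traces, "lock", "unlock"))
print(mine(traces, min_support=0.5, min_confidence=0.6))
```

On these toy traces the thresholds also admit the rule from `lock` to `read`, which is unlikely to be a real specification. Spurious rules of this kind are what make the choice of interestingness measure matter.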
