
Does Principal Component Analysis Improve Cluster-Based Analysis?



Abstract

Researchers in the field of dynamic program analysis have used cluster analysis extensively to address a variety of problems. Typically, the clustering techniques are applied to execution profiles of high dimensionality (i.e., involving a large number of profiling elements), sometimes on the order of thousands or even hundreds of thousands. Our concern is that such a large number of profiling elements might diminish the effectiveness of the clustering process, which led us to explore dimensionality reduction as a preprocessing step before clustering. Specifically, in this work we used PCA (Principal Component Analysis) as the dimensionality reduction technique and investigated its impact on two cluster-based analysis techniques, one aimed at identifying coincidentally correct tests and the other at test suite minimization. In other words, we tried to assess whether PCA improves cluster-based analysis. Our experimental results showed that the impact was positive on the first technique but inconclusive on the second, which calls for further investigation.
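To make the preprocessing idea concrete, here is a minimal sketch of PCA followed by clustering. It is not the authors' experimental setup: the abstract does not name a clustering algorithm, so k-means with farthest-first initialization is an assumption, and the execution profiles, the sizes (60 tests, 500 profiling elements), the number of retained components, and the helper names `pca_reduce` and `kmeans` are all hypothetical.

```python
import numpy as np

def pca_reduce(profiles, n_components):
    """Project the rows of `profiles` onto their top principal components."""
    centered = profiles - profiles.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with deterministic farthest-first initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen center.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster became empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Hypothetical data: 60 tests, 500 binary profiling elements, two behaviours.
rng = np.random.default_rng(42)
n_tests, n_elements = 60, 500
base = rng.random((2, n_elements)) < 0.3          # two synthetic coverage patterns
truth = np.repeat([0, 1], n_tests // 2)           # which pattern each test follows
noise = rng.random((n_tests, n_elements)) < 0.05  # random per-test deviations
profiles = (base[truth] ^ noise).astype(float)

reduced = pca_reduce(profiles, n_components=5)    # 500 dimensions -> 5
labels = kmeans(reduced, k=2)

# Fraction of tests whose cluster matches the synthetic ground truth,
# allowing for the two cluster labels being swapped.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
print(reduced.shape, agreement)
```

On real execution profiles, the number of components to retain and the number of clusters would have to be chosen empirically. The synthetic groups here are deliberately well separated, so this sketch says nothing about whether PCA helps in practice.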
