Journal: International Journal of Software Engineering and Knowledge Engineering

Predicting Code Hotspots in Open-Source Software from Object-Oriented Metrics Using Machine Learning

Abstract

Software engineers can measure the quality of their code with a variety of metrics derived directly from analysis of the source code. These internal quality metrics are valuable to engineers, but the organizations funding software development place more value on external quality metrics such as defect rates and the time needed to develop features. Unfortunately, external quality metrics can only be calculated after costly software has been developed and deployed to end users. Here, we present a method for mining data from freely available open-source codebases written in Java to train a Random Forest classifier that predicts, with over 75% accuracy, which files are likely to be external quality hotspots based on their internal quality metrics. We also used the trained model to predict hotspots for a Java project whose data was not used to train the classifier and again achieved over 75% accuracy, demonstrating the method's general applicability to different projects.
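
The train-on-some-projects, test-on-an-unseen-project workflow described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' setup: the three per-file metrics (named `wmc`, `cbo`, `loc` in the comments), the 0/1 hotspot labels, and all data values are hypothetical and synthetic, and the small pure-Python random forest (bootstrap samples plus a random feature subset at each split) stands in for whatever implementation the paper actually used.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, features):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in features:
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, rng, n_features, depth=0, max_depth=3):
    """Grow one tree; a leaf is a label, an inner node is (feature, threshold, left, right)."""
    majority = Counter(labels).most_common(1)[0][0]
    if depth >= max_depth or len(set(labels)) == 1:
        return majority
    features = rng.sample(range(len(rows[0])), n_features)  # random feature subset
    split = best_split(rows, labels, features)
    if split is None:
        return majority
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [y for _, y in left],
                       rng, n_features, depth + 1, max_depth),
            build_tree([r for r, _ in right], [y for _, y in right],
                       rng, n_features, depth + 1, max_depth))

def predict_tree(node, row):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees trees, each on a bootstrap sample of the training files."""
    rng = random.Random(seed)
    n_features = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx],
                                 [labels[i] for i in idx], rng, n_features))
    return forest

def predict(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]

def accuracy(forest, rows, labels):
    """Fraction of files whose predicted label matches the true label."""
    return sum(predict(forest, r) == y for r, y in zip(rows, labels)) / len(labels)

# Hypothetical, synthetic per-file metrics: (wmc, cbo, loc). Label 1 = hotspot.
train_rows = [(3, 2, 40), (5, 1, 60), (4, 3, 55), (6, 2, 80), (2, 1, 30), (7, 3, 90),
              (30, 14, 700), (35, 18, 900), (28, 12, 650), (40, 20, 1200),
              (33, 15, 800), (38, 17, 1000)]
train_labels = [0] * 6 + [1] * 6

# Files from a separate, equally hypothetical project never seen during training.
test_rows = [(4, 2, 50), (8, 4, 120), (6, 3, 75), (31, 13, 720), (45, 22, 1500), (26, 12, 660)]
test_labels = [0, 0, 0, 1, 1, 1]

if __name__ == "__main__":
    forest = train_forest(train_rows, train_labels)
    print("held-out project accuracy:", accuracy(forest, test_rows, test_labels))
```

The toy data is cleanly separable by construction, so the score it prints says nothing about the accuracy reported in the abstract. The point is only the shape of the evaluation: the classifier is fitted on files from one set of projects and scored on files from a project it has never seen.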