Conference: IEEE International Conference on Software Analysis, Evolution and Reengineering

MSR4ML: Reconstructing Artifact Traceability in Machine Learning Repositories



Abstract

The increasing popularity of Machine Learning (ML) also creates challenges for developers. The multitude of programming languages, libraries, and available resources allows them to easily build their own models or algorithms. However, ML models are tightly coupled to their data, which implies a development process different from that of other types of software. Software projects often rely on version control platforms such as GitHub, but these platforms have not yet been extended to support ML projects: support for data versioning is poor, and there is no link between ML and software artifacts. As a result, traceability and tracking model evolution can become challenging for developers. While some ML-specific platforms exist, they still require considerable manual specification of ML artifacts and the links between them. In this work, we propose a framework for the automatic identification and traceability of links between data, code, and ML models through Mining Software Repositories (MSR) techniques. Our tool combines static code analysis with mining of commit data to identify ML, code, and data artifacts, reconstruct the links between them, and retrieve the commits that affect each end of a link. The objective is to increase productivity and developers' awareness of their project through the recovered traceability.
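To make the idea of combining static code analysis with commit mining more concrete, the following is a minimal sketch and not the authors' tool. It assumes artifacts can be recognized by file extension (the `DATA_EXTS` and `MODEL_EXTS` sets are hypothetical), scans Python source for string literals passed to calls, and uses the standard `git log` command to list commits that touched a path. All function names here are illustrative.

```python
import ast
import subprocess
from pathlib import PurePosixPath

# Hypothetical extension lists; a real tool would need a richer classifier.
DATA_EXTS = {".csv", ".json", ".parquet", ".npy"}
MODEL_EXTS = {".pkl", ".joblib", ".h5", ".pt", ".onnx"}


def classify(path):
    """Return 'data', 'model', or None based on the file extension."""
    ext = PurePosixPath(path).suffix.lower()
    if ext in DATA_EXTS:
        return "data"
    if ext in MODEL_EXTS:
        return "model"
    return None


def find_artifact_links(source, code_file):
    """Statically scan Python source for string literals passed to calls
    and return (code_file, artifact_kind, artifact_path) triples."""
    links = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        args = list(node.args) + [kw.value for kw in node.keywords]
        for arg in args:
            if isinstance(arg, ast.Constant) and isinstance(arg.value, str):
                kind = classify(arg.value)
                if kind:
                    links.append((code_file, kind, arg.value))
    return links


def commits_touching(path, repo="."):
    """List hashes of commits that modified `path` (requires git and a repo)."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--format=%H", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return out.split()


# Example: a training script that reads a dataset and saves a model.
example = '''
import pandas as pd, joblib
df = pd.read_csv("data/train.csv")
joblib.dump(model, "models/clf.joblib")
'''
links = find_artifact_links(example, "train.py")
print(links)
```

In a real repository, calling `commits_touching` on the code file and on each linked artifact path would give the commits affecting each end of a recovered link. Paths built dynamically (for example via string concatenation or configuration files) would be missed by this literal-only scan.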
