
Automated Localization for Unreproducible Builds


Abstract

Reproducibility is the ability to recreate identical binaries under pre-defined build environments. Driven by the need for quality assurance and by the benefit of better detecting attacks against build environments, the practice of reproducible builds has gained popularity in many open-source software repositories such as Debian and Bitcoin. However, identifying the causes of unreproducible builds remains a labour-intensive and time-consuming challenge, because little information is available to guide the search and because the causes that may lead to unreproducible binaries are diverse. In this paper we propose an automated framework called RepLoc to localize the problematic files for unreproducible builds. RepLoc features a query augmentation component that utilizes the information extracted from the build logs, and a heuristic rule-based filtering component that narrows the search scope. By integrating the two components with a weighted file ranking module, RepLoc automatically produces a ranked list of files that helps in locating the problematic files for unreproducible builds. We have implemented a prototype and conducted extensive experiments over 671 real-world unreproducible Debian packages in four different categories. By considering the topmost ranked file only, RepLoc achieves an accuracy rate of 47.09%. If we expand our examination to the top ten ranked files in the list produced by RepLoc, the accuracy rate becomes 79.28%. Considering that a package contains hundreds of source files, scripts, Makefiles, etc., RepLoc significantly reduces the scope of localizing problematic files. Moreover, with the help of RepLoc, we successfully identified and fixed six new unreproducible packages from Debian and Guix.
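
The abstract names three parts (query augmentation from build logs, heuristic rule-based filtering, and weighted file ranking) without giving their formulas. The sketch below is only an illustration of how such parts could fit together, not RepLoc's actual implementation. Everything in it is an assumption made for the example: the function names `rank_files` and `top_k_accuracy`, the bag-of-words cosine similarity, the regex rule format, the weight `alpha`, and the reading of "accuracy rate" as the fraction of packages with at least one problematic file among the top k ranked files.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z_]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def rank_files(files, query, log_terms, rules, alpha=0.5):
    """Rank files by a weighted mix of text similarity and rule hits.

    files     -- dict mapping path -> file content
    query     -- text describing the observed difference (hypothetical input)
    log_terms -- extra terms taken from the build log (query augmentation)
    rules     -- regex patterns standing in for heuristic rules
    alpha     -- weight of the similarity score; an assumed parameter
    """
    augmented = Counter(tokenize(query) + tokenize(log_terms))
    scored = []
    for path, content in files.items():
        similarity = cosine(augmented, Counter(tokenize(content)))
        rule_hit = 1.0 if any(re.search(rule, content) for rule in rules) else 0.0
        scored.append((alpha * similarity + (1 - alpha) * rule_hit, path))
    # Highest score first; ties are broken by path so the order is stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [path for _, path in scored]


def top_k_accuracy(rankings, truths, k):
    """Fraction of packages whose top-k files include a known problematic file."""
    hits = sum(1 for ranked, truth in zip(rankings, truths) if truth & set(ranked[:k]))
    return hits / len(rankings)
```

With `k=1` and `k=10`, `top_k_accuracy` corresponds to the two settings reported in the abstract (topmost file only, and top ten files), under the assumed definition above.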
