Journal: SoftwareX

Cnerator: A Python application for the controlled stochastic generation of standard C source code


Abstract

The Big Code and Mining Software Repositories research fields analyze large amounts of source code to improve software engineering practices. Massive codebases are used to train machine learning models aimed at improving the software development process. One example is decompilation, where C source code paired with its compiled binaries can be used to train machine learning models that improve decompilers. However, obtaining massive codebases of portable C code is not easy, since most applications depend on particular libraries, operating systems, or language extensions. In this paper, we present Cnerator, a Python application for the stochastic generation of large amounts of standard C code. It is highly configurable, allowing the user to specify the probability distribution of each language construct, properties of the generated code, and post-processing modifications of the output programs. Cnerator has been successfully used to generate code for training machine learning models, which improved the performance of existing decompilers. It has also been used to implement an infrastructure for the automatic extraction of code patterns.
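As a rough illustration of what "specifying the probability distribution of each language construct" can mean, the following minimal Python sketch picks C statement kinds according to user-supplied weights and emits a small standard C function. It is not Cnerator's actual API or configuration format: the names `STATEMENT_WEIGHTS`, `pick_construct`, `gen_statement`, and `gen_function`, as well as the set of constructs and their weights, are assumptions made up for this example.

```python
import random

# Hypothetical construct names and weights (assumptions for illustration,
# not Cnerator's real configuration).
STATEMENT_WEIGHTS = {
    "assignment": 0.6,
    "if": 0.3,
    "return": 0.1,
}


def pick_construct(weights, rng):
    """Draw one construct name according to the given probability weights."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def gen_statement(rng, depth=0, max_depth=2):
    """Generate one C statement; nesting is capped at max_depth."""
    kind = pick_construct(STATEMENT_WEIGHTS, rng)
    if depth >= max_depth and kind == "if":
        kind = "assignment"  # stop nesting once the depth limit is reached
    indent = "    " * (depth + 1)
    if kind == "assignment":
        return f"{indent}x = x + {rng.randint(1, 9)};"
    if kind == "return":
        return f"{indent}return x;"
    # kind == "if": wrap a recursively generated statement in a block
    body = gen_statement(rng, depth + 1, max_depth)
    return f"{indent}if (x > {rng.randint(0, 9)}) {{\n{body}\n{indent}}}"


def gen_function(name, n_statements, seed=None):
    """Generate the source text of a C function 'int name(int x)'."""
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    lines = [f"int {name}(int x) {{"]
    lines += [gen_statement(rng) for _ in range(n_statements)]
    lines.append("    return x;")  # guarantee a return on every path
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(gen_function("f", 5, seed=42))
```

Changing the weights shifts the mix of constructs in the generated code, and fixing the seed reproduces the same program. A real generator of the kind the abstract describes would also have to cover types, expressions, declarations, and the other language constructs.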
