Code Type Revealing Using Experiments Framework

Abstract

Identifying the type of a piece of code, whether in a file or a byte stream, is a challenge that many software companies face. Many applications, in security and other domains, base their behavior on the type of code they receive as input. Traditional identification methods rely on file extensions, magic numbers, proprietary headers and trailers, or type-specific identification rules. All of these are vulnerable to content tampering, and detecting such tampering requires long and tedious hours of work by professionals. This study aims to find a method for identifying the best settings for automatically creating type signatures that effectively overcome the content manipulation problem. In this paper we lay out a framework for creating type signatures based on byte N-grams. The framework allows setting various parameters such as N-gram sizes and windows, selecting statistical tests, and defining rules for score calculation. It serves as a test lab for finding the parameters that satisfy a predefined threshold of type identification accuracy. We demonstrate the framework using basic settings that achieved an F-measure of 0.996 on 1400 test files.
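To make the idea of a byte N-gram type signature concrete, here is a minimal Python sketch. It is not the framework from the paper: the function names, the averaging of per-file profiles into a signature, and the use of cosine similarity as the scoring rule are all assumptions chosen for illustration, since the abstract does not say which statistical tests or score rules the authors used. The `n` and `window` parameters stand in for the N-gram size and window settings the abstract mentions.

```python
from math import sqrt
from collections import Counter
from typing import Dict, Iterable, Optional

Profile = Dict[bytes, float]


def ngram_profile(data: bytes, n: int = 2, window: Optional[int] = None) -> Profile:
    """Relative frequency of each byte n-gram in the first `window` bytes."""
    if window is not None:
        data = data[:window]
    counts = Counter(data[i:i + n] for i in range(len(data) - n + 1))
    total = sum(counts.values())
    if total == 0:  # input shorter than n bytes
        return {}
    return {gram: count / total for gram, count in counts.items()}


def build_signature(samples: Iterable[bytes], n: int = 2,
                    window: Optional[int] = None) -> Profile:
    """Average the n-gram profiles of known samples (an assumed rule)."""
    summed: Dict[bytes, float] = {}
    num_samples = 0
    for sample in samples:
        for gram, freq in ngram_profile(sample, n, window).items():
            summed[gram] = summed.get(gram, 0.0) + freq
        num_samples += 1
    if num_samples == 0:
        return {}
    return {gram: value / num_samples for gram, value in summed.items()}


def cosine_similarity(a: Profile, b: Profile) -> float:
    """Cosine similarity of two sparse profiles; 0.0 if either is empty."""
    dot = sum(value * b.get(gram, 0.0) for gram, value in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(data: bytes, signatures: Dict[str, Profile], n: int = 2,
             window: Optional[int] = None) -> str:
    """Return the label whose signature scores highest against `data`."""
    profile = ngram_profile(data, n, window)
    return max(signatures,
               key=lambda label: cosine_similarity(profile, signatures[label]))


# Toy training data, invented for this example only.
signatures = {
    "html": build_signature([b"<html><body>hello</body></html>",
                             b"<html><head></head></html>"]),
    "c": build_signature([b"int main(void) { return 0; }",
                          b"int add(int a, int b) { return a + b; }"]),
}
print(classify(b"<html><body>world</body></html>", signatures))
```

Because the signature is derived from the content itself rather than from an extension or a header, renaming a file or altering its magic number does not change the profile much, which is the property the abstract is after. A real experiment would vary `n`, `window`, and the scoring rule and compare the resulting F-measure against the target threshold.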