Encoding machine code instructions for static feature based malware clustering

Abstract

Machine language instruction sequences of computer files are extracted and encoded into standardized opcode sequences. The standardized opcodes in the sequences are of the same length and do not include operands. A multi-dimension vector is generated as a static feature for each computer file, where each element in the vector corresponds to the number of occurrences of a unique N-gram (i.e., unique sequence of N consecutive standardized opcodes) in the standardized opcode sequence for that computer file. The computer files are grouped into clusters of similarly classified files based on the similarity of their static features. An unknown computer file can be classified by first assigning the file to a cluster of files with similar static features (e.g., the cluster with the shortest average distance), and then determining the classification of that file based on the classifications of other files that belong to the same cluster.
