Journal: CAAI Transactions on Intelligence Technology

Deobfuscation, unpacking, and decoding of obfuscated malicious JavaScript for machine learning models detection performance improvement



Abstract

Obfuscation is rampant in both benign and malicious JavaScript (JS) code. It produces obscure code that evades detection and hinders comprehension and analysis. Accurate detection of malicious JS code that masquerades as innocuous scripts is therefore vital. Existing deobfuscation methods assume that a specific tool can recover the original JS code entirely. For multi-layer obfuscation, general tools produce formatted JS code, but some sections remain encoded. To detect such code, this study performs Deobfuscation, Unpacking, and Decoding (DUD-preprocessing) by function redefinition using a Virtual Machine (VM), a JS code editor, and a Python int_to_str() function to facilitate feature learning by the FastText model. The learned feature vectors are passed to a classifier model that judges the maliciousness of the JS code. In the performance evaluation, the authors use Hynek Petrak's dataset for obfuscated malicious JS code, and the SRILAB dataset and the top 10,000 websites from the Majestic Million service for obfuscated benign JS code. They then compare the performance with other models on the detection of DUD-preprocessed obfuscated malicious JS code. The experimental results show that the proposed approach enhances feature learning and improves accuracy in detecting obfuscated malicious JS code.
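
To make the decoding step more concrete, the following is a minimal Python sketch of one way integer-encoded fragments could be turned back into text before feature learning. It is not the authors' int_to_str() function, whose signature and behavior the abstract does not give. The helper names `ints_to_text` and `decode_char_code_calls` are hypothetical, and the sketch assumes the encoded sections appear as `String.fromCharCode(...)` calls with decimal integer arguments, each of which is a single Unicode code point.

```python
import json
import re

# Assumption: encoded sections look like String.fromCharCode(97, 108, ...)
# with decimal integer literals only. Other encodings are left untouched.
_CHAR_CODE_CALL = re.compile(
    r"String\.fromCharCode\(\s*(\d+(?:\s*,\s*\d+)*)\s*\)"
)


def ints_to_text(codes):
    """Map an iterable of integer code points to the string they spell."""
    return "".join(chr(code) for code in codes)


def decode_char_code_calls(js_source):
    """Replace each matched String.fromCharCode(...) call with a string literal."""

    def _replace(match):
        # int() tolerates the whitespace that may surround each number.
        codes = [int(token) for token in match.group(1).split(",")]
        # json.dumps gives a double-quoted, escaped literal.
        return json.dumps(ints_to_text(codes))

    return _CHAR_CODE_CALL.sub(_replace, js_source)


if __name__ == "__main__":
    sample = "eval(String.fromCharCode(97, 108, 101, 114, 116, 40, 49, 41));"
    print(decode_char_code_calls(sample))
```

A sketch like this handles only a single, statically visible encoding layer. Multi-layer obfuscation of the kind the abstract describes would need the decoded output to be fed back through the deobfuscation and unpacking steps.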
