Deobfuscation, unpacking, and decoding of obfuscated malicious JavaScript for machine learning models detection performance improvement

Samuel Ndichu; Sangwook Kim; Seiichi Ozawa

Abstract

Obfuscation is rampant in both benign and malicious JavaScript (JS) code. It generates obscure, hard-to-detect code that hinders comprehension and analysis. Therefore, accurate detection of JS code that masquerades as innocuous scripts is vital. Existing deobfuscation methods assume that a specific tool can recover the original JS code entirely. For multi-layer obfuscation, general tools yield formatted JS code, but some sections remain encoded. To detect such code, this study performs Deobfuscation, Unpacking, and Decoding (DUD-preprocessing) by function redefinition using a Virtual Machine (VM), a JS code editor, and a Python int_to_str() function to facilitate feature learning by the FastText model. The learned feature vectors are passed to a classifier model that judges the maliciousness of the JS code. In the performance evaluation, the authors use Hynek Petrak's dataset for obfuscated malicious JS code, and the SRILAB dataset and the top 10,000 websites from the Majestic Million service for obfuscated benign JS code. They then compare the performance with that of other models on the detection of DUD-preprocessed obfuscated malicious JS code. Their experimental results show that the proposed approach enhances feature learning and improves accuracy in the detection of obfuscated malicious JS code.
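As a rough illustration of the decoding step described above, the following Python sketch repeatedly rewrites two common encodings (String.fromCharCode(...) calls and \xNN hex escapes) until the source stops changing. It is not the authors' implementation: the abstract names an int_to_str() function but does not define it, so the int_to_str helper here, along with decode_layer, decode_all, and the max_passes parameter, are assumptions made for illustration only.

```python
import re


def int_to_str(codes):
    """Hypothetical helper (assumed behavior): map integer character codes to text."""
    return "".join(chr(code) for code in codes)


# Matches String.fromCharCode(104, 105) with decimal integer arguments only.
_FROM_CHAR_CODE = re.compile(r"String\.fromCharCode\(\s*(\d+(?:\s*,\s*\d+)*)\s*\)")
# Matches JS hex escapes such as \x68.
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def decode_layer(js_source):
    """Apply one decoding pass over a JS source string."""

    def replace_char_codes(match):
        codes = [int(part) for part in match.group(1).split(",")]
        # Emit a quoted literal; quotes inside the decoded text are not escaped
        # in this sketch.
        return '"' + int_to_str(codes) + '"'

    decoded = _FROM_CHAR_CODE.sub(replace_char_codes, js_source)
    decoded = _HEX_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), decoded)
    return decoded


def decode_all(js_source, max_passes=10):
    """Repeat decoding until nothing changes, to handle multi-layer encoding."""
    for _ in range(max_passes):
        decoded = decode_layer(js_source)
        if decoded == js_source:
            break
        js_source = decoded
    return js_source
```

For example, decode_all('eval(String.fromCharCode(97,108,101,114,116))') returns 'eval("alert")'. The sketch only rewrites text and never executes the script; the function redefinition inside a VM that the abstract describes is a different mechanism and is not modeled here.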

Bibliographic details

Source: CAAI Transactions on Intelligence Technology, 2020, Issue 3, pp. 184-192 (9 pages)
Authors: Samuel Ndichu; Sangwook Kim; Seiichi Ozawa
Affiliation: Graduate School of Engineering, Kobe University, Kobe City, Japan
Original format: PDF
Language: English
Keywords:
invasive software; Java; Internet; feature extraction; text analysis; vectors; learning (artificial intelligence);

