A Phishing Webpage Detection Method Based on Stacked Autoencoder and Correlation Coefficients

Jian Feng; Lianyang Zou; Tianzhu Nan


Abstract

Phishing is a cyber-attack that tricks unsuspecting online users into revealing sensitive information. Many anti-phishing solutions have been proposed to date, including blacklists and whitelists, heuristic-based methods, and machine learning-based methods. Nevertheless, users are still lured into revealing sensitive information on phishing websites. In this paper, we propose a novel phishing webpage detection model that uses a deep learning method, the Stacked Autoencoder (SAE), to detect phishing webpages. The model is based on features extracted from the URL, the HTML source code, and third-party services, which together represent the basic characteristics of phishing webpages. To bring the features to the same order of magnitude, three kinds of normalization methods are adopted. In particular, we propose a method that calculates correlation coefficients between the weight matrices of the SAE to determine the optimal width of the hidden layers; the method is computationally efficient and feasible. In tests on a set of phishing and benign webpages, the model using SAE achieves the best performance when compared with other algorithms such as Naive Bayes (NB), Support Vector Machine (SVM), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). These results indicate that the proposed detection model is promising and can be applied effectively to phishing detection.
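
The abstract does not say which three normalization methods are used or how exactly the correlation coefficients are computed from the SAE weight matrices. The following is therefore only an illustrative sketch, not the authors' method. It assumes three common normalizations (min-max, z-score, and log scaling) and, as one possible reading, measures the mean absolute Pearson correlation between the weight vectors of hidden units, on the assumption that highly correlated units suggest a redundant (too wide) hidden layer. All function names are hypothetical.

```python
import numpy as np


def min_max(x):
    """Scale each feature column to [0, 1] (assumed normalization #1)."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (x - lo) / span


def z_score(x):
    """Center each column to mean 0, unit variance (assumed normalization #2)."""
    mu, sigma = x.mean(axis=0), x.std(axis=0)
    return (x - mu) / np.where(sigma > 0, sigma, 1.0)


def log_scale(x):
    """Compress large non-negative values with log(1 + x) (assumed normalization #3)."""
    return np.log1p(x)


def mean_abs_unit_correlation(w):
    """Mean |Pearson r| over all pairs of hidden units.

    w has shape (n_inputs, n_hidden); each column is one hidden unit's
    incoming weight vector. This redundancy measure is an assumption,
    not the procedure described in the paper.
    """
    corr = np.corrcoef(w, rowvar=False)   # (n_hidden, n_hidden)
    upper = np.triu_indices_from(corr, k=1)  # each pair counted once
    return float(np.abs(corr[upper]).mean())
```

Under this reading, one would train SAEs with several candidate hidden-layer widths and compare the resulting scores, preferring a width at which the units are not yet strongly correlated. The selection rule itself is not given in the abstract.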

Bibliographic details

Source
Journal of Computing and Information Technology, 2019, Issue 2, pp. 41-54 (14 pages)
Author affiliation
College of Computer Science and Technology, Xi'an University of Science and Technology, Xi'an, China
Original format
PDF
Language
English
Keywords
phishing; deep learning; correlation coefficient
