Journal of Theoretical and Applied Information Technology

CLASSIFICATION MODEL BASED ON URL AND CONTENT FEATURE APPROACH FOR DETECTION PHISHING WEBSITE IN INDONESIA



Abstract

This research proposes a classification model for accurately detecting phishing websites. Indonesia serves as the case study: the data consist of sites written in Bahasa Indonesia, hosted in Indonesia, and frequently accessed by Indonesian Internet users. The dataset comprises approximately 102 authentic websites and 364 phishing websites. The proposed detection technique analyzes each website using a URL- and content-feature-based approach. The model combines several heterogeneous features from previous research with newly proposed URL and content features, which are expected to improve detection performance over earlier work. A web crawler was built to extract the feature vectors. Four algorithms are evaluated: Sequential Minimal Optimization (SMO), Naive Bayes, Bagging, and Multilayer Perceptron, which reach accuracies of approximately 89.27%, 93.78%, 95.49%, and 92.70%, respectively. Bagging, the most accurate of the four, is adopted for the classification model and compared against the classification model from previous research on the same dataset. The proposed model outperforms the previous one, which yielded only 89.70% accuracy, by 5.79 percentage points.
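The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not the authors' code; it only recomputes the reported accuracy gap and, as added context, the accuracy of a trivial classifier that labels every site as phishing. It assumes the "approximately" 102 and 364 counts are exact, and the names `summarize`, `ACCURACY_PCT`, and `PREVIOUS_MODEL_PCT` are hypothetical.

```python
# Figures quoted in the abstract. Treating the "approximately" 102 + 364
# site counts as exact is an assumption made for this illustration.
N_AUTHENTIC = 102
N_PHISHING = 364
ACCURACY_PCT = {
    "SMO": 89.27,
    "Naive Bayes": 93.78,
    "Bagging": 95.49,
    "Multilayer Perceptron": 92.70,
}
PREVIOUS_MODEL_PCT = 89.70  # accuracy reported for the earlier model


def summarize():
    """Recompute the headline numbers from the reported figures."""
    total = N_AUTHENTIC + N_PHISHING
    # Algorithm with the highest reported accuracy.
    best = max(ACCURACY_PCT, key=ACCURACY_PCT.get)
    # Gap between the best algorithm and the earlier model, in percentage points.
    gain = round(ACCURACY_PCT[best] - PREVIOUS_MODEL_PCT, 2)
    # Accuracy of always predicting the larger class (phishing).
    majority = round(100 * max(N_AUTHENTIC, N_PHISHING) / total, 2)
    return {
        "total_sites": total,
        "best_algorithm": best,
        "gain_points": gain,
        "majority_baseline_pct": majority,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

Under these assumptions the phishing class makes up most of the dataset, so the reported accuracies are best read against the majority-class baseline rather than against 50%.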
