A Machine-learning based Unbiased Phishing Detection Approach

Abstract

Phishing websites mimic a legitimate website to capture sensitive information of users. Machine learning is often used to detect phishing websites. In current machine-learning based approaches, the phishing and the genuine sites are classified into two groups based on some features. We feel that this is an inadequate modeling of the problem as the characteristics of different phishing websites may vary widely. Moreover, the current approaches are biased towards groups of over-represented samples. Most importantly, as new features are exploited, the training set must be updated to detect new phishing sites. There is a time lag between the evolution of new phishing sites and retraining of the model, which can be exploited by attackers. We provide an alternative approach that aims to solve the above-mentioned problems. Instead of finding commonalities among non-related genuine websites, we find similarity of a suspicious website to a legitimate target and use machine learning to decide whether the suspicious site is impersonating the target. We define the fingerprint of a legitimate website by using visual and textual characteristics against which a sample is compared to ascertain whether it is fake. We implemented our approach on 14 legitimate websites and tested against 1446 unique samples. Our model reported an accuracy of at least 98% and it is not biased towards any website. This is in contrast to the current machine learning models that may be biased towards groups of over-represented samples and lead to more false-negative errors for less popular websites.
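The target-based comparison described in the abstract can be illustrated with a minimal sketch. It covers only the textual side, and everything in it is an assumption made for illustration rather than the authors' implementation: the `Fingerprint` class, the Jaccard token overlap, the example domains, and the fixed `threshold` that stands in for the trained classifier are all hypothetical. The abstract's visual characteristics are not modeled here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """Hypothetical textual fingerprint of one legitimate target site."""
    domain: str
    terms: frozenset


def tokenize(text: str) -> frozenset:
    """Lowercase the text and keep its alphanumeric word tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two token sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def looks_like_impersonation(fp: Fingerprint, sample_domain: str,
                             sample_text: str, threshold: float = 0.5) -> bool:
    """Flag a page that resembles the target but is served from another domain.

    The fixed threshold is an assumption standing in for a learned decision.
    """
    if sample_domain == fp.domain:
        return False  # the target itself is not an impersonation
    return jaccard(fp.terms, tokenize(sample_text)) >= threshold


# Hypothetical target and samples, for illustration only.
target = Fingerprint("example-bank.com",
                     tokenize("Example Bank sign in to online banking"))

suspicious = looks_like_impersonation(
    target, "examp1e-bank.net", "Example Bank online banking sign in")
unrelated = looks_like_impersonation(
    target, "news.example.org", "Weather forecast for the weekend")
```

Because each sample is compared against one specific target rather than against a pooled class of genuine sites, the decision for a less popular target does not depend on how many samples other targets contribute.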


Bibliographic details

Source
International Conference on Security and Cryptography | 2020 | 652p | 8 pages
Authors
Hossein Shirazi; Landon Zweigle; Indrakshi Ray
Original format: PDF
Chinese Library Classification: TP309-53
Keywords
Phishing; Social engineering; Fingerprint


Similar literature


1. An improved ELM-based and data preprocessing integrated approach for phishing detection considering comprehensive features [J]. Yang Liqun, Zhang Jiawei, Wang Xiaozhe. Expert Systems with Applications. 2021, Mar.
2. Phishing website detection based on effective machine learning approach [J]. Gururaj Harinahalli Lokesh, Goutham BoreGowda. Journal of Cyber Security Technology. 2021, No. 1
3. A novel approach for phishing URLs detection using lexical based machine learning in a real-time environment [J]. Gupta Brij B., Yadav Krishna, Razzak Imran. Computer Communications. 2021, Jul.
4. A Machine-learning based Unbiased Phishing Detection Approach [C]. Hossein Shirazi, Landon Zweigle, Indrakshi Ray. International Conference on Security and Cryptography. 2020
5. Unbiased Phishing Detection Using Domain Name Based Features [D]. Shirazi, Hossein. 2018
6. Image-based crystal detection: a machine-learning approach [O]. Roy Liu, Yoav Freund, Glen Spraggon
7. Phishing website detection using intelligent data mining techniques. Design and development of an intelligent association classification mining fuzzy based scheme for phishing website detection with an emphasis on E-banking [O]. Abur-rous Maher Ragheb Mohammed. 2010
