Phishing Website Identification Algorithm Based on URL Text Features and Link Relations

赵蹲宇; 张兆心

Abstract

To improve the accuracy of phishing website identification, this paper analyzes the uniform resource locator (URL) text of phishing websites, combines it with the network topology features formed by the link relations within phishing websites, and proposes FAUFL, a phishing website identification algorithm based on URL text features and link relations. The algorithm works in three steps. First, URL text features are used as input, and the random forest algorithm generates a phishing website discriminator based on URL text features. Second, link relations are used as input to construct a related web page group, and a related web page group algorithm based on maximum-flow cut generates a phishing website discriminator based on link relations. Third, the results of these two discriminators are used as input, and the Bagging algorithm performs a further evaluation. Test results show that FAUFL reaches an identification accuracy of 99.2%, which is 3.9% higher than that of the algorithm based on URL text features and 5.0% higher than that of the algorithm based on link relations.
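
The first two steps described in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the abstract does not list the URL features or the graph construction, so the feature names, the example URLs, the link capacities, and the function names `url_text_features` and `source_side_of_min_cut` are all assumptions chosen for illustration. The sketch shows (a) a few lexical features that could feed a random forest, and (b) one common way to extract a group of related pages from a link graph by computing a maximum flow from a seed page and keeping the pages on the seed's side of the resulting minimum cut.

```python
from collections import defaultdict, deque
from urllib.parse import urlsplit


def url_text_features(url):
    """Return a few illustrative lexical URL features (assumed, not from the paper)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_dots": host.count("."),
        "digit_count": sum(ch.isdigit() for ch in url),
        "has_at_sign": "@" in url,
        "host_is_ipv4": host.replace(".", "").isdigit() and host.count(".") == 3,
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
    }


def source_side_of_min_cut(edges, source, sink):
    """Run Edmonds-Karp max flow, then return the nodes on the source side of a min cut.

    edges is a list of (from_page, to_page, capacity) triples (hypothetical link weights).
    """
    if source == sink:
        raise ValueError("source and sink must differ")
    residual = defaultdict(lambda: defaultdict(int))
    for u, v, cap in edges:
        residual[u][v] += cap
        residual[v][u] += 0  # make sure the reverse residual edge exists

    def reachable_with_parents():
        # Breadth-first search over edges that still have residual capacity.
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        return parents

    while True:
        parents = reachable_with_parents()
        if sink not in parents:
            # No augmenting path is left: the reachable set is the source side of a min cut.
            return set(parents)
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parents[v] is not None:
            bottleneck = min(bottleneck, residual[parents[v]][v])
            v = parents[v]
        v = sink
        while parents[v] is not None:
            u = parents[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u


features = url_text_features("http://192.168.0.1/login/verify.php?id=12")
# Hypothetical link graph: "a" is a seed page, "d" stands in for the rest of the web.
links = [("a", "b", 3), ("b", "a", 3), ("b", "c", 1), ("c", "d", 3)]
group = source_side_of_min_cut(links, "a", "d")
print(features)
print(sorted(group))
```

In this toy graph the weak link between `b` and `c` is the cut, so the group around the seed is `a` and `b`. The third step, combining the two discriminators' outputs with Bagging, is left out here because the abstract gives no detail on how that combination is configured.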

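Bibliographic information

Source
《高技术通讯》 | 2017, Issue 8 | pp. 708-717 | 10 pages
Authors
赵蹲宇; 张兆心
Affiliation
School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001 (both authors)
Original format
PDF
Language of text
Chinese (chi)
Keywords
phishing websites; fusion algorithm; uniform resource locator (URL); text features; link relations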
