Detecting phishing e-mails using text and data mining


Abstract

This paper applies text mining and data mining in tandem to detect phishing e-mails. The study employs Multilayer Perceptron (MLP), Decision Trees (DT), Support Vector Machine (SVM), Group Method of Data Handling (GMDH), Probabilistic Neural Network (PNN), Genetic Programming (GP) and Logistic Regression (LR) for classification. A dataset of 2500 phishing and non-phishing e-mails is analyzed after text mining is used to extract 23 keywords from the e-mail bodies of the original dataset. Further, we selected the 12 most important features using t-statistic-based feature selection. A t-test at the 1% level of significance shows no statistically significant difference in sensitivity between the models built with and without feature selection, for all techniques except PNN. Since GP and DT are not statistically significantly different, either with or without feature selection, at the 1% level of significance, DT should be preferred because it yields 'if-then' rules, thereby increasing the comprehensibility of the system.
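The t-statistic-based feature selection and the sensitivity measure mentioned in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the exact t-statistic formula, so the Welch two-sample form is used here; the data are synthetic keyword counts, not the paper's dataset; and the names `make_synthetic_emails`, `t_statistic`, `select_top_k` and `sensitivity` are hypothetical helpers.

```python
import math
import random
from statistics import mean, variance

N_FEATURES = 23      # number of keyword features, as stated in the abstract
N_INFORMATIVE = 12   # assumption: features 0..11 carry signal in the toy data


def make_synthetic_emails(n_per_class=100, seed=0):
    """Build toy keyword-count rows; label 1 = phishing, 0 = legitimate."""
    rng = random.Random(seed)
    rows, labels = [], []
    for label in (1, 0):
        for _ in range(n_per_class):
            row = []
            for j in range(N_FEATURES):
                if j < N_INFORMATIVE and label == 1:
                    row.append(rng.randint(3, 6))  # keyword frequent in phishing
                elif j < N_INFORMATIVE:
                    row.append(rng.randint(0, 2))  # keyword rare in legitimate mail
                else:
                    row.append(rng.randint(0, 3))  # noise: same range in both classes
            rows.append(row)
            labels.append(label)
    return rows, labels


def t_statistic(pos, neg):
    """Absolute Welch two-sample t-statistic for one feature (assumed form)."""
    se = math.sqrt(variance(pos) / len(pos) + variance(neg) / len(neg))
    if se == 0.0:
        return 0.0
    return abs(mean(pos) - mean(neg)) / se


def select_top_k(rows, labels, k):
    """Return the sorted indices of the k features with the largest |t|."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append((t_statistic(pos, neg), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])


def sensitivity(y_true, y_pred):
    """True-positive rate: share of phishing e-mails that were flagged."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if tp + fn else 0.0


rows, labels = make_synthetic_emails()
selected = select_top_k(rows, labels, k=12)
print("selected feature indices:", selected)
```

On this toy data the ranking should recover the informative features, because their class means differ by several standard errors while the noise features share one distribution. The reduced feature set would then be passed to any of the classifiers listed in the abstract, and `sensitivity` computed on held-out predictions with and without selection.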


Bibliographic details

Source: 2012 IEEE International Conference on Computational Intelligence & Computing Research, 2012, pp. 1-6 (6 pages)
Conference location: Coimbatore (IN)
Authors: Mayank Pandey, Vadlamani Ravi
Original format: PDF
Language: English
Chinese Library Classification: Computer applications
Keywords: Classification; Decision Tree; Genetic Programming; Group Method of Data Handling; Logistic Regression; Multilayer Perceptron; Phishing webpage; Probabilistic Neural Network; Support Vector Machine; Text mining


