Published in: International Conference on Informatics and Computing

Comparison of Multinomial Naïve Bayes with K-Nearest Neighbors, Support Vector Machine and Random Forest for Classification of “Network Attacks” Document



Abstract

The objective of this paper is to categorize English documents on the topic “Network Attack” using the Multinomial Naïve Bayes (MNB) method, and then to compare it with K-Nearest Neighbors (KNN), linear Support Vector Machine (SVM Linear), and Random Forest. Classification was carried out with several feature extraction methods: Term Frequency-Inverse Document Frequency (TF-IDF), Count Vector, and Document Vector (Doc2vec). The experimental results showed that MNB with TF-IDF achieved an accuracy of 76.00%. With TF-IDF, the KNN, SVM Linear, and Random Forest methods achieved 72.66%, 78.66%, and 81.66%, respectively. With Count Vector, the results were 60.00%, 77.00%, 70.66%, and 81.00% (MNB, KNN, SVM Linear, and Random Forest, respectively). An experiment was also conducted using Random Forest as the classifier and Document Vector as the feature extraction method, which yielded an accuracy of 63.33%. The MNB method classified the documents better than the KNN method. However, the SVM and Random Forest methods were better than both the MNB and KNN methods. It can be concluded that TF-IDF was generally better than Count Vector and Doc2vec. However, Count Vector gave a better result than TF-IDF under the MNB classifier.
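To make the feature-extraction and classification steps named in the abstract more concrete, the following is a minimal pure-Python sketch of Count Vector and TF-IDF features feeding a Multinomial Naïve Bayes classifier. It is not the authors' implementation: the abstract does not state which toolkit, tokenizer, or hyperparameters were used. The toy corpus `TRAIN`, the whitespace tokenizer, the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1`, the Laplace smoothing value `alpha=1.0`, and the helper name `train_and_predict` are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus (NOT the paper's data): 1 = network attack, 0 = other.
TRAIN = [
    ("ddos attack floods the server with traffic", 1),
    ("phishing attack steals user passwords", 1),
    ("malware attack spreads through the network", 1),
    ("the team won the football match", 0),
    ("new recipe for chocolate cake", 0),
    ("stock market closes higher today", 0),
]


def tokenize(text):
    # Assumption: lowercase + whitespace split stands in for real preprocessing.
    return text.lower().split()


def build_vocab(docs):
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def count_vector(doc, vocab):
    # Raw term counts; words outside the training vocabulary are ignored.
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]


def idf_weights(docs, vocab):
    # Assumed smoothed IDF variant: ln((1 + n) / (1 + df)) + 1.
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(tokenize(doc)))
    return [math.log((1 + n) / (1 + df[term])) + 1.0 for term in vocab]


class MultinomialNB:
    """Multinomial Naive Bayes with additive (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # assumed smoothing value

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.log_prior = {}
        self.log_likelihood = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Per-feature totals for this class, smoothed by alpha.
            totals = [sum(col) + self.alpha for col in zip(*rows)]
            denom = sum(totals)
            self.log_likelihood[c] = [math.log(t / denom) for t in totals]
        return self

    def predict(self, x):
        def score(c):
            weights = self.log_likelihood[c]
            return self.log_prior[c] + sum(v * w for v, w in zip(x, weights))

        return max(self.classes, key=score)


def train_and_predict(train, texts, use_tfidf):
    """Fit MNB on Count Vector or TF-IDF features and label new texts."""
    docs = [doc for doc, _ in train]
    labels = [label for _, label in train]
    vocab = build_vocab(docs)
    # With use_tfidf=False every weight is 1.0, i.e. plain count vectors.
    idf = idf_weights(docs, vocab) if use_tfidf else [1.0] * len(vocab)

    def featurize(doc):
        return [c * w for c, w in zip(count_vector(doc, vocab), idf)]

    model = MultinomialNB().fit([featurize(d) for d in docs], labels)
    return [model.predict(featurize(t)) for t in texts]


if __name__ == "__main__":
    queries = [
        "hackers launch a ddos attack on the network",
        "chocolate cake recipe",
    ]
    print("Count Vector:", train_and_predict(TRAIN, queries, use_tfidf=False))
    print("TF-IDF:      ", train_and_predict(TRAIN, queries, use_tfidf=True))
```

A corpus this small cannot reproduce the accuracy differences reported in the abstract. The sketch only shows that the two feature schemes differ in how term counts are weighted before the same classifier is applied.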
