Venue: International Symposium on Computing and Networking Workshops

On the Effectiveness of Extracting Important Words from Proxy Logs


Abstract

Modern HTTP-based malware imitates benign traffic to evade detection. To detect unseen malicious traffic, many methods based on machine learning have been proposed. These methods exploit characteristics of malicious traffic, but they usually require additional parameters that cannot be obtained from essential security devices such as a proxy server or an IDS (Intrusion Detection System). Thus, most previous methods are not applicable to actual information systems. To tackle this realistic threat, a linguistic-based detection method for proxy logs has been proposed. This method automatically extracts words as feature vectors using natural language techniques, and discriminates between benign and malicious traffic. However, the previous method generates its corpus from all of the extracted words, including trivial ones. To generate a discriminative feature representation, the corpus has to be summarized effectively. This paper extracts important words from proxy logs to summarize the corpus, defining a word importance score from term frequency and document frequency. Our method summarizes the corpus and improves the detection rate. We conducted cross-validation and timeline analysis on pcap files captured from Exploit Kit (EK) traffic between 2014 and 2016. The experimental results show that our method improves accuracy. The best F-measure reaches 1.00 in the cross-validation and timeline analysis.
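The abstract says the word importance score is built from term frequency and document frequency, but it does not give the formula. As a rough illustration only, the sketch below assumes a TF-IDF-style score (term frequency times the log of inverse document frequency), which may differ from the paper's actual definition. The function names, the tokenization rule, and the sample log lines are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(log_line):
    """Split a proxy log line into lowercase tokens on non-alphanumeric characters."""
    return [t for t in re.split(r"[^A-Za-z0-9]+", log_line.lower()) if t]


def word_scores(documents):
    """Score each word with TF * log(N / DF).

    This formula is an assumption for illustration; the abstract only states
    that term frequency and document frequency are used.
    """
    tokenized = [tokenize(d) for d in documents]
    tf = Counter(w for doc in tokenized for w in doc)        # occurrences over the whole corpus
    df = Counter(w for doc in tokenized for w in set(doc))   # number of log lines containing the word
    n = len(tokenized)
    return {w: tf[w] * math.log(n / df[w]) for w in tf}


def important_words(documents, top_k=5):
    """Return the top_k highest-scoring words (ties broken alphabetically)."""
    ranked = sorted(word_scores(documents).items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked[:top_k]]


def summarize(documents, vocabulary):
    """Keep only the words in the chosen vocabulary, dropping trivial ones."""
    vocab = set(vocabulary)
    return [[w for w in tokenize(d) if w in vocab] for d in documents]


# Hypothetical proxy log lines, invented for this example.
logs = [
    "GET http://example.com/index.html 200",
    "GET http://example.com/images/logo.png 200",
    "GET http://malicious.example/gate.php?id=123 200",
]

vocabulary = important_words(logs, top_k=5)
summary = summarize(logs, vocabulary)
```

Under this assumed score, words that occur in every log line (here `get`, `http`, and `200`) score zero and drop out of the summarized corpus, which is the kind of trivial-word filtering the abstract describes.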
