Conference: International Conference for Emerging Technology

Detecting Malware, Malicious URLs and Virus Using Machine Learning and Signature Matching


Abstract

Most of our data is now stored on electronic devices, and easy access to the internet has sharply increased the risk of those devices being infected by viruses, malware, worms, Trojans, ransomware, or other unwanted intruders. Viruses and malware have evolved over time, so identifying malicious files has become difficult. Devices can also be attacked through a single click on a forged URL. Our proposed solution combines machine learning with signature matching; its main aim is to identify malicious programs and URLs and act upon them. To identify malware, we select key features from Portable Executable (PE) file headers and use them to train a random forest (RF) model, which is then used to scan a file and determine whether it is malicious. To identify viruses, we use signature matching: the MD5 hash of the file is compared against a virus signature database containing the MD5 hashes of identified viruses and their families. To distinguish benign from malicious URLs, we use a logistic regression model. A tokenizer extracts features from the URL to be classified by separating domains and sub-domains and splitting the URL on every '/'. A TfidfVectorizer (Term Frequency-Inverse Document Frequency) then converts the tokens into weighted values, which are used to predict whether the URL is safe to visit. Once all three modules are integrated, the final application is intended to provide full system protection against malicious software.
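The signature-matching and URL-tokenizing steps described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`md5_of_file`, `match_signature`, `tokenize_url`), the in-memory `SIGNATURES` dictionary, and the family name `Example.Family` are all hypothetical, and the abstract does not say how the scheme prefix of a URL is handled, so stripping it here is a guess.

```python
import hashlib
import re

# Hypothetical signature database: MD5 hex digest -> virus family.
# The single entry is computed from made-up bytes purely for illustration.
SIGNATURES = {
    hashlib.md5(b"sample-virus-bytes").hexdigest(): "Example.Family",
}


def md5_of_file(path, chunk_size=65536):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_signature(path, signatures):
    """Return the matching family name, or None if the hash is unknown."""
    return signatures.get(md5_of_file(path))


def tokenize_url(url):
    """Split a URL on every '/' and then on '.' to separate
    domains and sub-domains (assumed reading of the abstract)."""
    # Assumption: drop a leading scheme such as "http://".
    url = re.sub(r"^[a-zA-Z][a-zA-Z0-9+.-]*://", "", url)
    tokens = []
    for part in url.split("/"):
        tokens.extend(t for t in part.split(".") if t)
    return tokens
```

In a pipeline like the one the abstract describes, a tokenizer of this kind could be passed to scikit-learn's `TfidfVectorizer` through its `tokenizer` parameter, with the resulting weights fed to a logistic regression classifier. Exact-hash matching only catches files whose bytes are identical to a known sample, which is presumably why it is paired with the learned PE-header model.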