As the internet has grown more popular, enterprises have paid increasing attention to their informatization, and more and more of them are now connected to the internet. Protecting information security has therefore become one of the critical problems an enterprise must address. This paper applies mathematical statistics and text classification to more than 3 million records from a well-known hacker forum. Using the TF-IDF model and the KNN classification algorithm, it derives the level of network information security threat faced by different industries and groups these levels into three classes: low, moderate, and high. On this basis, it analyzes in depth, according to the characteristics of each industry, why information security problems arise there, and proposes targeted improvements and suggestions.
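The abstract names TF-IDF and KNN but gives no implementation details, so the following is only a minimal sketch of how the two techniques are commonly combined for text classification, not the authors' method. It assumes, as one plausible reading, that forum posts are assigned to industries; the toy posts, the labels, the choice of cosine similarity, `k=3`, and all function names are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def weigh(tokens, idf):
    """Turn a token list into a sparse TF-IDF vector (term -> weight)."""
    counts = Counter(tokens)
    total = len(tokens)
    # Terms never seen in training have no IDF and are dropped.
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def fit_tfidf(docs):
    """Compute IDF over tokenized training docs and vectorize them."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # The +1 keeps terms that occur in every document from vanishing.
    idf = {t: math.log(n / df) + 1.0 for t, df in doc_freq.items()}
    return [weigh(doc, idf) for doc in docs], idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, vectors, labels, k=3):
    """Majority vote among the k training vectors most similar to query."""
    ranked = sorted(
        zip(vectors, labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy posts (NOT data from the paper), labeled by industry.
train = [
    ("bank card account transfer password".split(), "finance"),
    ("bank loan account payment leak".split(), "finance"),
    ("hospital patient record leak".split(), "healthcare"),
    ("hospital clinic patient database".split(), "healthcare"),
]
docs = [tokens for tokens, _ in train]
labels = [label for _, label in train]

vectors, idf = fit_tfidf(docs)
query = weigh("stolen bank account password".split(), idf)
predicted = knn_predict(query, vectors, labels, k=3)
print(predicted)
```

Under these assumptions, counting how many posts land in each industry would give a simple per-industry figure that could then be bucketed into low, moderate, and high; how the paper actually derives its threat levels is not stated in the abstract.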