The ability to process, analyze, and evaluate realtime data and to identify their anomaly patterns is in response to realized increasing demands in various networking domains, such as corporations or academic networks. The challenge of developing a scalable, fault-tolerant and resilient monitoring system that can handle data in real-time and at a massive scale is nontrivial. We present a novel framework for real time network traffic anomaly detection using machine learning algorithms. The proposed prototype system uses existing big data processing frameworks such as Apache Hadoop, Apache Kafka, and Apache Storm in conjunction with machine learning techniques and tools. Our approach consists of a system for real-time processing and analysis of the real-time network-flow data collected from the campus-wide network at the University of Missouri-Kansas City. Furthermore, the network anomaly patterns were identified and evaluated using machine learning techniques. We present preliminary results on anomaly detection with the campus network data.
展开▼