IT equipment in securities companies, and the applications running on it, are becoming increasingly complex. This creates a need for a monitoring and analysis system that helps IT staff quickly analyze the log files generated during daily production, locate points of failure, and even prevent application crashes. Based on a requirements analysis of a securities company, we build a prototype IT equipment monitoring and analysis system on Hadoop and OpenTSDB, in which Hadoop addresses the company's distributed requirements, while OpenTSDB stores time series data, the most common data type in securities companies. Machine learning (ML) forecasting and anomaly detection algorithms are employed to monitor the usage of system components, to predict and detect performance anomalies and their causes, and to mine associations between applications and the reported anomalies based on application log files. Experimental results show the efficiency of the proposed framework in predicting and detecting performance anomalies. Furthermore, we propose an optimization algorithm that learns the best combination of long-term and short-term influence from the input time series history. A separate application was developed to directly identify the applications or processes that may have caused the anomalies by comparing the timestamps in the application log files with the output of the algorithm. Experiments in our lab environment yielded promising results.
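
The timestamp comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `candidate_sources`, the five-minute window, the `(timestamp, application)` log format, and the sample application names are all assumptions made for illustration. The sketch lists, for each anomaly time reported by a detector, the applications that wrote log entries close to that time.

```python
from datetime import datetime, timedelta

def candidate_sources(anomaly_times, log_entries, window=timedelta(minutes=5)):
    """Map each anomaly timestamp to the applications that logged within `window` of it.

    `log_entries` is an iterable of (timestamp, application_name) pairs;
    this format and the default window are illustrative assumptions.
    """
    result = {}
    for t in anomaly_times:
        # Collect distinct applications whose log timestamps fall near the anomaly.
        apps = sorted({app for ts, app in log_entries if abs(ts - t) <= window})
        result[t] = apps
    return result

# Hypothetical data: one detected anomaly and three log entries.
anomalies = [datetime(2020, 1, 1, 10, 0)]
logs = [
    (datetime(2020, 1, 1, 9, 58), "order-gateway"),
    (datetime(2020, 1, 1, 10, 3), "quote-feed"),
    (datetime(2020, 1, 1, 11, 0), "batch-report"),
]
matches = candidate_sources(anomalies, logs)
```

In this sketch a wider window catches slower-acting causes at the cost of more false candidates, so the window would need tuning against real log data.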