The frequent items problem is to process a stream of items and find all items that occur more than a given fraction of the time. It is one of the most heavily studied problems in data stream mining, dating back to the 1980s. To address the relatively high false positive rate of the Space-Saving algorithm, an improved algorithm based on LRU (Least Recently Used) that pre-eliminates low-frequency items is proposed. The accuracy, stability, and adaptability of the improved algorithm are noticeably enhanced. Experimental results indicate that the algorithm can be used not only to find the frequent items but also to estimate their frequencies precisely. The improved algorithm can be used for online processing of both high-speed network packet streams and backbone NetFlow streams.
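For orientation, the baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the standard Space-Saving algorithm only, not the LRU-based improvement proposed here, whose details the abstract does not give. The function names `space_saving` and `frequent_items`, the parameters `k` (number of counters) and `phi` (frequency threshold), and the sample stream are all assumptions made for the example.

```python
from typing import Dict, Hashable, Iterable, List, Tuple


def space_saving(
    stream: Iterable[Hashable], k: int
) -> Tuple[Dict[Hashable, int], Dict[Hashable, int]]:
    """Baseline Space-Saving with at most k counters (illustrative sketch).

    Returns (counts, errors): counts[x] is an upper bound on the true
    frequency of x, and errors[x] is the largest possible overestimate.
    """
    counts: Dict[Hashable, int] = {}
    errors: Dict[Hashable, int] = {}
    for item in stream:
        if item in counts:
            # Item is already monitored: increment its counter.
            counts[item] += 1
        elif len(counts) < k:
            # A free counter is available: start monitoring the item.
            counts[item] = 1
            errors[item] = 0
        else:
            # No free counter: evict the item with the smallest count.
            # The newcomer inherits that count, which is the source of
            # overestimation and hence of possible false positives.
            victim = min(counts, key=counts.get)
            min_count = counts.pop(victim)
            errors.pop(victim)
            counts[item] = min_count + 1
            errors[item] = min_count
    return counts, errors


def frequent_items(
    stream: Iterable[Hashable], k: int, phi: float
) -> List[Tuple[Hashable, int, int]]:
    """Report (item, estimated count, max error) for estimates above phi * n."""
    items = list(stream)  # materialized only to know the stream length n
    counts, errors = space_saving(items, k)
    threshold = phi * len(items)
    return [(x, c, errors[x]) for x, c in counts.items() if c > threshold]


# Hypothetical stream: "a" and "b" are common, "c" and "d" each occur once.
sample = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
counts, errors = space_saving(sample, k=2)
reported = frequent_items(sample, k=2, phi=0.3)
```

With only two counters, the late-arriving item `"d"` inherits the count of the item it evicts and is reported as frequent even though it occurs once. This kind of overestimate is the false positive behavior that motivates pre-eliminating low-frequency items.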