Finding global frequent itemsets in multiple continuous, rapid, and time-varying data streams requires an algorithm that is fast, incremental, real-time, and memory-efficient. Based on the max-frequency window model, a BHS summary structure and a novel algorithm called GGFI-MFW are proposed. The algorithm updates only the summaries for subsets of newly arrived data and can directly generate the max-frequency of a given itemset without scanning the whole summary. Experimental results indicate that the proposed algorithm efficiently finds global frequent itemsets over data streams with a small memory footprint and shows a clear advantage when the number of distinct items is large.
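As a point of reference for the max-frequency window model, the following is a minimal brute-force sketch. It assumes the common definition in which the max-frequency of an itemset is its highest relative frequency over all windows that end at the newest transaction. It is not the GGFI-MFW algorithm and does not use the BHS summary structure, neither of which is specified here; the function name `max_frequency` and the `min_window` parameter are hypothetical and introduced only for illustration.

```python
from typing import Hashable, Iterable, Sequence


def max_frequency(
    stream: Sequence[Iterable[Hashable]],
    itemset: Iterable[Hashable],
    min_window: int = 1,
) -> float:
    """Naive max-frequency of `itemset` over all windows ending at the
    newest transaction (assumed definition, brute force, O(n) per query)."""
    target = frozenset(itemset)
    hits = 0      # transactions in the current window that contain the itemset
    best = 0.0    # highest relative frequency seen so far
    # Walk backwards from the newest transaction, growing the window by one each step.
    for length, transaction in enumerate(reversed(stream), start=1):
        if target <= set(transaction):
            hits += 1
        # Hypothetical guard: ignore windows shorter than min_window.
        if length >= min_window:
            best = max(best, hits / length)
    return best


if __name__ == "__main__":
    demo_stream = [{"a", "b"}, {"b"}, {"a", "b", "c"}, {"a", "c"}]
    print(max_frequency(demo_stream, {"a", "b"}, min_window=2))
```

This sketch rescans the whole stream for every query, which is exactly the cost that a summary structure is meant to avoid.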