Real-life transactional data often poses challenges such as very large size, high dimensionality, skewed distributions, sparsity, seasonal variations, and market drift or migration [1, 2]. Most studies have taken a static view of the data when making predictions about a customer's buying behavior, market segmentation, etc. [3, 4]. A notable exception is recent work on temporal association rule mining, which deals with incremental characteristics and change; see, for example, [5, 6]. This paper focuses on the problem of segmenting customers visiting a rapidly growing e-tailer. The segments are dynamic and seasonal, so preprocessing and trend characterization are key. To illustrate the issues, we use a real-life data set from an e-commerce business, referred to as the Horizon data in this paper and provided by KD1 (since acquired by Net Perceptions).

Section 2 summarizes the Horizon data. Section 3 quantifies market migration in order to choose the appropriate period of data. Based on seasonal variations in purchasing behavior, a novel seasonality detection and partitioning scheme is described. Some market migration and oscillation results on the Horizon data are also presented. Section 4 describes a new concept called Cluster Space, which converts this high-dimensional (> 10,000) data into a continuous low-dimensional space by applying a graph-based clustering method called VBACC [7] to the seasonally partitioned data. Motion detection and visualization schemes are introduced, and some interesting trends found in the Horizon data are described.

A note on Market vs. Customer Migration: For our discussion, we define market migration as a non-periodic change in the product purchase distribution over all customers. Customer migration is a separate trend in which the purchase profile of an individual customer changes with time; it may or may not be periodic over long periods.
It is important to note that although a customer might migrate to a new set of products over time, new customers might take that customer's place. Thus, it is possible to have substantial customer migration without corresponding market migration. A model is meaningful only for the period over which the market profile is reasonably stable, i.e., the market migration is not substantial. Within such a period it is useful to look at customer migration, since customer migration often happens faster than market migration.
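The distinction between the two kinds of migration can be made concrete with a small sketch. The Python code below is illustrative only and is not the measure used in this paper: it assumes total variation distance as the measure of change between purchase distributions, and the customer and product names, as well as the helper functions `market_migration` and `customer_migration`, are hypothetical. In the toy data, two customers exchange purchase profiles between two periods, so every individual profile changes while the pooled product distribution stays the same.

```python
from collections import Counter


def purchase_distribution(purchases):
    """Turn a list of purchased product ids into a probability distribution."""
    counts = Counter(purchases)
    total = sum(counts.values())
    return {product: n / total for product, n in counts.items()}


def total_variation(p, q):
    """Total variation distance between two distributions stored as dicts.

    Assumption: this distance is chosen only for illustration.
    """
    products = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in products)


def market_migration(period_a, period_b):
    """Change in the product distribution pooled over all customers.

    Each period maps a customer id to the list of products bought.
    """
    pooled_a = [item for items in period_a.values() for item in items]
    pooled_b = [item for items in period_b.values() for item in items]
    return total_variation(
        purchase_distribution(pooled_a), purchase_distribution(pooled_b)
    )


def customer_migration(period_a, period_b):
    """Mean change in individual profiles for customers seen in both periods."""
    shared = sorted(period_a.keys() & period_b.keys())
    if not shared:
        return 0.0
    distances = [
        total_variation(
            purchase_distribution(period_a[c]), purchase_distribution(period_b[c])
        )
        for c in shared
    ]
    return sum(distances) / len(distances)


# Hypothetical toy data: the two customers swap purchase profiles.
period_1 = {"alice": ["tent", "tent"], "bob": ["novel", "novel"]}
period_2 = {"alice": ["novel", "novel"], "bob": ["tent", "tent"]}

market = market_migration(period_1, period_2)
customer = customer_migration(period_1, period_2)
print(f"market migration:   {market:.2f}")
print(f"customer migration: {customer:.2f}")
```

Under these assumptions the market migration is zero while the customer migration is at its maximum, which is the situation described above: individual customers move, but the market profile a model depends on remains stable.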