Existing studies on multivariate time series clustering consider only a small number of variables, whereas large-scale multivariate time series are common in practice. This paper therefore proposes the LS-Cluster algorithm, which aims to cluster large-scale multivariate time series with tens of thousands of variables. First, the multivariate observations at each time step are arranged into a rectangular grid, and the two-dimensional discrete cosine transform is applied to the grid to extract features. Next, the LS similarity measure is proposed to compute the degree of similarity between feature series. Finally, hierarchical clustering is used to discover the patterns contained in the data. Experimental results show that the method performs well and scales well on both synthetic and real data.
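The pipeline described above (grid, 2D discrete cosine transform, similarity, hierarchical clustering) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the paper's similarity measure, so plain Euclidean distance stands in for it here, and the grid size, the number of retained coefficients (`keep`), the average-linkage rule, and all function names are assumptions chosen for illustration.

```python
import math
import random


def dct2(grid):
    """Unnormalized 2D DCT-II of a rectangular grid given as a list of rows."""
    def dct1(vec):
        n = len(vec)
        return [sum(v * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, v in enumerate(vec)) for k in range(n)]

    rows, cols = len(grid), len(grid[0])
    by_row = [dct1(row) for row in grid]                    # transform each row
    by_col = [dct1([by_row[i][j] for i in range(rows)])     # then each column
              for j in range(cols)]
    return [[by_col[j][u] for j in range(cols)] for u in range(rows)]


def extract_features(series, rows, cols, keep=2):
    """Reshape every time step into a rows x cols grid and keep the
    keep x keep lowest-frequency DCT coefficients (an assumed choice)."""
    feats = []
    for snapshot in series:
        grid = [snapshot[r * cols:(r + 1) * cols] for r in range(rows)]
        coeffs = dct2(grid)
        feats.extend(coeffs[u][v] for u in range(keep) for v in range(keep))
    return feats


def distance(a, b):
    """Euclidean distance: a stand-in, NOT the paper's similarity measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hierarchical_clusters(feats, n_clusters):
    """Naive agglomerative clustering with average linkage."""
    clusters = [[i] for i in range(len(feats))]

    def linkage(c1, c2):
        total = sum(distance(feats[i], feats[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        a, b = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]

    labels = [0] * len(feats)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


# Toy data (hypothetical): 6 series, 5 time steps each, 16 variables per step.
rng = random.Random(0)
ROWS, COLS, STEPS = 4, 4, 5


def make_series(gradient):
    # gradient=True adds a left-to-right spatial trend across the grid
    return [[(c if gradient else 0.0) + rng.gauss(0, 0.1)
             for r in range(ROWS) for c in range(COLS)]
            for _ in range(STEPS)]


dataset = [make_series(False) for _ in range(3)] + [make_series(True) for _ in range(3)]
features = [extract_features(s, ROWS, COLS) for s in dataset]
labels = hierarchical_clusters(features, n_clusters=2)
print(labels)
```

The sketch uses only the standard library so that it runs anywhere; with tens of thousands of variables, a vectorized DCT and an optimized linkage routine from a numerical library would be the practical choice.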