Journal: Research journal of applied science, engineering and technology

Spatial Clustering Algorithm for Time Series Rainfall Data Using X-Means Data Splitting



Abstract

The aim of this study is to present a new spatial clustering process for time series data. Clustering becomes an important and demanding application when the data involve long chronological time series and huge datasets. A major challenge in clustering is to achieve an optimal solution when searching for similarity along the series, a task that also involves very large-scale data analysis. Unfortunately, existing time series clustering algorithms have become impractical because they do not scale properly to longer time series. The performance of a clustering algorithm degrades further if it relies on the actual (untransformed) data, and many clustering algorithms struggle to handle high-dimensional data. In the case of spatial time series, the problem can be solved by unsupervised approaches rather than supervised classification, with appropriate preprocessing techniques to transform the actual data. An unsupervised solution using time series clustering algorithms is capable of extracting valuable information and identifying structure in complex and massive datasets such as spatial time series. Therefore, a clustering algorithm that introduces data transformation using X-means data splitting is proposed to investigate the spatial homogeneity of time series rainfall data. Hierarchical clustering was used to demonstrate the similarity once the data had been divided into training and testing sets. The proposed algorithm is compared with five data transformation techniques: mean and median on monthly data, and binary, cumulative and actual values on daily data. Results indicate that data transformation using X-means data splitting in hierarchical clustering outperformed the other transformation techniques and was more consistent between training and testing datasets based on similarity measures.
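The five baseline transformations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it does not cover the X-means data splitting step, which the abstract does not specify. The function names, the wet-day threshold of 0.0 and the toy rainfall values are all assumptions made for illustration.

```python
from itertools import accumulate
from statistics import mean, median


def binary(daily, wet_threshold=0.0):
    """Map each day to 1 (wet) or 0 (dry). The threshold is an assumption."""
    return [1 if x > wet_threshold else 0 for x in daily]


def cumulative(daily):
    """Running total of daily rainfall."""
    return list(accumulate(daily))


def monthly(daily, month_of_day, agg):
    """Aggregate daily values per month label, in first-seen month order."""
    groups = {}
    for value, month in zip(daily, month_of_day):
        groups.setdefault(month, []).append(value)
    return [agg(values) for values in groups.values()]


# Toy series (made-up values, in mm): three days in each of two months.
daily = [0.0, 5.0, 1.0, 0.0, 0.0, 12.0]
months = [1, 1, 1, 2, 2, 2]

transformed = {
    "actual": list(daily),
    "binary": binary(daily),
    "cumulative": cumulative(daily),
    "monthly_mean": monthly(daily, months, mean),
    "monthly_median": monthly(daily, months, median),
}

for name, series in transformed.items():
    print(name, series)
```

In a setting like the one the abstract describes, each station's transformed series would presumably serve as its feature vector for hierarchical clustering, with the training and testing splits compared afterwards.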
