Journal: Data Mining and Knowledge Discovery

A framework for dissimilarity-based partitioning clustering of categorical time series

Abstract

A new framework for clustering categorical time series is proposed. In our approach, a dissimilarity-based partitioning method is considered. We suggest measuring the dissimilarity between two categorical time series by assessing both closeness of raw categorical values and proximity between dynamic behaviours. For the latter, a particular index computing the temporal correlation for categorical-valued sequences is introduced. The dissimilarity measure is then used to perform clustering by considering a modified version of the k-modes algorithm specifically designed to provide a better characterization of the clusters. Furthermore, the problem of determining the number of clusters in this framework is analyzed by comparing a range of procedures, including a prediction-based resampling method properly adjusted to deal with our dissimilarity. Several graphical devices to interpret and visualize the temporal pattern of each cluster are also provided. Performance of this clustering methodology is studied on different simulated scenarios and its effectiveness is concluded by comparison with alternative approaches. Real data use is illustrated by analyzing navigation patterns of users visiting a specific news web site.
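The general idea of a two-part dissimilarity can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not give the temporal correlation index or the modified k-modes algorithm, so the sketch swaps in a simple comparison of transition frequencies for the "dynamic behaviour" part and a plain medoid-based loop for the partitioning step. All function names, the `weight` parameter, and the naive initialisation are hypothetical, and the series are assumed to have equal length of at least two.

```python
from collections import Counter


def value_mismatch(x, y):
    """Fraction of time points at which the two series take different categories."""
    return sum(a != b for a, b in zip(x, y)) / len(x)


def transition_profile(x):
    """Relative frequency of each (previous, next) category pair in a series."""
    pairs = Counter(zip(x, x[1:]))
    total = sum(pairs.values())
    return {pair: count / total for pair, count in pairs.items()}


def behaviour_gap(x, y):
    """Total variation distance between the two transition profiles, in [0, 1].

    Assumption: this stands in for the paper's temporal correlation index,
    which the abstract does not specify.
    """
    px, py = transition_profile(x), transition_profile(y)
    return 0.5 * sum(abs(px.get(p, 0.0) - py.get(p, 0.0)) for p in set(px) | set(py))


def dissimilarity(x, y, weight=0.5):
    """Weighted mix of raw-value mismatch and behaviour gap (weight is hypothetical)."""
    return weight * value_mismatch(x, y) + (1 - weight) * behaviour_gap(x, y)


def partition(series, k, n_iter=20):
    """Medoid-based partitioning: assign each series to its nearest medoid,
    then re-pick each cluster's medoid as the member with the smallest total
    dissimilarity to the other members. Returns (labels, medoid indices)."""
    medoids = list(range(k))  # naive initialisation: the first k series
    labels = [0] * len(series)
    for _ in range(n_iter):
        labels = [
            min(range(k), key=lambda c: dissimilarity(s, series[medoids[c]]))
            for s in series
        ]
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the old medoid of an empty cluster
                continue
            new_medoids.append(
                min(members, key=lambda i: sum(dissimilarity(series[i], series[j]) for j in members))
            )
        if new_medoids == medoids:
            break  # assignments are stable
        medoids = new_medoids
    return labels, medoids
```

Calling `partition(series, k)` on a list of equal-length category sequences returns one cluster label per series together with the index of each cluster's medoid. Setting `weight` closer to 1 emphasises agreement of raw values, while values closer to 0 emphasise similarity of transition patterns.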