Journal: Knowledge-Based Systems
ChronoClust: Density-based clustering and cluster tracking in high-dimensional time-series data

Abstract

In many scientific disciplines, the advent of new high-throughput technologies is giving rise to vast quantities of high-dimensional time-series data. A common requirement is to identify clusters of data points with similar characteristics in this experimental data, and track their development over time. In this article we present ChronoClust, a novel density-based clustering algorithm for processing a time series of discrete datasets, generating arbitrarily shaped clusters, and explicitly tracking their temporal evolution. We provide a conceptualisation of ChronoClust's parameters, and guidelines for selecting their values. The development of ChronoClust was motivated by the need to characterise the immune response to disease. As such, we demonstrate and evaluate ChronoClust's operation on two immune-related datasets: (1) a synthetic dataset exhibiting the temporal evolution qualities of the immune response as they would be observed through mass cytometry, a cutting-edge high-throughput technology, and (2) a flow cytometry dataset capturing the immune response in West Nile Virus (WNV)-infected mice. Our comprehensive qualitative and quantitative analyses confirm ChronoClust's suitability for this type of problem: the temporal relationships engineered into the synthetic dataset are successfully recovered, and the cell populations and dynamics unveiled in the WNV dataset match those identified by a domain expert. ChronoClust is applicable beyond immunology, and we provide an open-source Python implementation to support its wider adoption. We additionally make our two datasets publicly available to promote reproducible research and third-party work on temporal clustering and cluster tracking. © 2019 Elsevier B.V. All rights reserved.