
Order-preserving clustering and its application to gene expression data


Abstract

Clustering of ordered data sets is a common problem in many pattern recognition tasks. Existing clustering methods either fail to capture the data or rely on restrictive models such as HMMs or AR models. In this paper, we present a general order-preserving clustering algorithm that allows arbitrary patterns of data evolution by representing each ordered set as a curve. Clustering the data then reduces to grouping curves by shape similarity. We develop a novel measure of shape similarity between curves using scale-space distance. Shape similarity or dissimilarity is judged by composing higher-dimensional curves from the constituent curves and noting the additional twists and turns in such curves that can be attributed to shape differences. An algorithm analogous to K-means clustering is then developed that uses prototypical curves to represent clusters. Results are demonstrated on ordered gene expression data sets obtained from gene chips.
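The K-means-style loop mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the scale-space shape distance is not specified here, so the sketch substitutes a plain Euclidean distance between z-normalized series as a stand-in shape measure. The names `normalize_curve` and `cluster_curves`, the farthest-first initialization, and the requirement that all series have equal length are assumptions made for illustration only.

```python
import numpy as np


def normalize_curve(y):
    """Z-normalize one ordered series so that only its shape remains."""
    y = np.asarray(y, dtype=float)
    std = y.std()
    # A flat series carries no shape information; map it to all zeros.
    return (y - y.mean()) / std if std > 0 else np.zeros_like(y)


def cluster_curves(series, k, n_iter=50):
    """Group equal-length ordered series by shape, K-means style.

    Stand-in distance: Euclidean distance between z-normalized curves
    (an assumption, NOT the scale-space distance from the abstract).
    """
    curves = np.array([normalize_curve(s) for s in series])

    # Farthest-first initialization (assumed here for determinism):
    # start from curve 0, then repeatedly add the curve farthest
    # from every prototype chosen so far.
    chosen = [0]
    while len(chosen) < k:
        diffs = curves[:, None, :] - curves[chosen][None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        chosen.append(int(nearest.argmax()))
    prototypes = curves[chosen].copy()

    labels = np.full(len(curves), -1)
    for _ in range(n_iter):
        # Assignment step: each curve joins its nearest prototype.
        diffs = curves[:, None, :] - prototypes[None, :, :]
        new_labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments are stable, so the loop has converged
        labels = new_labels
        # Update step: each prototype becomes the mean curve of its members.
        for j in range(k):
            members = curves[labels == j]
            if len(members) > 0:
                prototypes[j] = members.mean(axis=0)
    return labels, prototypes
```

Calling `cluster_curves([[1, 2, 3, 4], [10, 20, 30, 40], [4, 3, 2, 1], [8, 6, 4, 2]], k=2)` puts the two rising series in one cluster and the two falling series in the other, regardless of their magnitudes, because normalization leaves only the ordering pattern. Swapping in a different shape distance would only change the two lines that compute `diffs` and the norm.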
