A sequential clustering algorithm with applications to gene expression data.

Abstract

Microarrays are part of a new class of biotechnologies that allow the monitoring of expression levels for thousands of genes simultaneously. Gene profile data come from experiments that investigate the behavior of genes over several time points. Biologists are interested in these gene expression profiles because it is believed that genes in the same functional pathway have similar profiles of gene expression.

In the analysis of data from microarray experiments, most of the unsupervised learning processes involve three steps: standardization, defining a dissimilarity measure, and applying a clustering algorithm. We will discuss the issues involved in these steps, and we will propose new methods. We will discuss the problems of current clustering algorithms and propose a new algorithm, the sequential clustering algorithm. This algorithm finds clusters sequentially based on a Gaussian model. The algorithm does not require the specification of the number of clusters and allows for sporadic objects.

We will discuss a semiparametric mixture model which is motivated by the sequential clustering algorithm. Two estimators for the mixing proportion in the semiparametric mixture model are proposed, and their properties are investigated using simulations.

A new dissimilarity measure that takes into account the time order and the time distance between experiments will be introduced. We will discuss the performance of various distances in clustering using the Asymptotic Discriminating Measure (ADM) and show that the new dissimilarity measure always has a higher ADM than the Euclidean distance. The comparison of distances in small samples will also be discussed. We will introduce a sequential clustering algorithm with the new dissimilarity measure and investigate its performance.
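The three generic steps named in the abstract (standardization, a dissimilarity measure, a clustering algorithm) can be illustrated with a minimal sketch. This is not the thesis's sequential clustering algorithm or its new dissimilarity measure, neither of which is specified in this record. It assumes per-gene z-score standardization, plain Euclidean distance, and a simple single-linkage threshold grouping as stand-ins. The function names, the toy data, and the cutoff value are all hypothetical.

```python
import math
from statistics import mean, pstdev


def standardize(profile):
    """Step 1: center a gene profile to mean 0 and scale it to unit standard deviation."""
    m, s = mean(profile), pstdev(profile)
    if s == 0:
        return [0.0 for _ in profile]  # a flat profile carries no shape information
    return [(x - m) / s for x in profile]


def euclidean(a, b):
    """Step 2: Euclidean distance, used here as a stand-in dissimilarity measure."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def threshold_clusters(profiles, cutoff):
    """Step 3: single-linkage grouping; profiles closer than `cutoff` share a cluster."""
    n = len(profiles)
    parent = list(range(n))  # each profile starts as its own cluster

    def find(i):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if euclidean(profiles[i], profiles[j]) < cutoff:
                parent[find(j)] = find(i)  # merge the two clusters
    return [find(i) for i in range(n)]


# Toy expression profiles (rows = genes, columns = time points); values are made up.
raw = [
    [1.0, 2.0, 3.0, 4.0],      # rising
    [10.0, 20.0, 30.0, 40.0],  # rising, different scale
    [4.0, 3.0, 2.0, 1.0],      # falling
    [8.0, 6.0, 4.0, 2.0],      # falling, different scale
]
standardized = [standardize(p) for p in raw]
labels = threshold_clusters(standardized, cutoff=0.5)
```

In this toy example, standardization removes the scale difference, so the two rising profiles end up with one label and the two falling profiles with another. Note that Euclidean distance ignores the order and spacing of the time points, which is the limitation the abstract's new dissimilarity measure is meant to address.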

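Bibliographic record

Author: Song, Jongwoo
Author affiliation: The University of Chicago
Degree-granting institution: The University of Chicago
Discipline: Statistics
Degree: Ph.D.
Year: 2003
Pages: 100 p.
Original format: PDF
Language: English
Chinese Library Classification: Statistics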

Similar literature

1. KMeans greedy search hybrid algorithm for biclustering gene expression data [J]. Das S, Idicula SM. Advances in Experimental Medicine and Biology, 2010.
2. UniBic: Sequential row-based biclustering algorithm for analysis of gene expression data [J]. Zhenjia Wang, Guojun Li, Robert W. Robinson. Scientific Reports, 2016, no. 1.
3. Cluster inference methods and graphical models evaluated on NCI60 microarray gene expression data [C]. Waddell PJ, Kishino H. Workshop on Genome Informatics, 2000.
4. K-means clustering with automatic determination of K using a Multiobjective Genetic Algorithm with applications to microarray gene expression data [D]. Shaw, Matthew Karl Ellis. 2015.
5. Clustering Algorithms: Their Application to Gene Expression Data [O]. Jelili Oyelade, Itunuoluwa Isewon, Funke Oladipupo. 2016.
6. Clustering Algorithms: Their Application to Gene Expression Data [O]. Oyelade O. J., Isewon Itunuoluwa, Oladipupo O. O. 2016.
7. Scalable Algorithm for Clustering Sequential Data [R]. Guralnik, V., Karypis, G. 2001.
