Conference: International Conference on Computer Science and Information Technologies

Model of the Objective Clustering Inductive Technology of Gene Expression Profiles Based on SOTA and DBSCAN Clustering Algorithms


Abstract

The paper presents a hybrid model of the objective clustering inductive technology based on the combined use of the self-organizing SOTA algorithm and the density-based DBSCAN clustering algorithm. Inductive methods of complex systems analysis serve as the basis for implementing the objective clustering inductive technology for gene expression profiles. To estimate clustering quality on equal power subsets (subsets that contain the same number of pairwise similar objects), a complex multiplicative criterion is calculated as a combination of the Calinski-Harabasz criterion and the WB-index. The external clustering quality criterion is calculated as the normalized difference of the internal clustering quality criteria for the equal power subsets. The optimal parameters of the clustering algorithm are determined from the maximum value of the Harrington desirability function, which takes into account both the character of the distribution of objects and clusters in the different clusterings and the difference between the clusterings obtained on the equal power subsets. Data grouping within the framework of the objective clustering inductive technology was performed in two stages. First, the gene expression profiles were grouped using the DBSCAN clustering algorithm. Then, the resulting set of gene expression profiles was divided into two clusters using the SOTA clustering algorithm. This step-by-step clustering procedure creates the conditions for retaining more useful information for subsequent data processing.
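The quantities named in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulas, so the following are assumptions rather than the authors' definitions: the Calinski-Harabasz criterion and the WB-index are computed in their textbook forms, the external criterion is taken as |a - b| / (a + b), and the Harrington desirability is the standard one-sided form d = exp(-exp(-Y)) with partial values combined by a geometric mean. All function names are hypothetical, and the multiplicative combination of the two internal indices is left out because its form is not stated.

```python
import math


def _centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def _sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def scatter(points, labels):
    """Return (SSW, SSB, K, N): within/between sums of squares and sizes."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    overall = _centroid(points)
    ssw = ssb = 0.0
    for members in clusters.values():
        c = _centroid(members)
        ssw += sum(_sq_dist(p, c) for p in members)
        ssb += len(members) * _sq_dist(c, overall)
    return ssw, ssb, len(clusters), len(points)


def calinski_harabasz(points, labels):
    """Textbook CH index (higher is better); needs 1 < K < N and SSW > 0."""
    ssw, ssb, k, n = scatter(points, labels)
    return (ssb / (k - 1)) / (ssw / (n - k))


def wb_index(points, labels):
    """Textbook WB-index K * SSW / SSB (lower is better); needs SSB > 0."""
    ssw, ssb, k, _ = scatter(points, labels)
    return k * ssw / ssb


def normalized_difference(qc_a, qc_b):
    """ASSUMED external criterion: |a - b| / (a + b) for positive a, b."""
    return abs(qc_a - qc_b) / (qc_a + qc_b)


def harrington(y):
    """One-sided Harrington desirability for a dimensionless score y."""
    return math.exp(-math.exp(-y))


def overall_desirability(partials):
    """Geometric mean of partial desirabilities in (0, 1]."""
    return math.prod(partials) ** (1.0 / len(partials))


# Two toy "equal power" subsets (made-up data) with the same cluster labels.
subset_a = [(0, 0), (0, 2), (10, 0), (10, 2)]
subset_b = [(0, 0), (0, 4), (10, 0), (10, 4)]
labels = [0, 0, 1, 1]

ch_a = calinski_harabasz(subset_a, labels)
ch_b = calinski_harabasz(subset_b, labels)
external = normalized_difference(ch_a, ch_b)
print("CH(A), CH(B):", ch_a, ch_b)
print("WB(A), WB(B):", wb_index(subset_a, labels), wb_index(subset_b, labels))
print("external criterion:", external)
```

How each criterion is mapped to the dimensionless score Y before `harrington` is applied is a separate modelling choice that the abstract does not specify. A smaller external criterion means the two subsets were clustered more consistently, so any such mapping would need to assign it a larger Y.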
