
Study and Implementation of Clustering Algorithms in R


Abstract

Clustering is the process of grouping data by finding similarities among data objects based on their characteristics. The resulting groups are called clusters: each cluster consists of objects that are similar to one another and dissimilar to objects in other clusters. Clustering is an unsupervised learning technique based on maximizing intra-cluster similarity and minimizing inter-cluster similarity. Clustering of biological datasets is now a widely researched topic in computer science, and bioinformatics has become an area that receives much of the attention in data mining. Generally, bioinformatics aims to solve complicated problems such as gene categorization and function, and the analysis of gene expression data obtained from microarray experiments. Clustering techniques are used to analyze the structure of biological data, and here they are addressed with R. Among the many available methods, we study the k-means, hierarchical, and density-based clustering algorithms for biological data using the R programming tool.
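As a rough illustration of the three algorithm families named in the abstract, the following R sketch runs each of them on a small dataset. It is not the authors' implementation. The built-in `iris` data stands in for a biological dataset, and the settings `k = 3`, `eps = 0.6`, and `minPts = 5` are arbitrary assumptions chosen only for the example. The density-based step assumes the optional `dbscan` package and is skipped if that package is not installed.

```r
# Illustrative sketch only: iris stands in for a biological dataset,
# and k = 3, eps = 0.6, minPts = 5 are arbitrary assumed settings.
set.seed(42)                          # k-means starts from random centers
x <- scale(iris[, 1:4])               # standardize the four numeric features

# k-means: partition the rows into k = 3 clusters
km <- kmeans(x, centers = 3, nstart = 25)

# Hierarchical: agglomerative clustering on Euclidean distances,
# then cut the dendrogram into 3 groups
hc <- hclust(dist(x), method = "average")
hc_labels <- cutree(hc, k = 3)

# Density-based: DBSCAN from the optional 'dbscan' package (label 0 = noise)
if (requireNamespace("dbscan", quietly = TRUE)) {
  db <- dbscan::dbscan(x, eps = 0.6, minPts = 5)
  print(table(db$cluster))
}

# Compare each partition with the known species labels
print(table(km$cluster, iris$Species))
print(table(hc_labels, iris$Species))
```

The cross-tabulations give a quick view of how each partition lines up with the known species. With real expression data, the number of clusters and the density parameters would need to be chosen for that dataset.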
