Clustering algorithms, classification algorithms and their applications in medical databases.

Abstract

Data mining is the process of discovering hidden patterns and relationships in large databases using techniques such as clustering and classification. Clustering is the process of discovering groups of data such that intra-cluster similarity is maximized and inter-cluster similarity is minimized. Many partitional clustering algorithms, such as PAM, CLARA and CLARANS, fail to identify natural clusters of arbitrary shapes and sizes. These algorithms also require the number of clusters to be specified in advance, which is very difficult to determine for large, high-dimensional data sets. Hierarchical clustering algorithms such as CURE and ROCK were developed to overcome the limitations of partitional algorithms. However, these algorithms are based on a static model and hence fail to discover natural clusters. CHAMELEON is a hierarchical clustering algorithm that measures the similarity of two clusters based on a dynamic model. In the clustering process, two clusters are merged only if the relative closeness and the relative inter-connectivity between them are both greater than the threshold values. This property ensures that natural clusters with arbitrary shapes, sizes and densities are identified.

I have implemented CHAMELEON in Maple 8.0 using the C library available in the METIS and hMETIS graph partitioning packages and the mining software developed by Dr. Quoc-Nam Tran, Associate Professor, Lamar University. I achieved a 35% improvement in the program's execution time on the benchmark data sets.

Classification, also called supervised clustering, is another data mining technique. In this technique, a model is constructed using a training data set; the model is then tested and used to classify records whose class labels are unknown. I have implemented classification using Gini-based and entropy-based approaches and applied the program to thrombosis data sets. I have also compared the results of the two approaches and identified interesting rules for classifying thrombosis into one of four possible categories, namely type 0, type 1, type 2 and type 3.
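
The merge rule described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's Maple implementation. It assumes the usual definitions from the published CHAMELEON algorithm: relative inter-connectivity is the weight of the edges joining two clusters divided by the average weight of their min-cut bisectors, and relative closeness is the mean weight of the joining edges divided by the size-weighted mean edge weight of the two bisectors. The names `ClusterStats`, `should_merge`, `t_ri` and `t_rc` are hypothetical, and the bisector values are taken as inputs (in the thesis they would come from a graph partitioner such as hMETIS).

```python
from dataclasses import dataclass


@dataclass
class ClusterStats:
    """Summary of one cluster (hypothetical container for this sketch)."""
    size: int               # number of points in the cluster
    bisector_weight: float  # total edge weight of the cluster's min-cut bisector
    bisector_edges: int     # number of edges in that bisector


def relative_interconnectivity(cut_weight, a, b):
    """Weight of edges joining a and b, normalized by the average
    internal inter-connectivity (bisector weight) of the two clusters."""
    internal = (a.bisector_weight + b.bisector_weight) / 2.0
    return cut_weight / internal


def relative_closeness(cut_weight, cut_edges, a, b):
    """Mean weight of edges joining a and b, normalized by the
    size-weighted mean edge weight of the two bisectors."""
    if cut_edges == 0:
        return 0.0  # clusters share no edges, so they are not close
    mean_cut = cut_weight / cut_edges
    mean_a = a.bisector_weight / a.bisector_edges
    mean_b = b.bisector_weight / b.bisector_edges
    total = a.size + b.size
    internal = (a.size / total) * mean_a + (b.size / total) * mean_b
    return mean_cut / internal


def should_merge(cut_weight, cut_edges, a, b, t_ri, t_rc):
    """Merge only if both measures are greater than their thresholds."""
    ri = relative_interconnectivity(cut_weight, a, b)
    rc = relative_closeness(cut_weight, cut_edges, a, b)
    return ri > t_ri and rc > t_rc


if __name__ == "__main__":
    a = ClusterStats(size=10, bisector_weight=4.0, bisector_edges=4)
    b = ClusterStats(size=10, bisector_weight=6.0, bisector_edges=6)
    # Illustrative numbers only, not results from the thesis.
    print(should_merge(5.0, 5, a, b, t_ri=0.5, t_rc=0.5))
```

Requiring both measures to pass is what makes the model dynamic: each candidate pair is judged against the internal structure of the two clusters themselves rather than against a single global distance cutoff.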

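
The Gini-based and entropy-based approaches mentioned in the abstract both score a candidate split by how much it reduces class impurity. The following Python sketch shows the two standard impurity measures and the resulting gain. It is an illustration under stated assumptions, not the thesis's code: the function names are hypothetical, and the labels in the example are made-up values standing in for the four thrombosis types, not data from the thesis.

```python
import math
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_gain(parent, children, impurity):
    """Impurity of the parent minus the size-weighted impurity of the children.
    With `entropy` this is information gain; with `gini` it is Gini gain."""
    n = len(parent)
    weighted = sum(len(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


if __name__ == "__main__":
    # Made-up class labels (types 0 to 3), for illustration only.
    parent = [0, 0, 1, 1, 2, 3]
    left, right = [0, 0, 1, 1], [2, 3]
    print("Gini gain:", split_gain(parent, [left, right], gini))
    print("Information gain:", split_gain(parent, [left, right], entropy))
```

A decision-tree learner would evaluate every candidate split with one of these measures and keep the split with the largest gain. The two measures often rank splits similarly, which is why comparing their results on the same data set is informative.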

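Bibliographic record

Author: Baddam, Sudheer R.
Author affiliation: Lamar University - Beaumont
Degree-granting institution: Lamar University - Beaumont
Discipline: Computer Science
Degree: M.S.
Year: 2005
Pages: 62 p.
Total pages: 62
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology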

