The Role of Biomedical Dataset in Classification

Abstract

In this paper, we investigate the role of a biomedical dataset in the classification accuracy of an algorithm. We quantify the complexity of a biomedical dataset using five complexity measures: correlation-based feature selection subset merit, noise, imbalance ratio, missing values and information gain. The effect of these complexity measures on classification accuracy is evaluated using five diverse machine learning algorithms: J48 (decision tree), SMO (support vector machines), Naive Bayes (probabilistic), IBk (instance-based learner) and JRIP (rule-based induction). The results of our experiments show that noise and correlation-based feature selection subset merit, rather than the particular choice of algorithm, play a major role in determining the classification accuracy. Finally, we provide researchers with a meta-model and an empirical equation to estimate the classification potential of a dataset on the basis of its complexity. This will help researchers pre-process a dataset efficiently for automatic knowledge extraction.
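As a rough illustration of three of the complexity measures named above (imbalance ratio, missing values and information gain), the following sketch uses common textbook definitions. The abstract does not give the paper's exact formulas, so the definitions, the function names and the use of `None` as the missing-value marker are all assumptions made for illustration; noise and correlation-based feature selection subset merit are left out because they need more machinery than fits in a short example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def imbalance_ratio(labels):
    """Assumed definition: size of the largest class over the smallest."""
    counts = Counter(labels).values()
    return max(counts) / min(counts)


def missing_fraction(rows, missing=None):
    """Assumed definition: share of cells equal to the missing marker."""
    cells = [value for row in rows for value in row]
    return sum(value is missing for value in cells) / len(cells)


def information_gain(feature, labels):
    """Entropy of the labels minus the entropy left after splitting
    on a nominal feature (one group per distinct feature value)."""
    n = len(labels)
    groups = {}
    for x, y in zip(feature, labels):
        groups.setdefault(x, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


if __name__ == "__main__":
    # Toy data, invented purely for this example.
    labels = ["sick", "sick", "sick", "healthy"]
    rows = [[1, None], [2, 3]]
    print(imbalance_ratio(labels))
    print(missing_fraction(rows))
    print(information_gain(["x", "x", "y", "y"], ["p", "p", "n", "n"]))
```

Measures like these can be computed once per dataset and then compared against the accuracy each classifier achieves, which is the kind of analysis the abstract describes.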


Bibliographic details

Source
Artificial Intelligence in Medicine | 2009 | pp. 370-374 | 5 pages
Conference location: Verona (IT)
Authors
Ajay Kumar Tanwani; Muddassar Farooq
Affiliation

Next Generation Intelligent Networks Research Center (nexGIN RC), National University of Computer & Emerging Sciences (FAST-NUCES), Islamabad, 44000, Pakistan (both authors)

Original format: PDF
Language: English
Chinese Library Classification: artificial intelligence theory; ergonomics; medical physics
Keywords
classification; complexity measures; biomedical datasets
Date indexed: 2022-08-26 14:30:49

Similar literature

1. A contemporary feature selection and classification framework for imbalanced biomedical datasets [J]. Thulasi Bikku, Sambasiva Rao Nandam, Ananda Rao Akepogu. Egyptian Informatics Journal. 2018, No. 3
2. Empirical study of seven data mining algorithms on different characteristics of datasets for biomedical classification applications [J]. Yiyan Zhang, Yi Xin, Qin Li. BioMedical Engineering OnLine. 2017, No. 1
3. Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification [J]. Jinyan Li, Simon Fong, Yunsick Sung. BioData Mining. 2016, No. 1
4. The Role of Biomedical Dataset in Classification [C]. Ajay Kumar Tanwani, Muddassar Farooq. Conference on Artificial Intelligence in Medicine. 2009
5. Classification and Dimensional Reduction Algorithms for Very Large Biomedical Datasets [D]. Li, Huamin. 2017
6. Empirical study of seven data mining algorithms on different characteristics of datasets for biomedical classification applications [O]. Yiyan Zhang, Yi Xin, Qin Li. 2017
7. Classification Potential vs Classification Accuracy: A Comprehensive Study of Evolutionary Algorithms with Biomedical Datasets [O]. Ajay Kumar Tanwani, Muddassar Farooq. 2015
