Journal of Circuits, Systems and Computers

Balancing Assisted Reproductive Technology Dataset for Improving the Efficiency of Incremental Classifiers and Feature Selection Techniques


Abstract

Assisted Reproductive Technology (ART) is a set of medical procedures used primarily to address infertility. The success rate of ART is very low because it is affected by a large number of variables. Machine learning techniques are now applied to predict ART outcomes and to find strategies for improving the success rate, so determining the best-performing classifier for ART is important. Previously, some classifiers were applied to ART using static data. In reality, however, the datasets are dynamic and require a dynamic setup, which can be achieved with incremental classifiers. Because of the low success rate, the ART dataset contains fewer records with positive results, which makes the dataset imbalanced. This research work first finds the best evaluation metric for classification on an imbalanced dataset. It then balances the dataset using three balancing techniques, namely undersampling, oversampling and the Synthetic Minority Oversampling Technique (SMOTE), and applies five incremental classifiers: Stochastic Gradient Descent (SGD), Stochastic Primal Estimated subGrAdient SOlver for Support vector machine (SPegasos), Naive Bayes Updatable, Instance Based (IBk) and Averaged One Dependence Estimators (A1DE) Updatable. From these it identifies the best balancing technique and the most suitable classifier for ART outcome prediction. The results show that, for an imbalanced dataset, the Receiver Operating Characteristic (ROC) Area may be used as the metric instead of accuracy. SMOTE is found to be the best method for balancing the ART dataset, and the IB1 classifier performs well on the balanced data, with a high ROC Area of 92.3. Finally, various feature selection methods are applied to the three best-performing classifiers, and a suitable feature selection method is identified for each.
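Two ideas in the abstract can be illustrated with a small, self-contained Python sketch: why accuracy can mislead on an imbalanced dataset while ROC Area does not, and how SMOTE-style oversampling creates synthetic minority records by interpolation. This is not the authors' implementation (the abstract does not give one). The function names, the 95/5 class split and the two-feature minority points are all made-up assumptions chosen only for illustration.

```python
import random

def accuracy(y_true, y_pred):
    """Fraction of labels predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def roc_auc(y_true, scores):
    """ROC Area as the probability that a random positive record
    outscores a random negative record (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def smote_like(minority, n_new, k=3, seed=0):
    """Simplified SMOTE-style oversampling (hypothetical helper):
    each synthetic point lies on the segment between a minority point
    and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x by squared Euclidean distance.
        neighbours = sorted(
            (m for m in minority if m is not x),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

# Toy imbalanced labels (assumed): 95 negative outcomes, 5 positive.
y_true = [0] * 95 + [1] * 5

# A useless model that always predicts "negative" with a constant score.
always_negative = [0] * 100
constant_scores = [0.0] * 100

acc = accuracy(y_true, always_negative)  # high despite learning nothing
auc = roc_auc(y_true, constant_scores)   # chance level

# Toy minority records with two made-up features each.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5), (1.8, 2.1)]
synthetic = smote_like(minority, n_new=90)
balanced_minority = minority + synthetic  # now 95 records, matching the majority

print(f"accuracy={acc:.2f}, ROC Area={auc:.2f}, minority size={len(balanced_minority)}")
```

In this toy setting the always-negative model reaches 95% accuracy while its ROC Area stays at chance level, which is the kind of gap that motivates using ROC Area on imbalanced data. A real study would more likely rely on an established SMOTE implementation than on this simplified interpolation.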
