Feature selection methods for intelligent systems classifiers in healthcare.


Abstract

Data mining uses a variety of techniques to detect patterns related to health states and outcomes that are not easily detected using traditional statistical methods. Feature selection is the step in which clusters of potentially important variables are identified. This study examined feature selection methods for an intelligent systems classifier (ICS), a computer system that learns. The classification target was a binary variable: self-reported activity status relative to others of the same age and gender. The sample comprised the 20,050 adult cases in the NHANES III national health survey. The independent variables were feature selection method (filter or wrapper) and psychosocial feature inclusion status (ablated or unablated). The dependent variable was classification performance. The method was to generate four feature sets using genetic algorithms (GA), build a neural network with each feature set, test the classification performance of each network, and compare the results. Judged descriptively by error rate, a contingency matrix, and area under the receiver operating characteristic curve (AUROC), the four sets did not appear to differ when the classes were combined. However, sensitivity (for the positive class, class 1) was lower than specificity in all four sets. When the psychosocial features were included in the search space (the unablated condition) along with laboratory values and physical exam findings, the GA wrapper (non-linear) performed better than the filter (linear). The number of features was reduced by approximately 90%. A Z test showed that the AUROC for each set differed significantly from random classification performance (p < .001, SE = .008, two-tailed, non-parametric, n = 5048), but no pair of sets differed significantly at alpha = 0.05.

A search of healthcare indexes returned no reports of NHANES III being used with any artificial intelligence method. Genetic algorithms were empirically shown to be useful for feature selection with the high-dimensional, mixed, diverse data produced by healthcare. No report of a genetic algorithm was found in the nursing research literature.
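
The two AUROC comparisons described in the abstract (each set against chance, and sets against each other) can be illustrated with a minimal sketch. This is not the author's analysis: the AUROC values below are hypothetical placeholders, since the abstract does not report them, and the pairwise formula treats the two estimates as independent, which is only an approximation when both networks are scored on the same test cases. Only SE = .008 and alpha = 0.05 come from the abstract.

```python
import math


def normal_two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def z_vs_chance(auroc: float, se: float) -> float:
    """Z statistic for one AUROC against chance performance (AUROC = 0.5)."""
    return (auroc - 0.5) / se


def z_between(auroc_a: float, se_a: float, auroc_b: float, se_b: float) -> float:
    """Z statistic for the difference of two AUROCs.

    Assumption: the two estimates are treated as independent, so the
    variances simply add. Correlated estimates would need a covariance term.
    """
    return (auroc_a - auroc_b) / math.sqrt(se_a ** 2 + se_b ** 2)


SE = 0.008    # standard error reported in the abstract
ALPHA = 0.05  # significance level reported in the abstract

# Hypothetical AUROC values, invented for illustration only.
auroc_wrapper = 0.70
auroc_filter = 0.69

# Comparison 1: one feature set against random classification.
z_chance = z_vs_chance(auroc_wrapper, SE)
p_chance = normal_two_tailed_p(z_chance)

# Comparison 2: two feature sets against each other.
z_pair = z_between(auroc_wrapper, SE, auroc_filter, SE)
p_pair = normal_two_tailed_p(z_pair)

print(f"vs chance: z = {z_chance:.2f}, significant = {p_chance < ALPHA}")
print(f"pairwise:  z = {z_pair:.2f}, significant = {p_pair < ALPHA}")
```

With a standard error this small, any AUROC more than about 0.016 away from 0.5 is significant against chance, whereas two sets whose AUROCs differ by only 0.01 are not distinguishable from each other. That matches the pattern the abstract reports.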
