Explores a promising data mining approach. Although few examples were available in the authors' application domain relative to the large number of attributes, the experimental results are very promising. The discovered rules achieved good predictive accuracy, both as a complete rule set and as individual rules. More important from a data mining viewpoint, the system discovered some comprehensible rules. Notably, the system achieved very consistent results working from a "tabula rasa," without any background knowledge and with only a small number of examples. The authors emphasize that their system is still at an experimental, research stage of development. Therefore, the results presented here should not be used alone for real-world diagnoses without consulting a physician. Future research includes careful attribute selection in a preprocessing step, to reduce the number of attributes (and the corresponding search space) given to the genetic programming (GP) system. Attribute selection is a very active research area in data mining. The results obtained so far indicate that GP is a useful data mining tool, but future work should also apply the proposed GP system to other data sets to further validate the results reported in this article.