Published in: Journal of International Technology and Information Management

Entropy Based Feature Selection For Multi-Relational Naïve Bayesian Classifier



Abstract

Industrial data today are typically stored in relational structures. The usual approach to mining such data is to join several relations into a single relation through foreign-key links, a step known as flattening. Flattening can be time-consuming and can introduce data redundancy and statistical skew. The critical question is therefore how to mine data directly across multiple relations, and the approach that addresses it is multi-relational data mining (MRDM). A further issue is that irrelevant or redundant attributes in a relation may contribute nothing to classification accuracy. Feature selection is thus an essential preprocessing step in multi-relational data mining: filtering irrelevant or redundant features out of the relations improves classification accuracy, yields good time performance, and makes the models easier to understand. We propose an entropy-based feature selection method for the multi-relational naïve Bayesian classifier. The method uses the InfoDist and Pearson's correlation parameters to filter irrelevant and redundant features out of the multi-relational database and thereby enhance classification accuracy. We evaluated the algorithm on the PKDD financial dataset and achieved better accuracy than existing feature selection methods.
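
The abstract names InfoDist and Pearson's correlation but does not define them or say how they are combined. The sketch below is therefore an illustration under stated assumptions, not the authors' algorithm: it assumes InfoDist is the information distance H(X|Y) + H(Y|X) = 2H(X,Y) - H(X) - H(Y), treats a small distance to the class label as relevance, and treats a high absolute Pearson correlation with an already kept feature as redundancy. It works on a single table of discrete features; the function names, the thresholds `max_dist` and `max_corr`, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def info_dist(xs, ys):
    """Assumed definition: H(X|Y) + H(Y|X) = 2*H(X,Y) - H(X) - H(Y)."""
    joint = entropy(list(zip(xs, ys)))
    return 2 * joint - entropy(xs) - entropy(ys)


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def select_features(features, labels, max_dist, max_corr):
    """Greedy filter (hypothetical): keep relevant, non-redundant features.

    features: dict mapping feature name -> list of numeric, discrete values
    labels:   list of class labels, same length as each feature column
    """
    # Most relevant first: smallest information distance to the class.
    ranked = sorted(features, key=lambda f: info_dist(features[f], labels))
    kept = []
    for name in ranked:
        if info_dist(features[name], labels) > max_dist:
            continue  # irrelevant: too far from the class label
        if any(abs(pearson(features[name], features[k])) > max_corr for k in kept):
            continue  # redundant: strongly correlated with a kept feature
        kept.append(name)
    return kept


# Toy data, invented for illustration only.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
features = {
    "balance_level": [0, 0, 1, 1, 0, 1, 0, 1],  # mirrors the label
    "balance_copy": [0, 0, 1, 1, 0, 1, 0, 1],   # duplicate of balance_level
    "noise": [0, 1, 0, 1, 1, 0, 0, 1],          # independent of the label
}
selected = select_features(features, labels, max_dist=1.0, max_corr=0.9)
print(selected)
```

In this toy run the duplicate column is dropped as redundant and the independent column as irrelevant. In a multi-relational setting, the same filter would presumably be applied per relation, with class labels propagated along foreign-key links rather than through a flattened join; the abstract does not give that detail.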
