A two-stage discretization algorithm based on information entropy

Wen Liu-Ying; Min Fan; Wang Shi-Yuan


Abstract

Discretization is an important and difficult preprocessing task for data mining and knowledge discovery. Although there are numerous discretization approaches, many suffer from certain drawbacks. Local approaches are efficient, but their generalization ability is weak. Global approaches consider all attributes simultaneously, but they have high time and space complexities. In this paper, we propose a two-stage discretization (TSD) algorithm based on information entropy. In the local discretization stage, we independently select k strong cuts for each attribute to minimize conditional entropy. The goal is to rapidly reduce the cardinality of the attributes, with minor information loss. In the global discretization stage, cuts for all attributes are considered simultaneously to form a scaled decision system. The minimal cut set that preserves the positive region is finally selected. We tested the new algorithm and seven popular algorithms on 28 datasets. Compared with other approaches, our algorithm has the best generalization ability, with a good information preserving ability, the highest classification accuracy, and reasonable time consumption.
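The local stage described in the abstract (choosing k cuts per attribute to minimize conditional entropy) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define "strong cuts" or the exact selection procedure, so the greedy search, the midpoint candidate cuts, and the function names `entropy`, `conditional_entropy`, and `select_k_cuts` are all assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def conditional_entropy(values, labels, cuts):
    """Entropy of the class given the intervals that `cuts` induce on one attribute."""
    blocks = {}
    for v, y in zip(values, labels):
        # Interval index = number of cuts lying strictly below the value.
        idx = sum(v > c for c in cuts)
        blocks.setdefault(idx, []).append(y)
    n = len(labels)
    return sum(len(b) / n * entropy(b) for b in blocks.values())


def select_k_cuts(values, labels, k):
    """Greedily pick up to k cuts (assumed strategy, not taken from the paper).

    Candidate cuts are midpoints between consecutive distinct values; each
    step adds the candidate that gives the lowest conditional entropy.
    """
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    chosen = []
    for _ in range(min(k, len(candidates))):
        best = min(
            (c for c in candidates if c not in chosen),
            key=lambda c: conditional_entropy(values, labels, chosen + [c]),
        )
        chosen.append(best)
    return sorted(chosen)


if __name__ == "__main__":
    # Toy attribute: the class changes at 2.5 and again at 4.5.
    values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    labels = ["a", "a", "b", "b", "a", "a"]
    cuts = select_k_cuts(values, labels, k=2)
    print(cuts, conditional_entropy(values, labels, cuts))
```

On this toy attribute the two selected cuts are 2.5 and 4.5, which split the values into three pure intervals and so bring the conditional entropy to zero. The global stage (forming a scaled decision system and choosing a minimal cut set that preserves the positive region) is not covered by this sketch.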


Bibliographic details

Source
Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies | 2017, Issue 4 | 17 pages
Authors
Wen Liu-Ying; Min Fan; Wang Shi-Yuan
Affiliations
School of Computer Science, Southwest Petroleum University, Chengdu 610500, Sichuan, People's Republic of China (listed for all three authors)
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
Keywords
Classification; Discretization; Information entropy; Real-value attribute; Scaling

