Incremental Information Mining

Abstract

Decision tree classification is a simple and important mining function. Decision tree algorithms are computationally intensive, yet they do not capture evolutionary trends in an incrementally growing data repository. In conventional mining approaches, if two or more datasets are merged into a single target dataset, the entire computation for constructing a classifier has to be carried out again. Previous work in this field has constructed individual decision tree classifiers and merged them either by voted arbitration or by merging the corresponding decision rules.

We attempt a new approach: we preprocess the individual windows of the growing database, extract information from each window, and store the result as the Knowledge Concentrate (KC) of that window. The KCs are formed in off-line mode. Mining operations use the KCs instead of the entire past data, thereby reducing the time and space complexity of the mining process. The user dynamically selects the target dataset by identifying the windows of interest. The mining requirement is satisfied by merging the respective KCs and running the decision tree algorithm on the merged KC.

The proposed system operates in three phases. The first is the planning phase, in which the dataset domain information is gathered and the data mining goals are defined. The second phase makes a single scan of a window in the database and generates a summary of that window as a KC. In our work we use an efficient dynamic trie structure to store the KCs. The third phase merges the KCs of the desired windows and applies the classification algorithm to the aggregate of the KCs to give the final required classifier.
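The second and third phases can be illustrated with a minimal sketch. The abstract does not describe the layout of the trie, so everything here is an assumption made for illustration: each trie level corresponds to one attribute, the node reached by a full attribute path holds class-label counts, and merging two KCs sums their counts. The names `TrieNode`, `build_kc`, `merge_kcs`, and `rows` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    """Assumed layout: one trie level per attribute, class counts at path ends."""
    children: dict = field(default_factory=dict)
    class_counts: Counter = field(default_factory=Counter)


def build_kc(window):
    """Single scan over a window of (attribute_values, label) records."""
    root = TrieNode()
    for values, label in window:
        node = root
        for v in values:
            node = node.children.setdefault(v, TrieNode())
        node.class_counts[label] += 1
    return root


def merge_kcs(a, b):
    """Return a new KC whose counts are the sums of the two input KCs."""
    merged = TrieNode(class_counts=a.class_counts + b.class_counts)
    empty = TrieNode()  # stand-in for a branch missing on one side; never mutated
    for key in a.children.keys() | b.children.keys():
        merged.children[key] = merge_kcs(
            a.children.get(key, empty), b.children.get(key, empty)
        )
    return merged


def rows(node, prefix=()):
    """Yield (attribute_values, label, count) for every stored combination."""
    for label, n in node.class_counts.items():
        yield prefix, label, n
    for value, child in node.children.items():
        yield from rows(child, prefix + (value,))


# Two hypothetical windows of an evolving database.
window_1 = [(("sunny", "hot"), "no"), (("sunny", "hot"), "no"), (("rain", "mild"), "yes")]
window_2 = [(("sunny", "hot"), "yes"), (("rain", "mild"), "yes")]

kc = merge_kcs(build_kc(window_1), build_kc(window_2))
summary = sorted(rows(kc))
```

In this sketch each window is scanned once, and the merged KC holds one count per distinct (attribute values, class) combination, so a later classifier can work from the counts without rereading the raw records.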
The salient issue addressed in this work is the formation of a condensed form of the database that enables extraction of the patterns in the database; these patterns are the input to a decision-making algorithm that forms the required decision tree. The entire scheme is independent of the decision tree algorithm, in the sense that the user is free to use any standard decision tree algorithm.
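To suggest why a condensed, count-based summary can feed a standard decision tree algorithm, the sketch below computes an ID3-style information gain directly from summary tuples of the form (attribute values, class label, count). The abstract does not say which split criterion or summary format is used; information gain and the tuple format are assumptions chosen for illustration, and `entropy` and `information_gain` are hypothetical helper names.

```python
import math
from collections import Counter, defaultdict


def entropy(counts):
    """Shannon entropy (in bits) of a Counter of class frequencies."""
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values() if n)


def information_gain(summary_rows, attr_index):
    """Gain of splitting on one attribute, using counts instead of raw records."""
    overall = Counter()
    by_value = defaultdict(Counter)
    for values, label, n in summary_rows:
        overall[label] += n
        by_value[values[attr_index]][label] += n
    total = sum(overall.values())
    remainder = sum(
        sum(c.values()) / total * entropy(c) for c in by_value.values()
    )
    return entropy(overall) - remainder


# Hypothetical merged summary: eight records condensed into four counted rows.
summary_rows = [
    (("a", "x"), "yes", 3),
    (("a", "y"), "yes", 1),
    (("b", "x"), "no", 2),
    (("b", "y"), "no", 2),
]

gain_attr0 = information_gain(summary_rows, 0)  # attribute 0 separates the classes
gain_attr1 = information_gain(summary_rows, 1)
```

Because the criterion needs only class counts per attribute value, the same summary could be passed to another count-based criterion such as the Gini index. This is one way to read the claim of algorithm independence, not a description of the authors' implementation.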

Bibliographic record

Source
Society of Photo-Optical Instrumentation Engineers Conference on Data Mining and Knowledge Discovery | 2002 | 12 pages
Authors
SK Gupta; Sqn Ldr P Suresh; Vasudha Bhatnagar
Original format: PDF
Chinese Library Classification: N-532
