SLIQ: A Fast Scalable Classifier for Data Mining

Abstract

Classification is an important problem in the emerging field of data mining. Although classification has been studied extensively in the past, most classification algorithms are designed only for memory-resident data, thus limiting their suitability for mining larger data sets. This paper discusses issues in building a scalable classifier and presents the design of SLIQ, a new classifier. SLIQ is a decision tree classifier that can handle both numeric and categorical attributes. It uses a novel pre-sorting technique in the tree-growth phase. This sorting procedure is integrated with a breadth-first tree-growing strategy to enable classification of disk-resident datasets. SLIQ also uses a new tree-pruning algorithm that is inexpensive and results in compact and accurate trees. The combination of these techniques enables SLIQ to scale to large data sets and to classify data sets irrespective of the number of classes, attributes, and examples (records), thus making it an attractive tool for data mining.
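The pre-sorting idea mentioned in the abstract can be illustrated with a minimal in-memory sketch. It sorts a numeric attribute once and then finds the best binary split in a single scan, updating class histograms incrementally. The abstract does not state the split criterion or the data layout, so the gini index used here, the function names `gini` and `best_numeric_split`, and the toy data are assumptions for illustration only, not the paper's implementation (which targets disk-resident data).

```python
from collections import Counter


def gini(counts, total):
    """Gini impurity of a class histogram (assumed split criterion)."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_numeric_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`.

    Hypothetical helper: the attribute is sorted once, then scanned once
    while the class histograms of both sides are updated incrementally.
    Returns (None, inf) when all values are equal and no split exists.
    """
    # Pre-sorted attribute list: (value, record index) pairs ordered by value.
    attr_list = sorted((v, i) for i, v in enumerate(values))
    n = len(attr_list)
    left = Counter()          # class histogram of records at or below the split
    right = Counter(labels)   # class histogram of records above the split
    best = (None, float("inf"))
    for pos, (v, idx) in enumerate(attr_list[:-1]):
        label = labels[idx]
        left[label] += 1
        right[label] -= 1
        next_v = attr_list[pos + 1][0]
        if next_v == v:
            continue  # no threshold can separate equal values
        n_left = pos + 1
        n_right = n - n_left
        score = (n_left / n) * gini(left, n_left) + (n_right / n) * gini(right, n_right)
        if score < best[1]:
            best = ((v + next_v) / 2, score)
    return best


# Toy data (invented for this sketch): one numeric attribute and a class label.
ages = [23, 40, 55, 30, 45, 60]
risk = ["high", "low", "low", "high", "low", "low"]
threshold, score = best_numeric_split(ages, risk)
```

Because the sort happens once up front, each candidate threshold costs only a histogram update rather than a fresh pass over the records, which is the kind of saving a pre-sorting scheme is after.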

Bibliographic details

Source: International Conference on Extending Database Technology, 1996, 15 pages
Authors: Manish Mehta; Rakesh Agrawal; Jorma Rissanen
Original format: PDF
Chinese Library Classification: various special-purpose databases

Similar literature

1. Power set kernel for feature combination: data mining approach for its fast classifiers [J]. Taku Kudo, Yuji Matsumoto. 電子情報通信学会技術研究報告. 人工知能と知識処理 (Artificial Intelligence and Knowledge Based Processing). 2002, no. 711
2. FastMFDs: a fast, efficient algorithm for mining minimal functional dependencies from large-scale distributed data with Spark [J]. Cheng Feng, Yang Zhe. Journal of Supercomputing. 2019, no. 5
3. SLIQ: A Fast Scalable Classifier for Data Mining [C]. Manish Mehta, Rakesh Agrawal, Jorma Rissanen. International Conference on Extending Database Technology (EDBT'96). 1996
4. A comparative study: Utilizing data mining techniques to classify traffic congestion status [D]. Mirakhorli, Abbas. 2014
5. Data mining: The association of 2‐h postprandial plasma glucose with the fasting plasma glucose in a large Chinese population [O]. Dandan Sun, Dandan Li, Songlin Yu. 2020
6. SLIQ: A Fast Scalable Classifier for Data Mining [O]. Manish Mehta, Rakesh Agrawal, Jorma Rissanen. 1996
