SLIQ: A Fast Scalable Classifier for Data Mining

Abstract

Classification is an important problem in the emerging field of data mining. Although classification has been studied extensively in the past, most classification algorithms are designed only for memory-resident data, thus limiting their suitability for data mining large data sets. This paper discusses issues in building a scalable classifier and presents the design of SLIQ, a new classifier. SLIQ is a decision tree classifier that can handle both numeric and categorical attributes. It uses a novel pre-sorting technique in the tree-growth phase. This sorting procedure is integrated with a breadth-first tree growing strategy to enable classification of disk-resident datasets. SLIQ also uses a new tree-pruning algorithm that is inexpensive and results in compact and accurate trees. The combination of these techniques enables SLIQ to scale for large data sets and classify data sets irrespective of the number of classes, attributes, and examples (records), thus making it an attractive tool for data mining.
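The pre-sorting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`presort`, `best_splits`), the `class_list` and `leaf_of` arrays, and the sample records are hypothetical, the Gini index is assumed as the split criterion because the abstract does not name one, and in-memory lists stand in for disk-resident data. Each numeric attribute is sorted once up front, and a single sequential scan of that sorted list then scores candidate thresholds for every leaf at the current tree level, which is what lets the sort be combined with breadth-first growth.

```python
from collections import Counter


def presort(records, attr):
    """Sort one numeric attribute once: a list of (value, record_index)."""
    return sorted((rec[attr], i) for i, rec in enumerate(records))


def gini(counts, total):
    """Gini impurity of a class histogram (assumed split criterion)."""
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def best_splits(attr_list, class_list, leaf_of):
    """Scan a pre-sorted attribute list once and return, for each current
    leaf, the (weighted_gini, threshold) of its best binary split."""
    # Class histogram of every leaf at the current level.
    totals = {}
    for i, label in enumerate(class_list):
        totals.setdefault(leaf_of[i], Counter())[label] += 1

    below = {leaf: Counter() for leaf in totals}  # classes seen so far
    prev_value = {}                               # last value seen per leaf
    best = {}
    for value, idx in attr_list:
        leaf = leaf_of[idx]
        left = below[leaf]
        n_left = sum(left.values())
        # A threshold is only possible between two distinct values.
        if n_left > 0 and prev_value[leaf] != value:
            total = totals[leaf]
            n = sum(total.values())
            right = total - left  # Counter subtraction drops zero counts
            n_right = n - n_left
            score = (n_left * gini(left, n_left)
                     + n_right * gini(right, n_right)) / n
            threshold = (prev_value[leaf] + value) / 2
            if leaf not in best or score < best[leaf][0]:
                best[leaf] = (score, threshold)
        left[class_list[idx]] += 1
        prev_value[leaf] = value
    return best


# Hypothetical toy data: six records, one numeric attribute, two classes.
records = [{"age": 23}, {"age": 17}, {"age": 43},
           {"age": 68}, {"age": 32}, {"age": 20}]
class_list = ["high", "high", "high", "low", "low", "high"]
age_list = presort(records, "age")

# Root level: every record belongs to leaf 0.
root_split = best_splits(age_list, class_list, [0] * 6)
# Next level: the same sorted list is reused, only leaf_of changes.
level_split = best_splits(age_list, class_list, [0, 0, 0, 1, 1, 1])
```

Because the sorted list is built once and only the per-record leaf assignment changes between levels, no re-sorting is needed as the tree grows. Categorical attributes and the pruning step mentioned in the abstract are outside the scope of this sketch.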

Bibliographic record

Source
International Conference on Extending Database Technology (EDBT '96), 1996, pp. 18-32 (15 pages)
Authors
Manish Mehta; Rakesh Agrawal; Jorma Rissanen
Original format: PDF
Chinese Library Classification: Automation technology, computer technology

Similar literature

1. Power set kernel for feature combination: data mining approach for its fast classifiers [J]. Taku Kudo, Yuji Matsumoto. 電子情報通信学会技術研究報告. 人工知能と知識処理. Artificial Intelligence and Knowledge Based Processing. 2002, No. 711

2. FastMFDs: a fast, efficient algorithm for mining minimal functional dependencies from large-scale distributed data with Spark [J]. Cheng Feng, Yang Zhe. Journal of Supercomputing. 2019, No. 5

3. SLIQ: A Fast Scalable Classifier for Data Mining [C]. Manish Mehta, Rakesh Agrawal, Jorma Rissanen. International Conference on Extending Database Technology. 1996

4. A comparative study: Utilizing data mining techniques to classify traffic congestion status [D]. Mirakhorli, Abbas. 2014

5. Data mining: The association of 2‐h postprandial plasma glucose with the fasting plasma glucose in a large Chinese population [O]. Dandan Sun, Dandan Li, Songlin Yu. 2020

6. SLIQ: A Fast Scalable Classifier for Data Mining [O]. Manish Mehta, Rakesh Agrawal, Jorma Rissanen. 1996