Conference: International Conference on Computer Applications in Industry and Engineering

Functional Analytic Unsupervised and Supervised Data Mining Technology (FAUST)



Abstract

pTree technology represents and processes data differently from the ubiquitous horizontal data technologies. In pTree technology, the data is structured column-wise and the columns are processed horizontally (typically across a few to a few hundred columns), while in horizontal technologies, data is structured row-wise and those rows are processed vertically (often down millions, even billions, of rows). pTree technology is a vertical data technology. pTrees are lossless, compressed, and data-mining-ready data structures [9][10]. pTrees are lossless because the vertical bit-wise partitioning used in pTree technology guarantees that all information is retained completely; no information is lost in converting horizontal data to this vertical format. pTrees are compressed because segments of bit sequences that are either purely 1-bits or purely 0-bits are represented by a single bit. This compression saves a considerable amount of space and, more importantly, facilitates faster processing. pTrees are data-mining ready because the fast, horizontal data mining processes involved can be done without first decompressing the structures. pTree vertical data structures have been exploited in various domains and data mining algorithms, including classification [1,2,3], clustering [4,7], and association rule mining [9], as well as other data mining algorithms. pTree technology is patented in the U.S. Speed improvements are very important in data mining because many quite accurate algorithms require an unacceptable amount of processing time to complete, even with today's powerful computing systems and efficient software platforms. In this paper, we evaluate and compare the speed of various clustering data mining algorithms when using pTree technology.
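The three properties named in the abstract (lossless vertical bit-wise partitioning, compression of pure segments to a single bit, and processing without decompression) can be illustrated with a minimal sketch. This is not the authors' pTree implementation: the function names, the most-significant-bit-first slice order, and the binary halving scheme used to find pure segments are all assumptions made for illustration.

```python
from typing import List, Tuple, Union

# Hypothetical tree type: a pure segment is the int 0 or 1,
# a mixed segment is a (left_half, right_half) tuple.
Tree = Union[int, Tuple["Tree", "Tree"]]


def to_bit_slices(values: List[int], width: int) -> List[List[int]]:
    """Vertical partitioning: slice j holds bit j (MSB first) of every row."""
    return [[(v >> (width - 1 - j)) & 1 for v in values] for j in range(width)]


def from_bit_slices(slices: List[List[int]]) -> List[int]:
    """Rebuild the original column, showing that no information was lost."""
    width = len(slices)
    rows = len(slices[0]) if slices else 0
    return [
        sum(slices[j][i] << (width - 1 - j) for j in range(width))
        for i in range(rows)
    ]


def compress(bits: List[int]) -> Tree:
    """Represent a purely-0 or purely-1 segment (non-empty) by a single bit."""
    if all(b == bits[0] for b in bits):
        return bits[0]
    mid = len(bits) // 2
    return (compress(bits[:mid]), compress(bits[mid:]))


def and_trees(a: Tree, b: Tree) -> Tree:
    """AND two trees built over the same row count, without expanding them."""
    if a == 0 or b == 0:
        return 0
    if a == 1:
        return b
    if b == 1:
        return a
    left, right = and_trees(a[0], b[0]), and_trees(a[1], b[1])
    if isinstance(left, int) and left == right:
        return left  # both halves became pure and equal: merge them again
    return (left, right)


def count_ones(tree: Tree, length: int) -> int:
    """Count 1-bits directly on the compressed form."""
    if isinstance(tree, int):
        return tree * length
    mid = length // 2
    return count_ones(tree[0], mid) + count_ones(tree[1], length - mid)


# Illustrative 3-bit column (made-up data, not from the paper).
column = [7, 7, 6, 6, 2, 2, 3, 3]
slices = to_bit_slices(column, width=3)
trees = [compress(s) for s in slices]
# Rows whose highest and lowest bits are both 1, counted without decompressing.
both_set = count_ones(and_trees(trees[0], trees[2]), len(column))
```

In this sketch, a count over a predicate on several bit positions touches only the mixed segments of the trees; pure segments are resolved in a single step, which is the kind of saving the abstract attributes to compression.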
