Approximate Data Mining using Sketches for Massive Data


Abstract

With the popularity of the Web and the Internet, massive amounts of data are generated. However, these enormous datasets make it challenging to apply data mining techniques to extract useful information. Dimensionality reduction can improve both the efficiency and the effectiveness of extracting information from data. In this paper we propose an algorithm that reduces the dimensionality of a dataset such that data mining techniques applied to the reduced dataset yield almost the same results as on the original dataset. A random sketch is used to reduce the dimensions of the dataset.
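
The abstract does not say how the random sketch is constructed, so the following is only a minimal illustration of the general idea, not the authors' algorithm. It assumes a random-sign projection: each d-dimensional point is multiplied by a k x d matrix of random +1/-1 entries and scaled by 1/sqrt(k), which approximately preserves Euclidean distances. The helper names (`make_sketch_matrix`, `sketch`, `distance_ratio`) and the sizes d = 1000 and k = 200 are hypothetical choices made for this example.

```python
import math
import random


def make_sketch_matrix(d, k, seed=0):
    """Return a k x d matrix of random +1/-1 entries (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(d)] for _ in range(k)]


def sketch(x, matrix):
    """Project a d-dimensional vector x down to k dimensions.

    The 1/sqrt(k) scaling keeps squared lengths unchanged in expectation.
    """
    scale = 1.0 / math.sqrt(len(matrix))
    return [scale * sum(r * v for r, v in zip(row, x)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def distance_ratio(d=1000, k=200, seed=0):
    """Ratio of sketched distance to original distance for two random points."""
    rng = random.Random(seed + 1)
    x = [rng.gauss(0.0, 1.0) for _ in range(d)]
    y = [rng.gauss(0.0, 1.0) for _ in range(d)]
    matrix = make_sketch_matrix(d, k, seed)
    return euclidean(sketch(x, matrix), sketch(y, matrix)) / euclidean(x, y)


if __name__ == "__main__":
    # A ratio close to 1 means the distance survived the reduction from d to k.
    print(round(distance_ratio(), 3))
```

A distance-based mining step, such as clustering or nearest-neighbour search, could then be run on the k-dimensional sketches instead of the original points. How close the results stay to those on the full data depends on k and on the dataset.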


Bibliographic details

Source: International Conference on Computational Intelligence Modeling Techniques and Applications, 2014, 7 pages
Authors: Parul Gupta; Swati Agnihotri; Suman Saha
Original format: PDF
CLC classification: Computability theory
Keywords: Dimension Reduction; Random Sketch

