Dynamic data mining on multi-dimensional data.



Abstract

With the advance of modern technology, multi-dimensional data is being generated at an explosive rate in many disciplines, which makes the resulting mass of data much harder to comprehend and interpret. Existing data analysis techniques have difficulty handling multi-dimensional data, largely because of the inherent sparsity of the points.

A first step toward addressing this challenge is the use of clustering techniques, which are essential in the data mining process for revealing natural structures and identifying interesting patterns in the underlying data. Cluster analysis is used to identify homogeneous and well-separated groups of objects in databases. The need to cluster large quantities of multi-dimensional data is widely recognized. It is a classical problem in the database, artificial intelligence, and theoretical literature, and plays an important role in many fields of business and science.

Many approaches have also been designed for outlier detection. In many situations, clusters and outliers are concepts whose meanings are inseparable from each other, especially for data sets with noise. Thus, it is necessary to treat clusters and outliers as concepts of equal importance in data analysis.

It is well acknowledged that a large proportion of real-world data has irrelevant features, which may reduce the accuracy of some algorithms. High-dimensional data sets continue to pose a challenge to clustering algorithms at a very fundamental level. One well-known technique for improving data analysis performance is dimension reduction, which is often used in clustering, classification, and many other machine learning and data mining applications.

Many approaches have been proposed to index high-dimensional data sets for efficient querying. Although most of them can efficiently support nearest neighbor search for low-dimensional data sets, they degrade rapidly as dimensionality increases. In addition, the dynamic insertion of new data can leave the original structures unable to handle the data sets efficiently, since it may greatly increase the amount of data accessed for a query.

In this dissertation, we study the problems mentioned above. We propose a novel data pre-processing technique called shrinking, inspired by Newton's law of universal gravitation, which optimizes the inner structure of the data. We then propose a shrinking-based clustering algorithm for multi-dimensional data and extend the algorithm to the dimension reduction field, resulting in a shrinking-based dimension reduction algorithm. (Abstract shortened by UMI.)
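The abstract's remark that nearest neighbor search degrades as dimensionality grows can be illustrated with a small experiment. The following is a minimal sketch and is not taken from the dissertation: the function name relative_contrast, the choice of uniformly random points in the unit hypercube, and the sample sizes are all assumptions made for illustration. It measures how much farther the farthest point is than the nearest one from a random query; when that ratio approaches zero, distance-based pruning in an index has little to work with.

```python
import math
import random


def relative_contrast(num_points: int, dim: int, seed: int = 0) -> float:
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query to num_points uniform random points in [0, 1]^dim.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    query = [rng.random() for _ in range(dim)]
    distances = []
    for _ in range(num_points):
        point = [rng.random() for _ in range(dim)]
        distances.append(math.dist(query, point))
    d_min, d_max = min(distances), max(distances)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # The contrast is expected to shrink as the dimension grows.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(500, dim), 3))
```

Under these assumptions the contrast is large in two dimensions and small in a thousand, which is one common way to make the sparsity problem described in the abstract concrete. Real data sets with correlated features may behave less severely than uniform random data.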


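Bibliographic record

Author: Shi, Yong
Author affiliation: State University of New York at Buffalo
Degree-granting institution: State University of New York at Buffalo
Subject: Computer Science
Degree: Ph.D.
Year: 2006
Pages: 229
Original format: PDF
Language of text: English
Chinese Library Classification: Automation technology, computer technology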

Similar literature

1. Dynameomics: a multi-dimensional analysis-optimized database for dynamic protein data. [J]. Kehl C, Simms AM, Toofanny RD. Protein engineering design & selection: PEDS. 2008, Issue 6
2. Toward intelligent data warehouse mining: An ontology-integrated approach for multi-dimensional association mining. [J]. Chin-Ang Wu, Wen-Yang Lin, Chang-Long Jiang. Expert Systems with Applications. 2011, Issue 9
3. yMGV: a database for visualization and data mining of published genome-wide yeast expression data. [J]. Marc P, Devaux F, Jacq C. Nucleic Acids Research. 2001, Issue 13
4. Modification of search space (in network data) by user interaction with a Virtual reality (VR) representation of a network: The use of VRDM (Virtual Reality Data Mining) for rapid, accurate, mining of network data. [C]. K.E. Burn-Thornton, C. Radix. International Conference on Data Mining. 2002
5. Mining Dynamic Relationships From Spatio-temporal Datasets: An Application to Brain fMRI Data. [D]. Atluri, Gowtham. 2014
6. RiboMiner: a toolset for mining multi-dimensional features of the translatome with ribosome profiling data. [O]. Fajin Li, Xudong Xing, Zhengtao Xiao. 2020
7. Interactive data mining and visualization on multi-dimensional data. [O]. 1999
