A methodology for spatial and time series data mining and its applications.

Abstract

In this dissertation, we present several methodologies for mining spatial and time-sequence data obtained in diverse domains. We first propose a new spatial randomness test and classification method for binary spatial data, with specific application to the detection and identification of spatial defect patterns on semiconductor wafer maps. We present the generalized join-count (JC)-based statistic as an alternative approach, and derive a procedure to determine the optimal weights of JC-based statistics. In the proposed methodology, a spatial correlogram, which transforms binary spatial data into time-sequence data, is used as a novel feature to detect spatial autocorrelation and classify spatial defect patterns on the wafer maps.

Secondly, we propose a novel distance measure, denoted weighted dynamic time warping (WDTW), for time series classification and clustering problems. The dynamic time warping (DTW) algorithm has been extensively used as a distance measure in combination with distance-based classifiers. However, the DTW algorithm ignores the relative importance of the phase distance between points in a time series, possibly leading to misclassification. Therefore, we propose a WDTW distance measure that accounts for the relative importance of each point in terms of the phase distance between the time series points.

Thirdly, we propose a wavelet-based anomaly detection procedure to detect possible process faults in time-sequence data that have some local variations even under normal working conditions. To handle the large number of parameters in both the mean and variance models, we have developed a wavelet-based mean and variance thresholding procedure that extracts a few important wavelet coefficients that may explain local variations in the time domain.

Finally, we propose a kernel-based regression with lagged dependent variables. Kernel-based regression techniques are extensively used for exploring the nonlinearity of data through a relatively easy procedure involving various kernel functions. However, the major drawback of current kernel-based regression techniques is their underlying assumption that there is no autocorrelation in the residuals of observations. To avoid this problem, we propose a kernel-based regression model with lagged dependent variables (LDVs), which accounts for both autocorrelation in the response variable and nonlinearity in the data.
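The weighted dynamic time warping idea described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the form of the weights, so the logistic weight on the phase difference |i - j| used here, the parameter names `g` and `w_max`, and the function names `logistic_weights` and `wdtw_distance` are all assumptions made for illustration, not the dissertation's definition.

```python
import math


def logistic_weights(n, g=0.05, w_max=1.0):
    """Assumed weight for each phase difference d = 0..n-1.

    A logistic curve centred on n/2: small phase differences get small
    weights, large ones approach w_max. g controls the steepness.
    """
    mid = n / 2.0
    return [w_max / (1.0 + math.exp(-g * (d - mid))) for d in range(n)]


def wdtw_distance(a, b, g=0.05):
    """Dynamic time warping with a phase-dependent penalty (illustrative).

    The local cost of aligning a[i] with b[j] is the squared difference
    scaled by the weight for |i - j|, so alignments between points that
    are far apart in time cost more than in plain DTW.
    """
    n, m = len(a), len(b)
    w = logistic_weights(max(n, m), g)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = w[abs(i - j)] * (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]
```

With `g=0` every weight is the same constant, so the measure reduces to a scaled version of ordinary DTW; larger `g` penalizes alignments between points that are further apart in phase more strongly.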


Bibliographic record

Author: Jeong, Young-Seon
Author affiliation: Rutgers, The State University of New Jersey - New Brunswick
Degree-granting institution: Rutgers, The State University of New Jersey - New Brunswick
Subject: Engineering, Industrial
Degree: Ph.D.
Year: 2011
Pages: 157
Original format: PDF
Language: English

