A methodology for spatial and time series data mining and its applications.

Abstract

In this dissertation, we present several methodologies for mining spatial and time-sequence data obtained in diverse domains. We first propose a new spatial randomness test and classification method for binary spatial data, with specific application to the detection and identification of spatial defect patterns on semiconductor wafer maps. We present the generalized join-count (JC)-based statistic as an alternative approach, and derive a procedure to determine the optimal weights of JC-based statistics. In the proposed methodology, a spatial correlogram, which transforms binary spatial data into time-sequence data, is used as a novel feature to detect spatial autocorrelation and classify spatial defect patterns on the wafer maps.

Secondly, we propose a novel distance measure, denoted weighted dynamic time warping (WDTW), for time series classification and clustering problems. The dynamic time warping (DTW) algorithm has been extensively used as a distance measure in combination with distance-based classifiers. However, the DTW algorithm ignores the relative importance of the phase distance between points in a time series, possibly leading to misclassification. Therefore, we propose a WDTW distance measure that accounts for the relative importance of each point in terms of the phase distance between the time series points.

Thirdly, we propose a wavelet-based anomaly detection procedure to detect possible process faults in time-sequence data that show some local variations even under normal working conditions. To handle the large number of parameters in both the mean and variance models, we have developed a wavelet-based mean and variance thresholding procedure to extract a few important wavelet coefficients that may explain local variations in the time domain.

Finally, we propose a kernel-based regression with lagged dependent variables. Kernel-based regression techniques are extensively used for exploring the nonlinearity of data through a relatively easy procedure involving various kernel functions. However, the major drawback of current kernel-based regression techniques is their underlying assumption that there is no autocorrelation in the residuals of observations. To avoid this problem, we propose a kernel-based regression model with lagged dependent variables (LDVs) that accounts for both the autocorrelation of the response variable and the nonlinearity of the data.
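As a rough illustration of the WDTW idea described in the abstract, the following Python sketch makes alignments between points that are far apart in phase more expensive than alignments between nearby points. The abstract does not state the weight function, so the logistic form in `logistic_weights`, the parameters `g` and `w_max`, and the squared local cost are assumptions made for this example rather than the dissertation's definitions.

```python
import math


def logistic_weights(n, w_max=1.0, g=0.1):
    """Weight for each phase difference d = 0..n-1.

    Assumed form: a logistic curve centred on n / 2, so that larger
    phase differences receive larger weights when g > 0.
    """
    mid = n / 2.0
    return [w_max / (1.0 + math.exp(-g * (d - mid))) for d in range(n)]


def wdtw_distance(a, b, g=0.1, w_max=1.0):
    """Weighted DTW distance between two numeric sequences a and b."""
    n, m = len(a), len(b)
    w = logistic_weights(max(n, m), w_max, g)
    inf = float("inf")
    # cost[i][j] = best cumulative weighted cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Local cost is scaled by the weight of the phase difference |i - j|.
            local = w[abs(i - j)] * (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = local + min(
                cost[i - 1][j],      # step in a only
                cost[i][j - 1],      # step in b only
                cost[i - 1][j - 1],  # step in both
            )
    return math.sqrt(cost[n][m])


if __name__ == "__main__":
    s1 = [0.0, 1.0, 2.0, 1.0, 0.0]
    s2 = [0.0, 0.0, 1.0, 2.0, 1.0]
    print(wdtw_distance(s1, s2, g=0.0))
    print(wdtw_distance(s1, s2, g=0.5))
```

In this sketch, `g = 0` gives every phase difference the same weight, so the measure behaves like a uniformly scaled ordinary DTW. Larger values of `g` make warping across distant time points progressively more expensive.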
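The idea of a kernel-based regression with lagged dependent variables can be sketched as follows: past values of the response are appended to the input features, and a kernel regressor is fitted on the augmented inputs. This is a minimal sketch under stated assumptions, not the dissertation's model: the RBF kernel, the ridge penalty `lam`, the lag order `p`, and all function names here are hypothetical choices made for illustration.

```python
import numpy as np


def make_lagged_design(x, y, p):
    """Build rows [x_t, y_{t-1}, ..., y_{t-p}] for t = p..T-1 and targets y_t."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float).reshape(len(y), -1)
    # Column k holds the response lagged by k steps.
    lags = np.column_stack([y[p - k: len(y) - k] for k in range(1, p + 1)])
    return np.hstack([x[p:], lags]), y[p:]


def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (
        np.sum(A ** 2, axis=1)[:, None]
        + np.sum(B ** 2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_kernel_ridge(Z, target, gamma=1.0, lam=1e-3):
    """Solve (K + lam * I) alpha = target for the dual coefficients alpha."""
    K = rbf_kernel(Z, Z, gamma)
    return np.linalg.solve(K + lam * np.eye(len(Z)), target)


def predict(Z_train, alpha, Z_new, gamma=1.0):
    """Predict responses for new augmented inputs Z_new."""
    return rbf_kernel(Z_new, Z_train, gamma) @ alpha


if __name__ == "__main__":
    t = np.arange(60, dtype=float)
    x = np.sin(0.2 * t)            # exogenous input (synthetic)
    y = np.zeros_like(t)
    for i in range(1, len(t)):     # response depends on its own past value
        y[i] = 0.7 * y[i - 1] + x[i] ** 2
    Z, target = make_lagged_design(x, y, p=2)
    alpha = fit_kernel_ridge(Z, target, gamma=0.5, lam=1e-2)
    fitted = predict(Z, alpha, Z, gamma=0.5)
    print(float(np.mean((fitted - target) ** 2)))
```

Including the lagged responses as inputs lets the regressor absorb serial dependence that would otherwise appear as autocorrelated residuals. In practice the lag order and kernel parameters would need to be chosen by validation.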

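Record details

  • Author

    Jeong, Young-Seon

  • Author affiliation

    Rutgers, The State University of New Jersey - New Brunswick

  • Degree-granting institution: Rutgers, The State University of New Jersey - New Brunswick
  • Discipline: Engineering, Industrial
  • Degree: Ph.D.
  • Year: 2011
  • Pagination: 157 p.
  • Total pages: 157
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification:
  • Keywords: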
