
Supervised and unsupervised machine learning for pattern recognition and time series prediction.


Abstract

The problem of empirical data modeling relates to many engineering applications, such as classification, prediction, and pattern recognition. In Chapter 1, I introduce Machine Learning and Data Mining approaches from a Computer Science and Statistics perspective.

I have developed a new clustering method, DBBIRCH (Density-Based BIRCH), that combines the features of density-based and distance-based clustering algorithms. This method is described in Chapter 2 and is based upon (Bean K., 2007). My algorithm is an online algorithm and has a running time asymptotically equal to that of BIRCH under some realistic assumptions. To improve the accuracy of the "distance-based" step, robust statistics (the trimmed mean) are used. The density-based feature of the algorithm is achieved by combining initial clusters into networks of density-connected clusters. DBBIRCH provides a fast and precise clustering method for mapping data points to their non-spherical clusters. My algorithm is easily modified to perform parallel clustering of large datasets using grid computing. My prototype program used a breast cancer dataset (UCI Machine Learning Repository) and synthetic datasets to support my conclusions.

I have developed a new framework to improve the performance of partition-type algorithms for clustering datasets with missing attributes. Chapter 3 describes this framework, which is based on (Bean K., 2008). I have incorporated CLARA, PAM, and K-means within a framework that remains general enough to allow other clustering algorithms to be used. Initial clustering is performed using a very fast algorithm, BIRCH. This initial step is used to determine input parameters for a more accurate algorithm and to make the prediction of missing attributes more efficient.

Using a neural network model is one of the most popular approaches to flood prediction. This technique, however, has a drawback: the optimal network structure is uncertain. I propose an algorithm for neural network pruning to create a Neural Network with Auto- and Cross-Correlation Models (NN-ACC). I believe this approach can determine the best neural network input. A forecasting framework for the presented NN-ACC model is constructed to perform calculations for a real-world case study (Derwent catchment of Upper Derwent). According to (Dunham M., 2004), NN-ACC gives a much better result than EMM and RLF.
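The abstract names the trimmed mean as the robust statistic used in the distance-based step of DBBIRCH, but gives no further detail. As a rough illustration only, the sketch below computes a per-dimension trimmed mean for a small set of points. The function names `trimmed_mean` and `trimmed_centroid` and the `trim_fraction` parameter are assumptions made for this example and are not taken from the thesis.

```python
from typing import List, Sequence


def trimmed_mean(values: Sequence[float], trim_fraction: float = 0.1) -> float:
    """Mean of the values left after dropping the lowest and highest
    trim_fraction of the sorted values (hypothetical helper)."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= trim_fraction < 0.5:
        raise ValueError("trim_fraction must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # count dropped from each end
    kept = ordered[k:len(ordered) - k]     # never empty because k < n / 2
    return sum(kept) / len(kept)


def trimmed_centroid(points: Sequence[Sequence[float]],
                     trim_fraction: float = 0.1) -> List[float]:
    """Per-dimension trimmed mean of a set of points: a centroid that is
    less sensitive to outliers than the plain mean (illustrative only)."""
    dims = len(points[0])
    return [trimmed_mean([p[d] for p in points], trim_fraction)
            for d in range(dims)]


# Four points near the diagonal plus one outlier.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0), (100.0, -100.0)]
centroid = trimmed_centroid(points, trim_fraction=0.2)
```

With these five points and `trim_fraction=0.2`, one value is dropped from each end of every dimension, so the outlier does not pull the centroid away from the other four points, whereas a plain mean would give 22.0 in the first dimension.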

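Bibliographic record

  • Author: Bean, Kathryn Brenda
  • Author affiliation: The University of Texas at Dallas
  • Degree-granting institution: The University of Texas at Dallas
  • Subject: Statistics; Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 96
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Statistics; Automation and computer technology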
