
Proxy Relearning for Feature-Driven Pattern Recognition in High-Dimensional Imbalanced Time Series Data Sets


Abstract

This thesis explores feature-driven time series pattern recognition for predictive modelling, from both practical and theoretical perspectives, in settings where the data are imbalanced, minority-class examples are scarce, the ratio of feature dimension to sample size is high, and the class labels provided might not be optimized for the application. These problems are common when learning patient-specific patterns in medical and health domains, where labels provided by medical experts might not fit the goal of predictive modelling. Extracting informative labels for supervised learning is a difficult and time-consuming task. A novel strategy is proposed to address these problems; it aims to reduce human effort by automatically finding the earliest pattern that a classifier can recognize. The proposed algorithm locates and learns similar patterns across training examples that maximize the difference between the two classes. This method ensures precise learning and boosts the performance of the classifier by reducing the number of false positives. The performance of the algorithm was evaluated on the classification results and the anticipation responses, using data provided by EPILEPSIAE, a European epilepsy database. The proposed algorithm achieved an average false positive rate of 0.0519 per hour with a sensitivity of 0.79 in anticipating seizures.
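The two figures reported at the end of the abstract can be illustrated with a minimal sketch. It assumes the common definitions (sensitivity as the fraction of seizures that were anticipated, false positive rate as false alarms per hour of monitored seizure-free recording); the thesis may define them differently. The function name, its parameters, and the counts in the usage example are hypothetical and are not taken from the thesis.

```python
def seizure_prediction_metrics(n_seizures, n_anticipated, n_false_alarms, monitored_hours):
    """Return (sensitivity, false positives per hour).

    Assumed definitions, not confirmed by the thesis:
      sensitivity        = anticipated seizures / total seizures
      false positives/hr = false alarms / hours of seizure-free recording
    """
    if n_seizures <= 0 or monitored_hours <= 0:
        raise ValueError("n_seizures and monitored_hours must be positive")
    if not 0 <= n_anticipated <= n_seizures:
        raise ValueError("n_anticipated must lie between 0 and n_seizures")
    sensitivity = n_anticipated / n_seizures
    fp_per_hour = n_false_alarms / monitored_hours
    return sensitivity, fp_per_hour


# Hypothetical counts chosen only to show the arithmetic.
sens, fpr = seizure_prediction_metrics(
    n_seizures=10, n_anticipated=8, n_false_alarms=5, monitored_hours=100.0
)
print(f"sensitivity={sens:.2f}, false positives per hour={fpr:.4f}")
```

Under these assumed definitions, a rate of 0.0519 per hour corresponds to roughly one false alarm every 19 hours of recording (1 / 0.0519 ≈ 19.3).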


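Bibliographic record

Author: Cho, Wilfred Yau-Chuen
Author affiliation: University of Toronto (Canada)
Degree-granting institution: University of Toronto (Canada)
Subject: Engineering; Artificial intelligence; Bioinformatics
Degree: M.A.S.
Year: 2017
Pages: 77 p.
Total pages: 77
Original format: PDF
Language of text: eng
Chinese Library Classification:
Keywords:
Date indexed: 2022-08-17 11:39:13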

Similar literature

1. Improving SVM classification on imbalanced time series data sets with ghost points [J]. Suzan Köknar-Tezel, Longin Jan Latecki. Knowledge and Information Systems. 2011, No. 1
2. Pattern recognition in time series database: A case study on financial database [J]. Yan-Ping Huang, Chung-Chian Hsu, Sheng-Hsuan Wang. Expert Systems with Applications. 2007, No. 1
3. STFMap: Query- and Feature-Driven Visualization of Large Time Series Data Sets [C]. K. Selçuk Candan, Rosaria Rossini, Maria Luisa Sapino. ACM International Conference on Information and Knowledge Management. 2012
4. A new framework for pattern recognition of time-series data [D]. An, Daewon. 2004
5. Hardware Failure Prediction on Imbalanced Times Series Data: [O]. Nadine Rücker, Lea Pflüger, Andreas Maier. 2021
6. State-of-the-Art and Evolution in Public Data Sets and Competitions for System Identification, Time Series Prediction and Pattern Recognition [O]. Joos V, Johan Suykens, Bart De Moor. 2013
7. Generalized Feature Extraction for Structural Pattern Recognition in Time-Series Data [R]. Olszewski, R. T. 2001
