Efficient sample selection in data stream regression employing evolving generalized fuzzy models

Abstract

In this paper, we propose two criteria for efficient sample selection in data stream regression problems. Selection becomes important whenever the target values, which guide the update of the regressors as well as the implicit model structures, are costly to measure. Reducing the samples used for model updates as much as possible while keeping the predictive accuracy of the models at a high level is thus a central challenge, especially in non-stationary environments where (permanent) system changes or expansions can be expected. Our selection criteria rely on two aspects: (1) the extrapolation degree of the model combined with its non-linearity degree, and (2) the uncertainty in model outputs, which can be measured in terms of confidence intervals reflected by so-called adaptive error bars that are updated over time synchronously with the model. The selection criteria are developed in combination with evolving generalized Takagi-Sugeno (TS) fuzzy models (containing rules in arbitrarily rotated position), which were shown in previous publications to outperform conventional evolving TS models (containing axis-parallel rules) and other stream regression techniques. The results on two high-dimensional real-world streaming problems show that decreasing the number of model updates by about 80-85% (as only 15-20% of samples are selected) still achieves accumulated model errors over time similar to those obtained when performing a full update on all samples. This may yield a significant reduction of computational demands, and of costs whenever targets are costly to measure.
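
The single-pass selection idea described in the abstract can be illustrated with a small sketch. It is not the paper's method: a recursive least squares line (`RLSLine`) stands in for the evolving generalized TS fuzzy model, the quantity `phi^T P phi` stands in for the adaptive error bars, and leaving the input range seen so far stands in for the extrapolation degree. The non-linearity degree is not modelled at all. All names, the threshold `unc_threshold`, and the synthetic stream are assumptions made for illustration.

```python
import random


class RLSLine:
    """Recursive least squares fit of y = a*x + b (hypothetical stand-in model)."""

    def __init__(self, delta=1000.0):
        self.w = [0.0, 0.0]                        # parameters [a, b]
        self.P = [[delta, 0.0], [0.0, delta]]      # inverse-Hessian estimate
        self.lo, self.hi = float("inf"), float("-inf")  # input range seen so far

    def _P_phi(self, x):
        # Product P * phi for the regressor vector phi = (x, 1).
        return [self.P[0][0] * x + self.P[0][1], self.P[1][0] * x + self.P[1][1]]

    def predict(self, x):
        return self.w[0] * x + self.w[1]

    def uncertainty(self, x):
        # phi^T P phi: large where few updates have informed the model.
        p = self._P_phi(x)
        return x * p[0] + p[1]

    def extrapolation(self, x):
        # Distance of x outside the input range covered by past updates.
        return max(self.lo - x, x - self.hi, 0.0)

    def update(self, x, y):
        p = self._P_phi(x)
        denom = 1.0 + x * p[0] + p[1]
        gain = [p[0] / denom, p[1] / denom]
        err = y - self.predict(x)
        self.w[0] += gain[0] * err
        self.w[1] += gain[1] * err
        for i in range(2):                         # P <- P - gain * (P phi)^T
            for j in range(2):
                self.P[i][j] -= gain[i] * p[j]
        self.lo, self.hi = min(self.lo, x), max(self.hi, x)


def run_selection(stream, unc_threshold=0.02):
    """Process (x, y) pairs once; use y only for the samples that are selected."""
    model, selected = RLSLine(), 0
    for x, y in stream:
        if model.extrapolation(x) > 0.0 or model.uncertainty(x) > unc_threshold:
            model.update(x, y)                     # the target is "measured" only here
            selected += 1
    return model, selected


def make_stream(n=2000, seed=0):
    """Synthetic stream y = 2x + 1 + noise, with x uniform on [0, 1]."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in xs]
```

Calling `run_selection(make_stream())` returns the fitted model and the number of samples whose targets were actually used. On a stationary stream like this one, selections concentrate at the start and taper off as the uncertainty proxy shrinks. A drift-aware criterion, as the abstract implies, would need the uncertainty to grow again after system changes.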

Bibliographic details

Source
IEEE International Conference on Fuzzy Systems | 2015 | pp. 1-9 | 9 pages
Author
Lughofer Edwin
Original format: PDF
Keywords
data mining; extrapolation; fuzzy set theory; regression analysis; adaptive error bars; confidence intervals; data stream mining; data stream regression; evolving generalized Takagi-Sugeno fuzzy models; extrapolation degree; nonlinearity degree; Adaptation models; Computational modeling; Data mining; Data models; Fuzzy systems; Predictive models; Uncertainty; data stream regression; evolving generalized TS fuzzy systems; extrapolation degree; single-pass sample selection; uncertainty in model outputs;

Similar literature

1. Evolving fuzzy granular modeling from nonstationary fuzzy data streams [J]. Daniel Leite, Rosangela Ballini, Pyramo Costa. Evolving Systems, 2012, no. 2.
2. Convergent Time-Varying Regression Models for Data Streams: Tracking Concept Drift by the Recursive Parzen-Based Generalized Regression Neural Networks [J]. Duda Piotr, Jaworski Maciej, Rutkowski Leszek. International Journal of Neural Systems, 2018, no. 2.
3. Fuzzily Connected Multimodel Systems Evolving Autonomously From Data Streams [J]. Angelov P. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 2011, no. 4.
4. Efficient sample selection in data stream regression employing evolving generalized fuzzy models [C]. Lughofer Edwin. IEEE International Conference on Fuzzy Systems, 2015.
5. Regression Analysis with Selection Biased Dependent Variable (Truncated Data, Stratified Samples, Censored, Kaplan-Meier Estimate, Semi-Parametric Model) [D]. Wang, Mei-Cheng. 1985.
6. The Stream Algorithm: Computationally Efficient Ridge-Regression via Bayesian Model Averaging and Applications to Pharmacogenomic Prediction of Cancer Cell Line Sensitivity [O]. Elias Chaibub Neto, In Sock Jang, Stephen H. Friend.
7. Online evolving fuzzy rule-based prediction model for high frequency trading financial data stream [O]. Gu Xiaowei, Angelov Plamen Parvanov, Mohd Ali Azliza. 2016.
