
GENERATING EFFICIENT SAMPLING STRATEGY PROCESSING FOR BUSINESS DATA RELEVANCE CLASSIFICATION



Abstract

A method for performing efficient data sampling across a storage stack for training machine learning (ML) models. The method includes obtaining, by a processor, data. The processor clusters the data into clusters based on similarities of the obtained data across an entire storage stack including: storage infrastructure metrics, file metrics and application dependency taxonomy. The processor performs a random sampling process to sample representative data from each cluster. The sampled representative data are combined to generate training data for processing predictive analytics.
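The three steps in the abstract (cluster, sample randomly within each cluster, combine) can be sketched as follows. This is only an illustrative sketch, not the patented implementation: the abstract does not name a clustering algorithm, so plain k-means is swapped in as an assumption, and the feature tuple `(iops, file_size_mb, app_tier)`, the function names `kmeans` and `sample_per_cluster`, and the `fraction` parameter are all hypothetical stand-ins for the storage infrastructure metrics, file metrics and application dependency taxonomy.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def sample_per_cluster(records, labels, fraction, seed=0):
    """Draw a uniform random sample from every cluster and merge the samples."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for rec, lab in zip(records, labels):
        clusters[lab].append(rec)
    training = []
    for lab in sorted(clusters):
        members = clusters[lab]
        n = max(1, round(fraction * len(members)))  # at least one record per cluster
        training.extend(rng.sample(members, n))
    return training


# Hypothetical records: (iops, file_size_mb, app_tier). Two obvious groups.
records = [(100.0 + i, 1.0, 0.0) for i in range(10)] + \
          [(5000.0 + i, 900.0, 1.0) for i in range(10)]

labels = kmeans(records, k=2)
training_data = sample_per_cluster(records, labels, fraction=0.3)
```

Sampling within each cluster, rather than from the pooled data, keeps small clusters represented in the training set. In practice the features would need scaling before clustering, since raw metrics with large ranges dominate Euclidean distance.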


Bibliographic data

Publication number: US2017140297A1

Patent type:
Publication date: 2017-05-18

Original format: PDF
Applicant/Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Application number: US201514943915
Inventors: SUSHAMA KARUMANCHI; SUNHWAN LEE; MU QIAO; RAMANI R. ROUTRAY

Filing date: 2015-11-17
Classification: G06N99
Country: US
Indexed on: 2022-08-21 13:51:18
