GENERATING EFFICIENT SAMPLING STRATEGY PROCESSING FOR BUSINESS DATA RELEVANCE CLASSIFICATION

Abstract

A method for performing efficient data sampling across a storage stack for training machine learning (ML) models. The method includes obtaining data by a processor. The processor groups the obtained data into clusters based on similarities across the entire storage stack, including storage infrastructure metrics, file metrics, and application dependency taxonomy. The processor then performs random sampling to draw representative data from each cluster. The sampled representative data are combined to generate training data for predictive analytics processing.
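The cluster-then-sample flow described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the patented method: the abstract does not name a clustering algorithm, so a small k-means stands in for it; the helper names `kmeans` and `sample_per_cluster`, the three-element feature vectors (one storage metric, one file metric, one application-dependency score, assumed already scaled to [0, 1]), and the per-cluster sampling fraction are all hypothetical.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Tiny k-means (a stand-in clustering step); returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def sample_per_cluster(records, labels, fraction, seed=0):
    """Randomly sample a fraction of each cluster and combine the samples."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for rec, lab in zip(records, labels):
        clusters[lab].append(rec)
    training = []
    for lab in sorted(clusters):
        members = clusters[lab]
        n = max(1, round(fraction * len(members)))  # at least one record per cluster
        training.extend(rng.sample(members, n))
    return training


# Hypothetical records: (storage metric, file metric, dependency score) per file.
records = [f"file_{i}" for i in range(8)]
features = [
    (0.10, 0.10, 0.00), (0.12, 0.08, 0.00), (0.09, 0.11, 0.10), (0.11, 0.10, 0.05),
    (0.90, 0.95, 1.00), (0.92, 0.90, 0.90), (0.88, 0.93, 1.00), (0.91, 0.89, 0.95),
]

labels = kmeans(features, k=2)
training_data = sample_per_cluster(records, labels, fraction=0.5)
print(training_data)
```

Sampling within each cluster, rather than from the pooled data, keeps small clusters represented in the training set; the `max(1, ...)` floor is one simple way to guarantee that and is a design choice of this sketch.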