GENERATING EFFICIENT SAMPLING STRATEGY PROCESSING FOR BUSINESS DATA RELEVANCE CLASSIFICATION
Abstract
A method performs efficient data sampling across a storage stack for training machine learning (ML) models. The method includes obtaining, by a processor, data. The processor groups the obtained data into clusters based on similarities across the entire storage stack, including storage infrastructure metrics, file metrics, and application dependency taxonomy. The processor then performs a random sampling process to sample representative data from each cluster. The sampled representative data are combined to generate training data for predictive analytics.
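The cluster-then-sample flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name a clustering algorithm, so plain k-means is swapped in here, and the feature columns (`iops`, `file_size_mb`, `dependency_depth`), the record layout, and the function names `kmeans` and `sample_per_cluster` are all hypothetical.

```python
import random
from collections import defaultdict


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means, an assumed stand-in (the abstract names no algorithm)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members; leave it if empty.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels


def sample_per_cluster(records, labels, per_cluster, seed=0):
    """Randomly sample up to `per_cluster` records from each cluster and
    combine the samples into one training set."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for rec, lab in zip(records, labels):
        clusters[lab].append(rec)
    training = []
    for lab in sorted(clusters):
        members = clusters[lab]
        training.extend(rng.sample(members, min(per_cluster, len(members))))
    return training


# Hypothetical records: (iops, file_size_mb, dependency_depth) per file.
features = [
    (100, 1, 1), (110, 2, 1), (105, 1, 2),            # small, lightly used
    (5000, 900, 7), (5100, 950, 8), (4900, 920, 7),   # large, heavily used
]
records = [{"id": i, "features": f} for i, f in enumerate(features)]

labels = kmeans(features, k=2)
training = sample_per_cluster(records, labels, per_cluster=2)
print([r["id"] for r in training])
```

Sampling a fixed number of records from every cluster keeps small clusters represented in the training data, which uniform sampling over the whole data set would not guarantee. In practice the features would likely need scaling before clustering so that no single metric dominates the distance.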