Automatic discovery of relevant data in massive datasets
Abstract
An approach for discovering relevant data in massive datasets. Compare datasets, each including compare key fields and compare data fields, are received together with a core dataset including target data field(s) and core field(s). The compare datasets are categorized into direct and indirect related dataset pools based on the correlation strength of the target data field(s) with matching compare and core fields. The direct related dataset pool and the core dataset are transformed into reduction datasets based on a statistical measure of the values of the target data fields, shared key fields and compare data fields. Target correlations of the reduction datasets are created based on the reduction compare data fields and target data fields. A statistical relationship strength between the core dataset and the direct related dataset pool is created based on a statistical mean of the target correlations, and a relevancy data store is created.
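The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes Pearson correlation as the correlation measure, the per-key mean as the "statistical measure" used for reduction, and an arbitrary threshold of 0.5 for separating direct from indirect datasets. For brevity it also reuses the reduced correlations for the categorization step. All function and field names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; 0.0 when undefined."""
    if len(xs) < 2:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def reduce_by_key(rows, key, field):
    """Reduction dataset (assumed form): mean of `field` per key value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row[field])
    return {k: mean(v) for k, v in groups.items()}


def target_correlation(core_red, cmp_red):
    """Correlate reduced target and compare values over shared keys."""
    shared = sorted(core_red.keys() & cmp_red.keys())
    return pearson([core_red[k] for k in shared],
                   [cmp_red[k] for k in shared])


def build_relevancy_store(core_rows, compare_sets, key, target, threshold=0.5):
    """compare_sets maps a dataset name to (rows, list of compare data fields)."""
    core_red = reduce_by_key(core_rows, key, target)
    store = {"direct": {}, "indirect": []}
    for name, (rows, fields) in compare_sets.items():
        corrs = {f: target_correlation(core_red, reduce_by_key(rows, key, f))
                 for f in fields}
        # Assumed rule: a dataset is "direct" if any field correlates strongly.
        if any(abs(c) >= threshold for c in corrs.values()):
            store["direct"][name] = {
                "target_correlations": corrs,
                # Relationship strength: mean of absolute target correlations.
                "strength": mean([abs(c) for c in corrs.values()]),
            }
        else:
            store["indirect"].append(name)
    return store


core_rows = [{"id": 1, "y": 1.0}, {"id": 1, "y": 3.0},
             {"id": 2, "y": 4.0}, {"id": 3, "y": 6.0}]
compare_sets = {
    "A": ([{"id": 1, "a": 10}, {"id": 2, "a": 20}, {"id": 3, "a": 30}], ["a"]),
    "B": ([{"id": 1, "b": 5}, {"id": 2, "b": 5}, {"id": 3, "b": 5}], ["b"]),
}
store = build_relevancy_store(core_rows, compare_sets, key="id", target="y")
```

In this toy input, dataset A tracks the reduced target values and lands in the direct pool, while dataset B is constant, has an undefined correlation treated as zero, and lands in the indirect pool.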