
Automatic generation of composite datasets based on hierarchical fields


Abstract

Datasets are annotated with metadata including categories. Each category corresponds to one or more fields. A hierarchy mapping is generated to indicate the hierarchical relationships between different categories. A natural language query specifies a first granularity level indicating a particular category and one or more field values corresponding to the particular category. Based on the hierarchy mapping, one or more categories that are hierarchically related to the particular category are identified. Based on the metadata, two or more datasets that include at least one hierarchically related category are selected. Based on the first granularity level, one or more dataset filters are generated. The one or more dataset filters are translated to a second granularity level corresponding to the at least one hierarchically related category. The translated filters are applied to at least one of the selected datasets. The two or more datasets are joined to generate a composite dataset.
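The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the category names, the `HIERARCHY` and `PARENT_VALUE` tables, the `Dataset` class, and every function name below are hypothetical, the natural language parsing step is skipped (the query is assumed to be already parsed into a category and a set of values), and only datasets at the query's granularity level or finer are handled.

```python
from dataclasses import dataclass
from functools import reduce

# Hypothetical hierarchy mapping: child category -> parent category.
HIERARCHY = {"city": "state", "state": "country"}

# Hypothetical value-level mapping: (child category, child value) -> parent value.
PARENT_VALUE = {
    ("city", "Austin"): "Texas",
    ("city", "Dallas"): "Texas",
    ("city", "Boston"): "Massachusetts",
    ("state", "Texas"): "USA",
    ("state", "Massachusetts"): "USA",
}


@dataclass
class Dataset:
    name: str
    categories: dict  # metadata annotation: category -> field name
    rows: list        # list of dict records


def finer_categories(category):
    """Categories that sit below `category` in the hierarchy mapping."""
    found, frontier = set(), [category]
    while frontier:
        node = frontier.pop()
        for child, parent in HIERARCHY.items():
            if parent == node and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def lift(category, value, target):
    """Map a value up the hierarchy to `target`; None if it cannot be reached."""
    while category != target:
        if (category, value) not in PARENT_VALUE:
            return None
        value = PARENT_VALUE[(category, value)]
        category = HIERARCHY[category]
    return value


def translate_filter(values, from_cat, to_cat):
    """Translate filter values from the query level to a finer level."""
    if from_cat == to_cat:
        return set(values)
    return {v for (cat, v) in PARENT_VALUE
            if cat == to_cat and lift(to_cat, v, from_cat) in values}


def build_composite(datasets, query_category, query_values):
    wanted = {query_category} | finer_categories(query_category)
    # Select datasets whose metadata mentions a hierarchically related category.
    selected = [d for d in datasets if wanted & d.categories.keys()]
    if len(selected) < 2:
        raise ValueError("need at least two related datasets to join")
    filtered = []
    for d in selected:
        cat = next(c for c in d.categories if c in wanted)
        field = d.categories[cat]
        allowed = translate_filter(query_values, query_category, cat)
        # Apply the translated filter and tag each row with the join key,
        # stored under the query category's name.
        filtered.append([
            {**row, query_category: lift(cat, row[field], query_category)}
            for row in d.rows if row[field] in allowed
        ])
    # Join all filtered datasets on the query-level key.
    return reduce(
        lambda left, right: [{**l, **r} for l in left for r in right
                             if l[query_category] == r[query_category]],
        filtered,
    )


sales = Dataset("sales", {"city": "city_name"}, [
    {"city_name": "Austin", "sales": 10},
    {"city_name": "Dallas", "sales": 7},
    {"city_name": "Boston", "sales": 5},
])
population = Dataset("population", {"state": "state"}, [
    {"state": "Texas", "population": 29},
    {"state": "Massachusetts", "population": 7},
])
weather = Dataset("weather", {"planet": "planet"}, [{"planet": "Earth"}])

composite = build_composite([sales, population, weather], "state", {"Texas"})
```

Here the query is at the `state` level with the value `Texas`. The filter is translated to the `city` level for the `sales` dataset, the unrelated `weather` dataset is never selected, and the two remaining datasets are joined on the state.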

Bibliographic data

  • Publication number: US10678860B1

    Patent type:

  • Publication date: 2020-06-09

    Original format: PDF

  • Applicant/assignee: PALANTIR TECHNOLOGIES INC.

    Application number: US201615282780

  • Inventors: BEN DUFFIELD; PATRICK WOODY; RAHUL MEHTA

    Filing date: 2016-09-30

  • Classification: G06F16/9032; G06F16/248; G06F16/28; G06F16/2455; G06F16/2457

  • Country: US

  • Database entry time: 2022-08-21 11:26:44
