Determining an extraction rule from positive and negative examples

Abstract

The technology disclosed relates to formulating and refining field extraction rules that are used at query time on raw data with a late-binding schema. The field extraction rules identify portions of the raw data, as well as their data types and hierarchical relationships. These extraction rules are executed against very large data sets that are not organized into relational structures and have not been processed by standard extraction or transformation methods. Working from sample events, a focus on primary and secondary example events helps formulate either a single extraction rule spanning multiple data formats, or multiple rules directed to distinct formats. Selection tools mark up the example events to indicate positive examples for the extraction rules, and to identify negative examples so that mistaken value selections are avoided. The extraction rules can be saved for query-time use, and can be incorporated into a data model for sets and subsets of event data.
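
The idea of deriving a rule from positive and negative examples can be illustrated with a small sketch. This is not the patented method. It is a simplified illustration that assumes a regular expression serves as the extraction rule, that the marked-up value is anchored by the literal text immediately before it, and that events are plain key=value log lines. The function names (`candidate_rules`, `extract`, `derive_rule`) and the sample events are hypothetical.

```python
import re
from typing import List, Optional, Tuple


def candidate_rules(event: str, start: int, end: int) -> List[str]:
    """Build candidate regexes from one marked-up value in a primary event.

    Assumption: the value is generalized to digits or non-whitespace, and
    the literal text before it is used as context, shortest context first.
    A value at the very start of the event yields no candidates.
    """
    value = event[start:end]
    value_pattern = r"\d+" if value.isdigit() else r"\S+"
    prefix = event[:start]
    return [
        re.escape(prefix[-width:]) + "(?P<value>" + value_pattern + ")"
        for width in range(1, len(prefix) + 1)
    ]


def extract(rule: str, event: str) -> Optional[str]:
    """Apply a rule to a raw event at query time; None if nothing matches."""
    match = re.search(rule, event)
    return match.group("value") if match else None


def derive_rule(
    primary: str,
    span: Tuple[int, int],
    positives: List[Tuple[str, str]],
    negatives: List[Tuple[str, str]],
) -> Optional[str]:
    """Return the first candidate that extracts every positive example
    and never extracts a value marked as a negative example."""
    start, end = span
    wanted = [(primary, primary[start:end])] + positives
    for rule in candidate_rules(primary, start, end):
        hits_all = all(extract(rule, e) == v for e, v in wanted)
        avoids_bad = all(extract(rule, e) != v for e, v in negatives)
        if hits_all and avoids_bad:
            return rule
    return None


# Primary example: the user selects "200" as the value of the field.
primary = "user=alice status=200 bytes=512"
start = primary.index("200")
span = (start, start + len("200"))

# Secondary example in the same format, with its expected value.
positives = [("user=bob status=404 bytes=1024", "404")]

# Negative example: in this format a loose rule would grab the byte count.
negatives = [("bytes=2048 user=carol status=500", "2048")]

rule = derive_rule(primary, span, positives, negatives)

# The saved rule is applied to raw, unprocessed events at query time.
print(extract(rule, "bytes=99 user=dave status=302"))  # 302
```

In this sketch the shortest contexts (`=` and `s=`) are rejected because they would pull `2048` out of the negative example, so the search settles on a longer context that only matches the status field. A fuller implementation would also need to infer data types and handle multiple formats with separate rules, which the sketch leaves out.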
