
EXTRACTING INFORMATION FROM UNSTRUCTURED DATA AND MAPPING THE INFORMATION TO A STRUCTURED SCHEMA USING THE NAÏVE BAYESIAN PROBABILITY MODEL


Abstract

An “unstructured event parser” analyzes an event in unstructured form and generates a corresponding event in structured form. For a given event token, a mapping phase determines the fields of the structured event schema to which the token could be mapped, along with the probability that the token should be mapped to each of those fields. Particular tokens are then mapped to particular fields of the schema. A “probabilistic mapper” uses the Naïve Bayesian probability model to determine, for a particular token and a particular field, the probability that the token maps to that field. The probabilistic mapper can also be used in a “regular expression creator,” which generates a regex that matches an unstructured event, and in a “parameter file creator,” which helps a user create a parameter file. A parameterized normalized event generator then uses that file to generate a normalized event from an unstructured event.
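
The probabilistic mapping step described in the abstract can be illustrated with a minimal Naïve Bayes sketch. The abstract does not say which token features the mapper uses, how it is trained, or what the schema fields are called, so everything below is an assumption made for illustration: the `ProbabilisticMapper` class, the `type` and `prev_word` features, the Laplace smoothing, and the field names `sourceAddress`, `destinationAddress`, and `destinationPort` are all hypothetical.

```python
from collections import Counter, defaultdict


class ProbabilisticMapper:
    """Toy Naive Bayes mapper from token features to schema fields (hypothetical)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                          # Laplace smoothing constant (assumed)
        self.field_counts = Counter()               # field -> number of training tokens
        self.feature_counts = defaultdict(Counter)  # (field, feature name) -> value counts
        self.feature_values = defaultdict(set)      # feature name -> values seen in training

    def train(self, examples):
        """examples: iterable of (features dict, field name) pairs."""
        for features, field in examples:
            self.field_counts[field] += 1
            for name, value in features.items():
                self.feature_counts[(field, name)][value] += 1
                self.feature_values[name].add(value)

    def probabilities(self, features):
        """Return P(field | features) for every known field, normalized to sum to 1."""
        total = sum(self.field_counts.values())
        scores = {}
        for field, count in self.field_counts.items():
            score = count / total  # prior P(field)
            for name, value in features.items():
                seen = self.feature_counts[(field, name)][value]
                vocab = len(self.feature_values[name]) + 1  # +1 slot for unseen values
                # Smoothed likelihood P(feature = value | field); features are
                # treated as conditionally independent (the "naive" assumption).
                score *= (seen + self.alpha) / (count + self.alpha * vocab)
            scores[field] = score
        norm = sum(scores.values())
        return {field: s / norm for field, s in scores.items()}


# Hypothetical training data: features of tokens whose correct field is known.
training = [
    ({"type": "ip", "prev_word": "from"}, "sourceAddress"),
    ({"type": "ip", "prev_word": "from"}, "sourceAddress"),
    ({"type": "ip", "prev_word": "to"}, "destinationAddress"),
    ({"type": "ip", "prev_word": "to"}, "destinationAddress"),
    ({"type": "number", "prev_word": "port"}, "destinationPort"),
]

mapper = ProbabilisticMapper()
mapper.train(training)

# Score a new token, e.g. the IP address in "... from 10.0.0.1 ...".
probs = mapper.probabilities({"type": "ip", "prev_word": "from"})
best_field = max(probs, key=probs.get)
print(best_field, probs)
```

In this sketch the mapper returns a probability for every candidate field rather than a single answer, which mirrors the abstract's two-step description: candidate fields and their probabilities are determined first, and particular tokens are assigned to particular fields afterwards.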
