System and method for extracting information from text using text annotation and fact extraction

Abstract

A fact extraction tool set (“FEX”) finds and extracts targeted pieces of information from text using linguistic and pattern matching technologies, and in particular, text annotation and fact extraction. Text annotation tools break a text, such as a document, into its base tokens and annotate those tokens or patterns of tokens with orthographic, syntactic, semantic, pragmatic and other attributes. A user-defined “Annotation Configuration” controls which annotation tools are used in a given application. XML is used as the basis for representing the annotated text. A tag uncrossing tool resolves conflicting (crossed) annotation boundaries in an annotated text to produce well-formed XML from the results of the individual annotators. The fact extraction tool is a pattern matching language which is used to write scripts that find and match patterns of attributes that correspond to targeted pieces of information in the text, and extract that information.
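The tag uncrossing step can be illustrated with a minimal sketch. It is not the patented implementation: the abstract does not say how crossed boundaries are resolved, so the splitting strategy here (cut the later annotation where the earlier one closes) is an assumption, as are the function names `uncross` and `to_xml`, the span format `(start, end, label)` over half-open token indices, and the sample labels.

```python
import heapq
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def uncross(spans):
    """Split crossing spans so that every pair is either disjoint or nested.

    spans: iterable of (start, end, label) with half-open token indices
    and start < end. Assumed strategy: when a span would outlive an
    enclosing open span, cut it at that span's end and requeue the rest.
    """
    # Process by start position; for equal starts, longer spans first
    # so that outer annotations open before inner ones.
    heap = [(start, -end, label) for start, end, label in spans]
    heapq.heapify(heap)
    open_ends = []  # end positions of currently open spans, innermost last
    result = []
    while heap:
        start, neg_end, label = heapq.heappop(heap)
        end = -neg_end
        # Drop spans that have already closed at or before this start.
        while open_ends and open_ends[-1] <= start:
            open_ends.pop()
        # Crossing: this span starts inside the innermost open span
        # but ends after it. Keep the inner part, requeue the remainder.
        if open_ends and end > open_ends[-1]:
            cut = open_ends[-1]
            heapq.heappush(heap, (cut, -end, label))
            end = cut
        open_ends.append(end)
        result.append((start, end, label))
    return result


def to_xml(tokens, spans, root="doc"):
    """Serialize tokens plus already-uncrossed spans as an XML string."""
    opens, closes = {}, {}
    for start, end, label in spans:
        opens.setdefault(start, []).append(label)   # outer opens first
        closes.setdefault(end, []).insert(0, label)  # inner closes first
    parts = [f"<{root}>"]
    for i in range(len(tokens) + 1):
        parts.extend(f"</{label}>" for label in closes.get(i, []))
        parts.extend(f"<{label}>" for label in opens.get(i, []))
        if i < len(tokens):
            parts.append(escape(tokens[i]))
    parts.append(f"</{root}>")
    return " ".join(parts)


if __name__ == "__main__":
    tokens = ["Dr.", "John", "Smith", "of", "Acme"]
    # Two hypothetical annotators disagree on boundaries: the spans cross.
    crossed = [(0, 3, "person"), (2, 5, "np")]
    xml_text = to_xml(tokens, uncross(crossed))
    ET.fromstring(xml_text)  # raises ParseError if not well-formed
    print(xml_text)
```

Splitting one annotation into two elements loses the fact that both pieces came from a single annotation. A fuller version might attach a shared identifier attribute to the pieces; that is likewise an assumption, not something the abstract describes.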
