Systems and methods for structural indexing of natural language text

Abstract

A structural natural language index is created by segmenting the documents in a repository into text portions and extracting derived features from each portion: named entities, co-references, lexical entries, structural-semantic relationships, speaker attributions and meronymic relationships. For each text portion, a constituent structure is determined that contains the constituent elements and enough ordering information to reconstruct the text portion. A functional structure of the text portion is also determined, and a set of characterizing predicative triples is formed from it by applying linearization transfer rules. The constituent structure, the characterizing predicative triples and the derived features are combined to form a canonical form of the text portion, and each canonical form is added to the structural natural language index. To answer a question, the retrieved question is classified to determine its question type, and a corresponding canonical form is generated for it. The index is then searched for entries that match the canonical form of the question and are relevant to the question type. The characterizing predicative triples of a matching entry are used in conjunction with a generation grammar to create an answer. If generation fails, some or all of the constituent structure of the matching entry is returned as the answer.
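The flow described in the abstract can be illustrated with a small Python sketch. It is not the patented implementation: every name in it (`CanonicalForm`, `StructuralIndex`, `generate`, `answer`, `PAST_TENSE`) is hypothetical, the parsing and question classification steps are assumed to have already produced the triples and the answer type, and the "generation grammar" is reduced to a one-entry verb table so that the fallback to the constituent structure can be shown.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

# (subject, predicate, object); in a question pattern, "?" marks the unknown slot.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class CanonicalForm:
    """Hypothetical canonical form of one text portion."""
    constituents: tuple   # ordered tokens, enough to rebuild the text portion
    triples: frozenset    # characterizing predicative triples
    features: frozenset   # derived features, e.g. ("PERSON", "Alice")


@dataclass
class StructuralIndex:
    entries: list = field(default_factory=list)

    def add(self, form: CanonicalForm) -> None:
        self.entries.append(form)

    def search(self, pattern: Triple, answer_type: str):
        """Yield (entry, triple) pairs that fit the pattern and the answer type."""
        for entry in self.entries:
            for triple in entry.triples:
                # Every non-wildcard slot of the pattern must match exactly.
                if not all(p == "?" or p == t for p, t in zip(pattern, triple)):
                    continue
                # The value bound to each "?" must carry the expected feature.
                bound = [t for p, t in zip(pattern, triple) if p == "?"]
                if all((answer_type, value) in entry.features for value in bound):
                    yield entry, triple


# Toy stand-in for a generation grammar: past-tense forms of known predicates.
PAST_TENSE = {"write": "wrote"}


def generate(triple: Triple) -> Optional[str]:
    subject, predicate, obj = triple
    verb = PAST_TENSE.get(predicate)
    if verb is None:
        return None  # generation fails for predicates the toy grammar lacks
    return f"{subject} {verb} {obj}."


def answer(index: StructuralIndex, pattern: Triple, answer_type: str) -> Optional[str]:
    for entry, triple in index.search(pattern, answer_type):
        generated = generate(triple)
        if generated is not None:
            return generated
        # Fallback: return the stored constituent structure as the answer.
        return " ".join(entry.constituents)
    return None


index = StructuralIndex()
index.add(CanonicalForm(
    constituents=("Alice", "wrote", "the", "report"),
    triples=frozenset({("Alice", "write", "the report")}),
    features=frozenset({("PERSON", "Alice")}),
))
index.add(CanonicalForm(
    constituents=("Bob", "reviewed", "the", "report", "yesterday"),
    triples=frozenset({("Bob", "review", "the report")}),
    features=frozenset({("PERSON", "Bob")}),
))

# "Who wrote the report?" -> answer built from the matching triple
print(answer(index, ("?", "write", "the report"), "PERSON"))
# "Who reviewed the report?" -> generation fails, constituent text is returned
print(answer(index, ("?", "review", "the report"), "PERSON"))
```

In this sketch the first question is answered from the triple, while the second falls back to the stored tokens because the toy grammar has no entry for "review". A real system would replace the exact-match search with matching over full canonical forms.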