
A computer method for identifying predicate-argument structures in natural language text


Abstract

A computer method is disclosed for determining predicate-argument structures in input prose sentences of English. The input sentence, in the form of a string of words separated by blanks, is first analyzed (parsed) by a rule component that has access only to morphological and syntactic information about the words. The output of this rule component, in the form of a data structure consisting of attribute-value pairs, is then processed by the argument-structure component, which consists of a set of partially ordered procedures that incorporate further linguistic knowledge. The output of these procedures is the same attribute-value structure, now enhanced by the presence of semantic (i.e., meaningful, non-syntactic) attributes. These semantic attributes, taken together, form the argument structure of the input sentence.

The resulting invention constitutes a fully modular, comprehensive and efficient method for passing from syntax to the first stage of semantic processing of natural (human) language. The invention applies to all prose sentences of the language for which it is designed, and not just to a subset of those sentences. It does not use domain-specific semantic information to improve the accuracy or efficiency of the syntactic component. It therefore constitutes an unrestricted broad-coverage method for natural language processing (NLP), as opposed to the restricted methods used in most NLP applications today.

Although the specific rules and procedures will be different for different natural languages, the general concept embodied in this invention is applicable to all natural languages.
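The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented rule set: the attribute names (`head`, `voice`, `by_phrase`, `deep_subject`, `deep_object`), the toy passive-voice rule, and the hard-coded parse are all hypothetical, chosen only to show an attribute-value record being enriched by procedures that run in a partial order. The ordering is handled here with the standard library's `graphlib.TopologicalSorter` (Python 3.9+), which is an assumption of this sketch and not something the abstract specifies.

```python
from graphlib import TopologicalSorter

# Hypothetical stand-in for the syntactic rule component: it returns
# attribute-value pairs holding only morphological/syntactic information.
def toy_parse_passive():
    # "The book was read by Mary"
    return {"head": "read", "voice": "passive",
            "subject": "book", "by_phrase": "Mary"}

def toy_parse_active():
    # "Mary read the book"
    return {"head": "read", "voice": "active",
            "subject": "Mary", "object": "book"}

# Hypothetical argument-structure procedures. Each one adds a semantic
# attribute to the record and leaves the syntactic attributes in place.
def assign_predicate(record):
    record["predicate"] = record["head"]

def assign_deep_subject(record):
    # Toy rule: in a passive clause the agent sits in the by-phrase.
    key = "by_phrase" if record.get("voice") == "passive" else "subject"
    record["deep_subject"] = record.get(key)

def assign_deep_object(record):
    # Toy rule: the surface subject of a passive is the deep object.
    key = "subject" if record.get("voice") == "passive" else "object"
    record["deep_object"] = record.get(key)

def collect_argument_structure(record):
    # Needs the three attributes above, so it must run after them.
    record["argument_structure"] = (
        record["predicate"], record["deep_subject"], record["deep_object"])

PROCEDURES = {
    "predicate": assign_predicate,
    "deep_subject": assign_deep_subject,
    "deep_object": assign_deep_object,
    "collect": collect_argument_structure,
}

# Partial order: each procedure maps to the procedures that must run first.
# The first three are unordered relative to one another.
RUNS_AFTER = {
    "predicate": set(),
    "deep_subject": set(),
    "deep_object": set(),
    "collect": {"predicate", "deep_subject", "deep_object"},
}

def add_argument_structure(record):
    """Return a copy of the record enriched with semantic attributes."""
    enhanced = dict(record)
    for name in TopologicalSorter(RUNS_AFTER).static_order():
        PROCEDURES[name](enhanced)
    return enhanced

if __name__ == "__main__":
    print(add_argument_structure(toy_parse_passive()))
    print(add_argument_structure(toy_parse_active()))
```

In this sketch the active and passive inputs end up with the same `argument_structure` value even though their syntactic attributes differ, which is the kind of normalization a predicate-argument stage is meant to provide.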

Bibliographic data

  • Publication number: EP0413132A3

  • Publication date: 1993-03-31

  • Original format: PDF

  • Applicant/assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

  • Application number: EP19900113224

  • Inventor: JENSEN KAREN

  • Filing date: 1990-07-11

  • Classification: G06F15/38; G06F15/20

  • Country: EP

  • Date indexed: 2022-08-22 05:06:19
