PRODUCING TRAINING SETS FOR MACHINE LEARNING METHODS BY PERFORMING DEEP SEMANTIC ANALYSIS OF NATURAL LANGUAGE TEXTS

Abstract

Systems and methods for producing training sets for machine learning methods by performing deep semantic analysis of natural language texts. An example method comprises: performing a lexico-morphological analysis of a natural language text comprising a plurality of tokens, to determine one or more lexical and grammatical attributes associated with each token of the plurality of tokens, each token comprising at least one natural language word; performing a syntactico-semantic analysis of the natural language text to produce a plurality of syntactico-semantic structures representing the natural language text; determining, using the syntactico-semantic structures, a plurality of syntactic and semantic attributes associated with the natural language text; selecting, among the lexical, grammatical, syntactic and semantic attributes, a set of output attributes; and producing an output text comprising symbolic identifiers of one or more attributes of the output set of attributes, wherein each attribute is associated with a corresponding part of the natural language text.
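
The steps in the abstract can be sketched as a small pipeline. The following is only an illustrative sketch, not the patented implementation: the toy lexicon, the noun-before-verb heuristic standing in for real syntactico-semantic structures, the `word/name=value` output format, and all function and attribute names (`annotate`, `pos`, `syn`, `sem`, and so on) are assumptions introduced here for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical toy lexicon standing in for a real lexico-morphological analyzer.
TOY_LEXICON = {
    "the": {"pos": "DET"},
    "cat": {"pos": "NOUN", "number": "SG"},
    "chases": {"pos": "VERB", "tense": "PRES"},
    "mice": {"pos": "NOUN", "number": "PL"},
}


@dataclass
class Token:
    text: str
    attrs: dict = field(default_factory=dict)


def lexico_morphological_analysis(text):
    """Split on whitespace and attach lexical/grammatical attributes to each token."""
    tokens = []
    for word in text.split():
        attrs = dict(TOY_LEXICON.get(word.lower(), {"pos": "UNK"}))
        tokens.append(Token(word, attrs))
    return tokens


def syntactico_semantic_analysis(tokens):
    """Toy heuristic (assumption): the first noun before the verb is the
    subject/agent, the first noun after it is the object/patient."""
    verb_idx = next(
        (i for i, t in enumerate(tokens) if t.attrs.get("pos") == "VERB"), None
    )
    if verb_idx is None:
        return tokens
    tokens[verb_idx].attrs["syn"] = "PRED"
    for t in tokens[:verb_idx]:
        if t.attrs.get("pos") == "NOUN":
            t.attrs.update(syn="SUBJ", sem="AGENT")
            break
    for t in tokens[verb_idx + 1:]:
        if t.attrs.get("pos") == "NOUN":
            t.attrs.update(syn="OBJ", sem="PATIENT")
            break
    return tokens


def produce_output_text(tokens, output_attrs):
    """Emit each token followed by symbolic identifiers of the selected attributes."""
    parts = []
    for t in tokens:
        tags = [f"{name}={t.attrs[name]}" for name in output_attrs if name in t.attrs]
        parts.append(f"{t.text}/{'|'.join(tags)}" if tags else t.text)
    return " ".join(parts)


def annotate(text, output_attrs):
    """Run both analyses, then keep only the selected output attributes."""
    tokens = lexico_morphological_analysis(text)
    tokens = syntactico_semantic_analysis(tokens)
    return produce_output_text(tokens, output_attrs)


if __name__ == "__main__":
    print(annotate("The cat chases mice", ["pos", "sem"]))
```

Selecting a different set of output attributes, for example `["number", "syn"]`, yields a different annotated text from the same analysis, which is how one analyzed corpus could supply training sets for several downstream learning tasks.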
