USING NATURAL LANGUAGE PROCESSING (NLP) TO CREATE SUBJECT MATTER SYNONYMS FROM DEFINITIONS

Abstract

Methods, apparatus and systems, including computer program products, for creating subject matter synonyms from definitions extracted from a subject matter glossary. Confidence scores, each representing a likelihood that two terms defined in the subject matter glossary are synonyms, are determined by applying natural language processing (e.g., passage term matching, lexical matching, and syntactic matching) to the extracted definitions. A subject matter thesaurus is built based on the confidence scores. In one embodiment, a statement containing a first term is created based on an extracted definition of the first term, a modified statement is created by substituting a second term in the statement in lieu of the first term, a corpus is searched, and a confidence score is determined based on evidence in the corpus that the modified statement is accurate. The first and second terms are marked as synonyms if the confidence score is greater than a threshold.
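
The embodiment described above (build a statement from a definition, substitute the second term, search a corpus, score the evidence, compare against a threshold) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names, the toy glossary and corpus, the "term is definition" statement template, the 0.6 threshold, and above all the scoring, which uses plain token overlap as a stand-in for the passage term matching, lexical matching, and syntactic matching the abstract mentions.

```python
import re
from itertools import permutations


def tokens(text):
    """Lowercase word tokens; a stand-in for real NLP preprocessing."""
    return re.findall(r"[a-z0-9]+", text.lower())


def make_statement(term, definition):
    """Turn a glossary entry into a statement (hypothetical template)."""
    return f"{term} is {definition}"


def substitute(statement, first_term, second_term):
    """Replace the first term with the second to form the modified statement."""
    return statement.replace(first_term, second_term, 1)


def confidence(modified_statement, second_term, corpus):
    """Toy evidence score in [0, 1]: the best token overlap between the
    modified statement (minus the term's own tokens) and any corpus
    passage that mentions the second term."""
    term_toks = set(tokens(second_term))
    stmt_toks = set(tokens(modified_statement)) - term_toks
    if not stmt_toks:
        return 0.0
    best = 0.0
    for passage in corpus:
        passage_toks = set(tokens(passage))
        if not term_toks <= passage_toks:
            continue  # passage does not mention the second term
        best = max(best, len(stmt_toks & passage_toks) / len(stmt_toks))
    return best


def build_thesaurus(glossary, corpus, threshold=0.6):
    """Mark two terms as synonyms when the score exceeds the threshold."""
    thesaurus = {term: set() for term in glossary}
    for first, second in permutations(glossary, 2):
        statement = make_statement(first, glossary[first])
        modified = substitute(statement, first, second)
        if confidence(modified, second, corpus) > threshold:
            thesaurus[first].add(second)
            thesaurus[second].add(first)
    return thesaurus


# Illustrative data only; not from the patent.
glossary = {
    "myocardial infarction": "death of heart muscle caused by blocked blood flow",
    "heart attack": "damage to heart muscle when blood flow is blocked",
    "fracture": "a break in a bone",
}
corpus = [
    "A heart attack is the death of heart muscle caused by blocked blood flow.",
    "A myocardial infarction is damage to heart muscle when blood flow is blocked.",
    "A fracture is a break in a bone, often caused by a fall.",
]

thesaurus = build_thesaurus(glossary, corpus)
for term, synonyms in thesaurus.items():
    print(term, "->", sorted(synonyms))
```

With this toy data, only "myocardial infarction" and "heart attack" end up paired, because the corpus contains passages supporting each term under the other term's definition. A real system would presumably replace `confidence` with stronger evidence scoring and tune the threshold on labeled examples.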
