Natural language relatedness tool using mined semantic analysis

Abstract

Mined semantic analysis (MSA) techniques include generating, from a natural language (NL) corpus, a first subset of concepts that are latently associated with an NL candidate term, based on (i) a second subset of concepts from the corpus that are explicitly or implicitly associated with the candidate term and (ii) a set of concept association rules. The concept association rules are mined from a transaction dictionary constructed from the corpus and defining discovered latent associations between corpus concepts. The concept space of a candidate term includes at least portions of both the first and second subsets of concepts, and indicates the relationships between each latently associated concept and the explicitly or implicitly associated concepts from which it was derived. Measures of relatedness between candidate terms are determined deterministically from their respective concept spaces. Example corpora include digital corpora such as encyclopedias, journals, intellectual property datasets, health-care-related datasets and records, and financial-sector-related datasets and records.
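
The pipeline in the abstract (mine rules from a transaction dictionary, expand a term's explicit concepts with latent ones, then compare concept spaces) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the toy `TRANSACTIONS` data, the function names, the single-concept rules filtered by support and confidence, the weighting of latent concepts by rule confidence, and cosine similarity as the relatedness measure.

```python
from collections import defaultdict
from itertools import permutations
from math import sqrt

# Hypothetical transaction dictionary: each transaction is a set of concepts
# that the corpus links together (e.g. one article's cross-references).
TRANSACTIONS = [
    {"Neural network", "Machine learning", "Statistics"},
    {"Machine learning", "Statistics", "Data mining"},
    {"Neural network", "Machine learning"},
    {"Data mining", "Database"},
]


def mine_rules(transactions, min_support=2, min_confidence=0.5):
    """Mine single-concept rules a -> b; returns {a: {b: confidence}}."""
    single = defaultdict(int)
    pair = defaultdict(int)
    for t in transactions:
        for c in t:
            single[c] += 1
        for a, b in permutations(sorted(t), 2):
            pair[(a, b)] += 1
    rules = defaultdict(dict)
    for (a, b), n in pair.items():
        confidence = n / single[a]
        if n >= min_support and confidence >= min_confidence:
            rules[a][b] = confidence
    return rules


def concept_space(explicit, rules):
    """Expand explicit concepts {concept: weight} with latent ones.

    Returns {concept: (weight, source)}; source is None for an explicit
    concept, or the explicit concept a latent one was derived from.
    """
    space = {c: (w, None) for c, w in explicit.items()}
    for c, w in sorted(explicit.items()):  # sorted for a deterministic result
        for latent, confidence in sorted(rules.get(c, {}).items()):
            if latent in explicit:
                continue
            candidate = w * confidence  # assumed weighting scheme
            if latent not in space or candidate > space[latent][0]:
                space[latent] = (candidate, c)
    return space


def relatedness(space_a, space_b):
    """Cosine similarity over concept weights (an assumed measure)."""
    dot = sum(w * space_b[c][0] for c, (w, _) in space_a.items() if c in space_b)
    norm_a = sqrt(sum(w * w for w, _ in space_a.values()))
    norm_b = sqrt(sum(w * w for w, _ in space_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


rules = mine_rules(TRANSACTIONS)
space_a = concept_space({"Neural network": 1.0}, rules)  # first candidate term
space_b = concept_space({"Statistics": 1.0}, rules)      # second candidate term
print(space_a)
print(space_b)
print(relatedness(space_a, space_b))
```

In this toy run the two candidate terms share no explicit concept, so a comparison over explicit concepts alone would score zero. The score becomes nonzero only because both concept spaces gain "Machine learning" as a latent concept, and each latent entry records the explicit concept it was derived from.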
