
Discovering terms using statistical corpus analysis


Abstract

Software that extracts contextually relevant terms from a text sample (or corpus) by performing the following steps: (i) identifying a first term from a corpus, based, at least in part, on a set of initial contextual characteristic(s), where each initial contextual characteristic of the set of initial contextual characteristic(s) relates to the contextual use of at least one category related term of a set of category related term(s) in the corpus; (ii) adding the first term to the set of category related term(s), thereby creating a revised set of category related term(s) and a set of first term contextual characteristic(s), where each first term contextual characteristic of the set of first term contextual characteristic(s) relates to the contextual use of the first term in the corpus; and (iii) identifying a second term from the corpus, based, at least in part, on the set of first term contextual characteristic(s).
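The three steps above describe an iterative, bootstrapping style of term discovery. The following is a minimal sketch of that idea, not the patented implementation. It assumes that a "contextual characteristic" is simply the pair of words immediately before and after a term, and that the best candidate is the unknown word that most often fills a known context. The function names `contexts_of` and `discover_terms`, the `rounds` parameter, and the toy corpus are all hypothetical.

```python
from collections import Counter


def contexts_of(term, tokens):
    """Return the (previous word, next word) pairs around each occurrence of term.

    Assumption: a one-word window on each side stands in for the
    "contextual characteristics" mentioned in the abstract.
    """
    return {
        (tokens[i - 1], tokens[i + 1])
        for i in range(1, len(tokens) - 1)
        if tokens[i] == term
    }


def discover_terms(tokens, seed_terms, rounds=2):
    """Grow a term set by one term per round, using contexts of known terms."""
    terms = list(seed_terms)

    # Initial contextual characteristics: contexts in which the seed terms occur.
    active_contexts = set()
    for term in terms:
        active_contexts |= contexts_of(term, tokens)

    for _ in range(rounds):
        # Count how often each unknown word appears inside a known context.
        scores = Counter(
            tokens[i]
            for i in range(1, len(tokens) - 1)
            if tokens[i] not in terms
            and (tokens[i - 1], tokens[i + 1]) in active_contexts
        )
        if not scores:
            break  # no further candidates share a known context

        # Add the best candidate to the term set.
        new_term, _count = scores.most_common(1)[0]
        terms.append(new_term)

        # The new term's own contexts feed the next round of discovery.
        active_contexts |= contexts_of(new_term, tokens)

    return terms


corpus = (
    "patient has fever today . patient has cough today . "
    "severe cough reported . severe nausea reported ."
)
print(discover_terms(corpus.split(), ["fever"]))
# → ['fever', 'cough', 'nausea']
```

In this toy run, "cough" is found because it shares the context ("has", "today") with the seed term "fever". "nausea" is then found only through the context ("severe", "reported") contributed by "cough", which mirrors how step (iii) relies on the contextual characteristics of the first discovered term.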

Bibliographic details

  • Publication number: US10592605B2

  • Publication date: 2020-03-17

  • Original document format: PDF

  • Applicant/patentee: INTERNATIONAL BUSINESS MACHINES CORPORATION

  • Application number: US201514722984

  • Inventors: JITENDRA AJMERA; ANKUR PARIKH

  • Filing date: 2015-05-27

  • Classification: G06F16/27; G06F17/27; G06F16/35; G06F16/34; G06F16/33

  • Country: US

  • Date added to database: 2022-08-21 11:30:31
