Retrieving Text from a Corpus of Documents in an Information Handling System

Abstract

A mechanism is provided for retrieving candidate answers from a corpus of documents. The mechanism receives an input question for which an answer is sought and extracts features of the input question using natural language processing. The mechanism executes a first search of the corpus of documents, based on a first subset of the extracted features and an initial evaluation of the utility of that first subset, to generate a subset of documents. The mechanism then executes a second search of a set of passages extracted from the subset of documents, based on a second subset of the extracted features and a reevaluation of the utility of that second subset, thereby forming a subset of passages. Finally, the mechanism generates query results from the matching subset of passages, from which candidate answers are identified.
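
The two-stage flow described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the stop-word filter stands in for natural language feature extraction, an inverse-document-frequency score stands in for the "utility" evaluation, sentences stand in for passages, and the names `extract_features`, `utility`, `top_features`, `search`, and `retrieve` are hypothetical.

```python
import math
import re

# Assumed stop-word list; a real system would use a full NLP pipeline.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "what", "which",
             "who", "was", "for", "to", "and"}


def tokenize(text):
    """Lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_features(question):
    """Stand-in for NLP feature extraction: unique non-stop-word tokens."""
    return [t for t in dict.fromkeys(tokenize(question)) if t not in STOPWORDS]


def utility(features, texts):
    """Assumed utility measure: smoothed IDF of each feature over `texts`.

    A feature that occurs in no text, or in every text, scores 0.
    """
    token_sets = [set(tokenize(t)) for t in texts]
    n = len(texts)
    scores = {}
    for feature in features:
        df = sum(1 for tokens in token_sets if feature in tokens)
        scores[feature] = math.log((n + 1) / (df + 1)) if df else 0.0
    return scores


def top_features(scores, k):
    """Keep the k features with the highest positive utility."""
    ranked = sorted(scores.items(), key=lambda kv: -kv[1])
    return [f for f, s in ranked[:k] if s > 0]


def search(features, weights, texts, top_k):
    """Return indices of the top_k texts, ranked by summed feature weight."""
    scored = []
    for idx, text in enumerate(texts):
        tokens = set(tokenize(text))
        score = sum(weights[f] for f in features if f in tokens)
        if score > 0:
            scored.append((score, idx))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:top_k]]


def retrieve(question, corpus, n_docs=2, n_passages=2, n_features=3):
    """Two-stage retrieval: documents first, then passages within them."""
    features = extract_features(question)

    # Stage 1: initial utility over whole documents, first feature subset.
    doc_utility = utility(features, corpus)
    first_subset = top_features(doc_utility, n_features)
    doc_ids = search(first_subset, doc_utility, corpus, n_docs)

    # Stage 2: utility re-evaluated over passages, second feature subset.
    passages = [p.strip() for i in doc_ids
                for p in corpus[i].split(".") if p.strip()]
    passage_utility = utility(features, passages)
    second_subset = top_features(passage_utility, n_features)
    hits = search(second_subset, passage_utility, passages, n_passages)
    return [passages[i] for i in hits]


if __name__ == "__main__":
    sample_corpus = [
        "Paris is the capital of France. The Eiffel Tower was completed in 1889.",
        "Berlin is the capital of Germany. The Berlin Wall fell in 1989.",
        "Bananas are rich in potassium. They grow in tropical climates.",
    ]
    for passage in retrieve("What is the capital of France?", sample_corpus):
        print(passage)
```

Recomputing the utility scores over the passage set, rather than reusing the document-level scores, is the part of the sketch that mirrors the "reevaluation" step: a feature that discriminates well between documents may discriminate poorly between passages, so the second feature subset can differ from the first.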

Bibliographic Data

Publication No.: US2016055234A1

Patent type:
Publication date: 2016-02-25

Original format: PDF
Applicant/Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Application No.: US201414462662
Inventors: WILLIAM G. VISOTSKI; DAVID E. WILSON

Filing date: 2014-08-19
Classification: G06F17/30; G06N7/00; G06N5/04; G06N99/00
Country: US
Database entry time: 2022-08-21 14:35:13
