APPARATUS AND METHOD FOR SEARCHING INFORMATION BASED ON WIKIPEDIA'S CONTENTS

Abstract

The present invention relates to an apparatus and a method for searching information based on Wikipedia content. An embodiment of the present invention provides an information searching apparatus based on Wikipedia content, comprising: a document conversion unit that extracts a body document, a section title document, an info-box document, a category document, and a definition statement document from the original Wikipedia text to generate one or more Wikipedia documents for answering queries; a document indexing unit that analyzes those query-answering Wikipedia documents, extracts index terms from them in word-class units, and generates an index of the documents; a query analyzing unit that receives a natural language query, analyzes its query type, answer type, and query focus, and extracts document search keywords; a document searching unit that searches the query-answering Wikipedia documents using the document search keywords and generates a document search result from the index; an answer extracting unit that extracts a first answer from the document search result using the query type, answer type, and query focus information; and an answer integration unit that integrates and ranks the first answer to generate a second answer.
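
The six units named in the abstract can be pictured as a small pipeline. The following Python sketch is only an illustration under stated assumptions, not the patented implementation: all function names, the stopword list, the question-word-to-answer-type mapping, the regular-expression answer patterns, and the match-count scoring are hypothetical, and plain lowercase tokenization stands in for the word-class analysis the abstract mentions.

```python
import re
from collections import Counter, defaultdict
from dataclasses import dataclass

# All names and rules below are illustrative assumptions, not taken from the patent.
STOPWORDS = {"a", "an", "the", "is", "was", "in", "by", "of", "and",
             "who", "when", "where", "what"}
ANSWER_TYPES = {"who": "PERSON", "when": "DATE", "where": "LOCATION"}
# Toy candidate patterns: four-digit years for DATE, two capitalized words for PERSON.
PATTERNS = {"DATE": r"\b\d{4}\b", "PERSON": r"\b[A-Z][a-z]+ [A-Z][a-z]+\b"}

@dataclass
class QADocument:
    doc_id: str
    kind: str   # body, section_title, infobox, category, or definition
    text: str

def tokenize(text):
    """Lowercase word tokens minus stopwords (stand-in for word-class analysis)."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOPWORDS]

def convert(title, parts):
    """Document conversion unit: one typed document per non-empty article part."""
    return [QADocument(f"{title}#{kind}", kind, text)
            for kind, text in parts.items() if text]

def build_index(docs):
    """Document indexing unit: inverted index from term to document ids."""
    index = defaultdict(set)
    for doc in docs:
        for term in tokenize(doc.text):
            index[term].add(doc.doc_id)
    return index

def analyze_query(question):
    """Query analyzing unit: crude query type, answer type, focus, and keywords."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    wh = next((w for w in words if w in ANSWER_TYPES), None)
    keywords = tokenize(question)
    return {"query_type": "factoid" if wh else "other",
            "answer_type": ANSWER_TYPES.get(wh, "OTHER"),
            "focus": keywords[0] if keywords else None,
            "keywords": keywords}

def search(index, keywords):
    """Document searching unit: score each document by number of matched keywords."""
    scores = Counter()
    for kw in keywords:
        for doc_id in index.get(kw, ()):
            scores[doc_id] += 1
    return scores.most_common()

def extract_answers(hits, docs_by_id, analysis):
    """Answer extracting unit: first answers are pattern matches in the hit documents."""
    pattern = PATTERNS.get(analysis["answer_type"])
    if pattern is None:
        return []
    return [(match, score) for doc_id, score in hits
            for match in re.findall(pattern, docs_by_id[doc_id].text)]

def integrate(candidates):
    """Answer integration unit: merge identical answers, sum scores, rank."""
    merged = Counter()
    for answer, score in candidates:
        merged[answer] += score
    return merged.most_common()

def answer_question(question, docs):
    index = build_index(docs)
    docs_by_id = {d.doc_id: d for d in docs}
    analysis = analyze_query(question)
    hits = search(index, analysis["keywords"])
    return integrate(extract_answers(hits, docs_by_id, analysis))

docs = convert("Wikipedia", {
    "definition": "Wikipedia is a free online encyclopedia.",
    "body": "Wikipedia was launched in 2001 by Jimmy Wales and Larry Sanger.",
    "section_title": "History",
    "infobox": "launched: 2001",
    "category": "Online encyclopedias",
})
ranked = answer_question("When was Wikipedia launched?", docs)
print(ranked)  # [('2001', 3)]
```

In this toy run the same candidate is found in both the body document and the info-box document, so the integration step merges the two and sums their scores, which is one simple way to read "integrating and ranking" the first answers. A real system would presumably use morphological analysis, a weighted retrieval model, and much richer answer-type rules.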
