Conference: 2011 Fifth IEEE International Conference on Semantic Computing

Retrieval of Patent Documents from Heterogeneous Sources Using Ontologies and Similarity Analysis



Abstract

In the past few years, there has been explosive growth in scientific and legal information related to the patent system. Patents and related documents are siloed in multiple heterogeneous sources. Retrieving relevant information from these diverse sources is a non-trivial task that poses many technical challenges. One of these challenges is the inconsistent terminology used across documents. We tackle this terminological inconsistency by exploring domain knowledge through the use of ontology standards. Furthermore, we take advantage of cross-references and structural dependencies between the information sources to enhance terminological comparison. In this paper, we present a similarity analysis methodology that combines knowledge from two distinct sources, (1) domain ontologies and (2) ontologies that describe the information sources, to assist a user in identifying relevant documents across several information sources simultaneously. Specifically, we explore the use of a rule-based system to infer relationships between documents based on pre-defined heuristics. We present our results through a use case in the bio-patent domain with a collection of 1150 patents and 30 court cases.
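The two ideas named in the abstract, ontology-based term normalization and rule-based inference over cross-references, can be illustrated with a minimal sketch. This is not the authors' system: the `SYNONYMS` table, the document dictionaries with `id`, `terms`, and `cites` fields, the Jaccard measure, and the 0.5 threshold are all hypothetical choices made here for illustration only.

```python
# Hypothetical toy "ontology": maps surface terms to one canonical concept.
# A real system would load this from a domain ontology instead.
SYNONYMS = {
    "epo": "erythropoietin",
    "erythropoietin": "erythropoietin",
    "dna": "deoxyribonucleic acid",
    "deoxyribonucleic acid": "deoxyribonucleic acid",
}


def normalize(terms):
    """Lower-case each term and replace it with its canonical concept, if known."""
    return {SYNONYMS.get(t.lower(), t.lower()) for t in terms}


def jaccard(terms_a, terms_b):
    """Jaccard similarity of two term lists after ontology normalization."""
    a, b = normalize(terms_a), normalize(terms_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def related(doc_a, doc_b, threshold=0.5):
    """Assumed heuristic: documents are related if one cites the other,
    or if their normalized term overlap reaches the threshold."""
    if doc_b["id"] in doc_a["cites"] or doc_a["id"] in doc_b["cites"]:
        return True
    return jaccard(doc_a["terms"], doc_b["terms"]) >= threshold


patent = {"id": "P1", "terms": ["EPO", "cell"], "cites": []}
case_same_topic = {"id": "C1", "terms": ["erythropoietin", "cell"], "cites": []}
case_citing = {"id": "C2", "terms": ["damages"], "cites": ["P1"]}
case_unrelated = {"id": "C3", "terms": ["DNA"], "cites": []}

print(related(patent, case_same_topic))  # term match after normalization
print(related(patent, case_citing))      # match via cross-reference rule
print(related(patent, case_unrelated))   # no match
```

In this sketch, "EPO" in a patent and "erythropoietin" in a court case count as the same concept only because of the normalization step, and the citation rule links two documents from different sources even when their vocabularies do not overlap.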
