Computer Aided Patent Processing: Natural Language Processing, Machine Learning, and Information Retrieval

Abstract

The intellectual property economy, and, more narrowly, the patent economy, form a wide-reaching and important part of the economic activity of the United States and of the broader world. The patent ecosystem involves a diverse collection of players and interests. Inventors conceive of an idea; patent agents and attorneys then help them author, defend, and refine a patent application, interacting with the patent examiners at the patent offices who review it for novelty, prior art, and usefulness (or industrial applicability). Once the patent is issued, and sometimes while it is still in the application phase, intermediary companies may buy and sell it. A company owning a patent may then request that other companies license its use. Standards organizations, such as ETSI 3GPP or IEEE, are also important participants in the lifecycle of many patents: certain patents are often required in order to implement a standard, so license agreement structures, such as the FRAND (fair, reasonable, and non-discriminatory) agreement, are set up to coordinate their licensing. Finally, if a company decides that another company is likely infringing on its patent, it may bring a case, which the federal courts must then hear.

Today, each of these tasks is carried out by humans, with minimal, nearly nonexistent use of custom software or algorithmic tools designed specifically to help with them. The work carried out in this dissertation is part of a broader intellectual agenda that aims to address this deficiency. In order to maximize novelty, this dissertation selects two problems that are especially overlooked. The first is extracting deep meaning from patent claims. The second is using deep meaning in recommender systems that map between patents and standards.

Furthermore, in order to maximize impact and to kickstart further research on these problems, the work in this dissertation focuses on the creation and curation of datasets germane to them, as well as the creation and evaluation of prototype baseline systems to solve them. In particular, a dataset of grammar annotations of patent claims is first curated via Amazon Mechanical Turk. A baseline natural language processing system for extracting deep meaning from patent claims is then trained and evaluated on this dataset, showing that substantial improvements over existing techniques are possible by leveraging the new dataset. Next, two ground truth datasets associating patent claims with sections of standards are extracted and curated from information provided in intellectual property rights (IPR) disclosures to the European Telecommunications Standards Institute (ETSI). Following that, a new machine learning based retrieval system is designed for mapping between patent claims and standards. Finally, this retrieval system is evaluated on both datasets and their subsets, and it shows substantial improvements over SVM and retrieval baseline systems.

Bibliographic details

  • Author

    Hu, Mengke

  • Author affiliation

    Drexel University

  • Degree-granting institution: Drexel University
  • Subjects: Electrical engineering; Information science; Artificial intelligence; Intellectual property
  • Degree: Ph.D.
  • Year: 2017
  • Pagination: 153 p.
  • Total pages: 153
  • Original format: PDF
  • Language of text: English
