Conference: Annual Hawaii International Conference on System Sciences
Term Extraction and Disambiguation for Semantic Knowledge Enrichment: A Case Study on Initial Public Offering (IPO) Prospectus Corpus

Abstract

Domain knowledge bases are a foundation for advanced knowledge-based systems, but manually creating a formal knowledge base for a given domain is both resource-consuming and non-trivial. In this paper, we propose an approach that supports extracting, selecting, and disambiguating terms embedded in domain-specific documents. The extracted terms are later used to enrich existing ontologies and taxonomies, as well as to bridge a domain-specific knowledge base with a generic knowledge base such as WordNet. The proposed approach addresses two major issues in term extraction, namely quality and efficiency. It also adopts a feature-based method that assists in topic extraction and integration with existing ontologies in the given domain. The approach is realized in a research prototype, and a case study is conducted to illustrate the feasibility and efficiency of the method in the finance domain. A preliminary empirical validation by domain experts is also conducted to assess the accuracy of the approach. The results of the case study indicate the advantages and potential of the proposed approach.
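The abstract names three steps (extract, select, disambiguate) without specifying the algorithms behind them. The following is only a minimal sketch of how such a pipeline could be arranged, not the authors' method: candidate extraction is approximated by counting stopword-free n-grams, selection by a frequency threshold, and disambiguation by a simplified Lesk-style gloss overlap. The sample text, the stopword list, the function names, and the tiny sense inventory (standing in for a resource such as WordNet) are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical stopword list, kept tiny for illustration.
STOPWORDS = {"the", "of", "a", "an", "in", "and", "to", "is", "are",
             "for", "by", "on", "at", "from"}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def candidate_terms(text, max_len=2):
    """Extract step: count n-grams (1..max_len words) with no stopwords."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_len + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if not any(tok in STOPWORDS for tok in gram):
                counts[" ".join(gram)] += 1
    return counts

def select_terms(counts, min_freq=2):
    """Select step: keep candidates seen at least min_freq times."""
    return sorted(term for term, c in counts.items() if c >= min_freq)

def disambiguate(term, context, inventory):
    """Disambiguate step (simplified Lesk): choose the sense whose gloss
    shares the most non-stopword tokens with the context."""
    context_words = set(tokenize(context)) - STOPWORDS
    senses = inventory[term]

    def overlap(sense_id):
        gloss_words = set(tokenize(senses[sense_id])) - STOPWORDS
        return len(gloss_words & context_words)

    # Sorting first makes tie-breaking deterministic.
    return max(sorted(senses), key=overlap)

# Hypothetical prospectus-like text and toy sense inventory.
TEXT = ("The offering price of the shares is set by the underwriter. "
        "The underwriter buys the shares from the issuer at the offering price.")

INVENTORY = {
    "offering": {
        "offering.finance": "the sale of shares or securities to investors at a stated price",
        "offering.religion": "a gift or contribution presented in worship",
    }
}

terms = select_terms(candidate_terms(TEXT))
sense = disambiguate("offering", TEXT, INVENTORY)
```

In a fuller setting, the selected terms and their chosen senses would be the inputs to the ontology-enrichment step the abstract describes; that step is not sketched here because the abstract gives no detail on it.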