
Analyzing Semi-structured Data For Ontological Information Extraction

Abstract

The goal of the WebOntEx (Web Ontology Extraction) project is to extract Web ontologies semi-automatically by analyzing Web pages that belong to the same application domain. The ontology is treated as a complete schema of the application domain's concepts as they are used in the analyzed Web pages. The concepts are classified into entity types, relationships, attributes, and superclass/subclass hierarchies, and are stored in a relational database so that they can evolve over time. This paper describes the WebOntEx project, its architecture, and its main component module. We use machine-learning techniques, in particular inductive logic programming, in the WebOntEx Heuristic Analyzer module. The extracted ontologies can be used in a range of applications, such as understanding Web information content, querying Web meta-data, more intelligent Web searching, and converting HTML Web pages to other formats, such as XML.
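As a rough illustration of how the four concept categories named in the abstract (entity types, attributes, relationships, and superclass/subclass links) might be held in a relational database, the sketch below uses Python's built-in `sqlite3` module. The abstract does not give the WebOntEx schema, so every table name, column name, and helper function here is an assumption, and the university-style sample data (`Person`, `Student`, `Course`) is purely hypothetical.

```python
import sqlite3

# NOTE: hypothetical schema for illustration only; not the WebOntEx schema.
SCHEMA = """
CREATE TABLE entity_type (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE attribute (
    id        INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL REFERENCES entity_type(id),
    name      TEXT NOT NULL,
    UNIQUE (entity_id, name)
);
CREATE TABLE relationship (
    id        INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    source_id INTEGER NOT NULL REFERENCES entity_type(id),
    target_id INTEGER NOT NULL REFERENCES entity_type(id)
);
CREATE TABLE subclass_of (
    subclass_id   INTEGER NOT NULL REFERENCES entity_type(id),
    superclass_id INTEGER NOT NULL REFERENCES entity_type(id),
    PRIMARY KEY (subclass_id, superclass_id)
);
"""


def create_ontology_store(conn):
    """Create the four (assumed) tables in an open SQLite connection."""
    conn.executescript(SCHEMA)


def add_entity(conn, name):
    """Insert an entity type if it is new and return its row id."""
    conn.execute("INSERT OR IGNORE INTO entity_type (name) VALUES (?)", (name,))
    row = conn.execute(
        "SELECT id FROM entity_type WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def superclasses(conn, name):
    """Return every direct or transitive superclass of `name`, sorted by name."""
    rows = conn.execute(
        """
        WITH RECURSIVE ancestors(id) AS (
            SELECT s.superclass_id
            FROM subclass_of s
            JOIN entity_type e ON e.id = s.subclass_id
            WHERE e.name = ?
            UNION
            SELECT s.superclass_id
            FROM subclass_of s
            JOIN ancestors a ON s.subclass_id = a.id
        )
        SELECT e.name
        FROM ancestors a
        JOIN entity_type e ON e.id = a.id
        ORDER BY e.name
        """,
        (name,),
    ).fetchall()
    return [r[0] for r in rows]


def build_example():
    """Populate an in-memory store with hypothetical sample concepts."""
    conn = sqlite3.connect(":memory:")
    create_ontology_store(conn)
    person = add_entity(conn, "Person")
    student = add_entity(conn, "Student")
    grad = add_entity(conn, "GraduateStudent")
    course = add_entity(conn, "Course")
    conn.execute("INSERT INTO subclass_of VALUES (?, ?)", (student, person))
    conn.execute("INSERT INTO subclass_of VALUES (?, ?)", (grad, student))
    conn.execute(
        "INSERT INTO attribute (entity_id, name) VALUES (?, ?)", (course, "title")
    )
    conn.execute(
        "INSERT INTO relationship (name, source_id, target_id) VALUES (?, ?, ?)",
        ("enrolls_in", student, course),
    )
    return conn
```

Keeping the hierarchy in its own `subclass_of` table, rather than as a column on `entity_type`, is one possible design choice: it lets new superclass links be added or removed as the ontology evolves without rewriting existing entity rows.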
