Architecture of WebOntEx: A System For Semi-Automatic Extraction Of Ontologies From Web Pages

Abstract

The Internet has been developing rapidly, and the amount of information on it is growing tremendously. Managing this large and growing body of data, and discovering valuable information in it, requires incrementally extracting and managing metadata on the Web. Hence, there is a need to extract the conceptual structure (i.e., ontologies) of Web data and to build a meta-database. The goal of the WebOntEx (Web Ontology Extraction) project is to extract Web ontologies semi-automatically by analyzing Web pages that are in the same application domain and to convert each extracted ontology to an XML DTD (Document Type Definition). The extracted ontologies can be used in various important applications, such as understanding Web information content, querying Web metadata, more intelligent Web searching, and converting unstructured or semistructured HTML Web pages to other formats, such as XML. The ontology is considered a complete schema of the application domain concepts as they are used in the analyzed Web pages. The concepts are classified into entity types, relationships, attributes, and superclass/subclass hierarchies, and are stored in a relational database to allow them to evolve over time. We utilize machine-learning techniques, in particular inductive logic programming, in the WebOntEx Heuristic Analyzer module. This paper describes the WebOntEx project and its architecture.
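
The abstract does not spell out how an extracted ontology is mapped to a DTD, so the following is only a minimal sketch under stated assumptions, not the WebOntEx method. It assumes that each entity type becomes an element, that each attribute becomes a `#PCDATA` child element, that a subclass repeats the attributes of its superclass, and that a relationship is rendered as a repeatable child reference to the related entity type. The names `EntityType`, `inherited_attributes`, and `to_dtd` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EntityType:
    """One ontology concept. All field names are illustrative assumptions."""
    name: str
    attributes: list[str] = field(default_factory=list)
    superclass: Optional[str] = None                   # parent entity type, if any
    related: list[str] = field(default_factory=list)   # related entity types


def inherited_attributes(entity: EntityType,
                         by_name: dict[str, EntityType]) -> list[str]:
    """Return attributes from the topmost superclass down to the entity itself."""
    chain: list[EntityType] = []
    seen: set[str] = set()
    current: Optional[EntityType] = entity
    while current is not None and current.name not in seen:
        seen.add(current.name)          # guards against cyclic hierarchies
        chain.append(current)
        parent = current.superclass
        current = by_name.get(parent) if parent else None
    attrs: list[str] = []
    for e in reversed(chain):           # superclass attributes come first
        for a in e.attributes:
            if a not in attrs:
                attrs.append(a)
    return attrs


def to_dtd(entities: list[EntityType]) -> str:
    """Emit one ELEMENT declaration per entity type and per distinct attribute."""
    by_name = {e.name: e for e in entities}
    lines: list[str] = []
    leaves: list[str] = []              # attribute elements, in first-seen order
    for e in entities:
        attrs = inherited_attributes(e, by_name)
        for a in attrs:
            if a not in leaves:
                leaves.append(a)
        children = attrs + [f"{r}*" for r in e.related]
        content = "(" + ", ".join(children) + ")" if children else "EMPTY"
        lines.append(f"<!ELEMENT {e.name} {content}>")
    # A DTD may declare each element name only once.
    lines += [f"<!ELEMENT {a} (#PCDATA)>" for a in leaves]
    return "\n".join(lines)


if __name__ == "__main__":
    ontology = [
        EntityType("Person", ["name", "email"]),
        EntityType("Student", ["major"], superclass="Person", related=["Course"]),
        EntityType("Course", ["title"]),
    ]
    print(to_dtd(ontology))
```

Flattening inherited attributes into each subclass element is one simple way to express a superclass/subclass hierarchy in a DTD, which has no inheritance construct of its own. The sketch also assumes that attribute names and entity type names do not collide.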
