Journal: Computer Science and Information Systems

Research on Discovering Deep Web Entries

Abstract

Ontology plays an important role in locating domain-specific Deep Web content. This paper presents WFF, a novel framework that combines focused crawling with ontology to locate domain-specific Deep Web databases efficiently. WFF arranges three classifiers hierarchically: a Web Page Classifier (WPC), a Form Structure Classifier (FSC), and a Form Content Classifier (FCC). First, the WPC uses an ontology-assisted focused crawler to discover potentially interesting pages. Next, the FSC analyzes those pages and uses structural characteristics to determine whether they contain searchable forms. Finally, the FCC identifies, at the semantic level, the searchable forms that belong to a given domain and stores their URLs in a database. A detailed experimental evaluation shows that WFF both simplifies the discovery process and effectively identifies domain-specific databases.
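
The three-stage cascade described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the WFF implementation: simple keyword matching stands in for the ontology-based classifiers, an in-memory dict of pages stands in for the focused crawler, and a returned list stands in for the database. All names (`FormExtractor`, `wpc_is_relevant`, `fsc_is_searchable`, `fcc_in_domain`, `discover_entries`) and the thresholds are hypothetical.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser


@dataclass
class Form:
    """A parsed HTML form: its field types and the visible text inside it."""
    input_types: list = field(default_factory=list)
    text: str = ""


class FormExtractor(HTMLParser):
    """Collects every <form> on a page together with its fields and text."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = Form()
        elif self._current is not None:
            if tag == "input":
                # An <input> without a type attribute defaults to a text field.
                self._current.input_types.append((attrs.get("type") or "text").lower())
            elif tag in ("select", "textarea"):
                self._current.input_types.append(tag)

    def handle_endtag(self, tag):
        if tag == "form" and self._current is not None:
            self.forms.append(self._current)
            self._current = None

    def handle_data(self, data):
        if self._current is not None:
            self._current.text += " " + data


def wpc_is_relevant(html, domain_terms, threshold=2):
    """Stage 1 (WPC stand-in): enough distinct domain terms occur on the page."""
    lowered = html.lower()
    return sum(term in lowered for term in domain_terms) >= threshold


def fsc_is_searchable(form):
    """Stage 2 (FSC stand-in): structural check only, ignoring the form's topic.
    Assumed heuristic: a query field is present and no password field is."""
    has_query_field = any(t in ("text", "search", "select") for t in form.input_types)
    return has_query_field and "password" not in form.input_types


def fcc_in_domain(form, domain_terms, threshold=1):
    """Stage 3 (FCC stand-in): the form's own text mentions the domain."""
    lowered = form.text.lower()
    return sum(term in lowered for term in domain_terms) >= threshold


def discover_entries(pages, domain_terms):
    """Run the cascade over {url: html}; return URLs of in-domain searchable forms."""
    entries = []
    for url, html in pages.items():
        if not wpc_is_relevant(html, domain_terms):
            continue  # cheap page-level filter runs before any form parsing
        extractor = FormExtractor()
        extractor.feed(html)
        if any(fsc_is_searchable(f) and fcc_in_domain(f, domain_terms)
               for f in extractor.forms):
            entries.append(url)
    return entries
```

Ordering the stages from cheapest to most specific means most pages are rejected before any form is parsed, which is one plausible reading of why the abstract describes the classifiers as hierarchical.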
