Journal: Journal of Intelligent Information Systems

Formal concept analysis approach for data extraction from a limited deep web database


Abstract

Few studies have addressed the problem of extracting data from a limited deep web database. We apply formal concept analysis to this problem and propose a novel algorithm called EdaliwdbFCA. Before a query Y is sent, the algorithm analyzes the local formal context K_L, which consists of the most recently extracted data, and predicts the size of the query result from the cardinality of the extent X of the formal concept (X, Y) derived from K_L. It can therefore be determined in advance whether Y qualifies as a query. Candidate query concepts are generated dynamically from the lower cover of the current concept (X, Y), so the method avoids building concrete concept lattices during extraction. In addition, two pruning rules are adopted to reduce redundant queries. Experiments were performed on controlled data sets and real applications. The results confirm that the theory behind the algorithm is correct and that the algorithm can be applied effectively in the real world.
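
The formal-concept-analysis notions named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' EdaliwdbFCA: the abstract gives neither the pruning rules nor the rule that turns |X| into a send-or-skip decision, so both are left out. The toy context `K_L`, its record ids and attribute strings, and the function names `extent`, `intent`, `concept` and `lower_cover` are all hypothetical. The sketch only shows how a concept (X, Y) is derived from a local context and how its lower cover can be computed on demand, without building the whole lattice.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# A formal context as a mapping: record id -> set of attributes (hypothetical toy data).
Context = Dict[str, FrozenSet[str]]
Concept = Tuple[FrozenSet[str], FrozenSet[str]]  # (extent X, intent Y)

K_L: Context = {
    "r1": frozenset({"brand=acer", "os=linux"}),
    "r2": frozenset({"brand=acer", "os=windows"}),
    "r3": frozenset({"brand=dell", "os=linux"}),
    "r4": frozenset({"brand=acer", "os=linux", "ssd"}),
}


def extent(ctx: Context, attrs: Iterable[str]) -> FrozenSet[str]:
    """Records of the context that carry every attribute in attrs."""
    wanted = frozenset(attrs)
    return frozenset(g for g, a in ctx.items() if wanted <= a)


def intent(ctx: Context, objs: Iterable[str]) -> FrozenSet[str]:
    """Attributes shared by all records in objs (all attributes if objs is empty)."""
    result = frozenset().union(*ctx.values())
    for g in objs:
        result &= ctx[g]
    return result


def concept(ctx: Context, attrs: Iterable[str]) -> Concept:
    """Formal concept generated by an attribute set: (attrs', attrs'')."""
    x = extent(ctx, attrs)
    return x, intent(ctx, x)


def lower_cover(ctx: Context, x: FrozenSet[str], y: FrozenSet[str]) -> List[Concept]:
    """Lower neighbours of the concept (x, y), computed without the full lattice.

    Each attribute m outside y yields the candidate extent of y plus m;
    the lower neighbours are the candidates whose extents are maximal.
    """
    all_attrs = frozenset().union(*ctx.values())
    candidates = {}
    for m in all_attrs - y:
        cx = extent(ctx, y | {m})
        candidates[cx] = intent(ctx, cx)
    return [(cx, cy) for cx, cy in candidates.items()
            if not any(cx < other for other in candidates)]


X, Y = concept(K_L, {"brand=acer"})
local_size = len(X)  # cardinality of the extent, the quantity the abstract predicts from
next_candidates = lower_cover(K_L, X, Y)
```

In this toy context the concept generated by `brand=acer` has an extent of three records, and its lower cover contains two more specific concepts, one adding `os=linux` and one adding `os=windows`. How the published algorithm maps the extent cardinality to a query decision is not stated in the abstract, so the sketch stops at computing it.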