Focused Crawler Based on Domain Ontology and FCA

Zhenjiang Liu; Yajun Du; Ying Zhao

Abstract

A focused crawler is a web crawler that selectively seeks out web pages relevant to a predefined set of crawling topics, instead of searching the whole Web exhaustively. In this paper, we propose an effective focused web crawling method based on domain ontology and Formal Concept Analysis (FCA). The method first constructs a core similarity graph based on WordNet and concept relatedness, and then combines it with concept lattice knowledge to build a Similarity Concept Context Graph (SCCG). On the basis of the SCCG, we propose a focused web crawling method that measures a page's expected relevancy to a given topic and determines which URL should be crawled first. Experimental results show that our approach achieves higher recall rates than the standard breadth-first approach, the approach with Context Graph (CG), and the approach with Relevancy Context Graph (RCG). These results demonstrate the effectiveness and significance of our approach.
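The abstract's core idea, crawling the URL with the highest expected relevancy first and judging the result by recall, can be illustrated with a generic best-first frontier. The sketch below is not the authors' SCCG method: the `Frontier` class, the `crawl_order` and `recall` helpers, the `links` dictionary (a stand-in for fetching and parsing pages), and the `score` callable (a stand-in for whatever relevancy estimate the SCCG would supply) are all hypothetical names introduced here for illustration.

```python
import heapq
from itertools import count


class Frontier:
    """Best-first URL frontier: the highest-scoring URL is popped first."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._tie = count()  # insertion order breaks ties between equal scores

    def push(self, url, score):
        """Queue a URL once; return False if it was already queued."""
        if url in self._seen:
            return False
        self._seen.add(url)
        # heapq is a min-heap, so the score is stored negated.
        heapq.heappush(self._heap, (-score, next(self._tie), url))
        return True

    def pop(self):
        neg_score, _, url = heapq.heappop(self._heap)
        return url, -neg_score

    def __len__(self):
        return len(self._heap)


def crawl_order(seeds, links, score, limit):
    """Return the order in which URLs would be visited.

    links: dict mapping a URL to its outgoing URLs (assumed, replaces real fetching)
    score: callable URL -> float (assumed relevancy estimate, not the paper's SCCG)
    limit: maximum number of pages to visit
    """
    frontier = Frontier()
    for url in seeds:
        frontier.push(url, score(url))
    visited = []
    while len(frontier) and len(visited) < limit:
        url, _ = frontier.pop()
        visited.append(url)
        for out in links.get(url, []):
            frontier.push(out, score(out))
    return visited


def recall(visited, relevant):
    """Fraction of the relevant pages that the crawl actually visited."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(relevant & set(visited)) / len(relevant)
```

A breadth-first baseline corresponds to giving every URL the same score, since ties are then broken by insertion order; comparing `recall` for the two orderings under the same `limit` mirrors, in toy form, the kind of comparison the abstract reports.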

Bibliographic details

Source: Journal of Information and Computational Science, 2011, Issue 10, pp. 1909-1917 (9 pages)
Authors: Zhenjiang Liu; Yajun Du; Ying Zhao
Affiliation (listed for all three authors): School of Mathematics and Computer Engineering, Xihua University, Chengdu 610039, China
Original format: PDF
Language: English
Keywords: focused crawler; formal concept analysis; concept relatedness; WordNet

Similar literature

1. An ontology-driven multimedia focused crawler based on linked open data and deep learning techniques [J]. Andrea Capuano, Antonio M. Rinaldi, Cristiano Russo. Multimedia Tools and Applications, 2020, Issue 11a12.
2. Self-Adaptive Ontology based Focused Crawler for Social Bookmarking Sites [J]. Aamir Khan, Dilip Kumar Sharma. International Journal of Information Retrieval Research, 2017, Issue 2.
3. An approach for selecting seed URLs of focused crawler based on user-interest ontology [J]. YaJun Du, YuFeng Hai, ChunZhi Xie. Applied Soft Computing, 2014, Issue Pta3.
4. The research of ontology-based focused crawler [C]. Wu Cong-Cong, Zhao Jian-li, Ma Hui-lin. 2012 7th International Conference on System of Systems Engineering, 2012.
5. A semantic approach based on ontologies to support engineering knowledge retention and exchange in the product assembly design and training domains [D]. Kim, Okjoon. 2011.
6. FFPred 3: feature-based function prediction for all Gene Ontology domains [O]. Domenico Cozzetto, Federico Minneci, Hannah Currant.
7. An Ontology-Based Focused Crawler [O]. Lefteris Kozanidis. 2015.
