Adaptive Focused Crawling of Linked Data

Abstract

Given the evolution of publicly available Linked Data, crawling and preservation have become increasingly important challenges. Due to the scale of available data on the Web, efficient focused crawling approaches are required that can capture the relevant semantic neighborhood of seed entities. Here, determining relevant entities for a given set of seed entities is a crucial problem. Because the weights of seeds within a seed list vary significantly with respect to the crawl intent, we argue that an adaptive crawler is required, one that considers such characteristics when configuring the crawling and relevance detection approach. To address this problem, we introduce a crawling configuration that considers seed list-specific features as part of its crawling and ranking algorithm. We evaluate it through extensive experiments in comparison to a number of baseline methods and crawling parameters. We demonstrate that configurations which consider seed list features outperform the baselines, and we present further insights gained from our experiments.
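
The general idea of a focused crawl that takes seed weights into account can be illustrated with a small best-first traversal. This is only a minimal sketch under stated assumptions, not the configuration evaluated in the paper: the in-memory `graph` stands in for dereferencing Linked Data URIs, and the function name `focused_crawl`, the per-seed weights, the `decay` factor, and the `budget` limit are all hypothetical choices made for illustration.

```python
import heapq


def focused_crawl(graph, seed_weights, budget=5, decay=0.5):
    """Visit entities in descending score order, starting from weighted seeds.

    graph: dict mapping an entity to the entities it links to (toy stand-in
           for fetching Linked Data resources).
    seed_weights: dict mapping each seed entity to a hypothetical weight.
    budget: maximum number of entities to visit (assumption).
    decay: factor applied to the score for each hop from a seed (assumption).
    """
    # heapq is a min-heap, so scores are negated to pop the highest first.
    frontier = [(-weight, seed) for seed, weight in seed_weights.items()]
    heapq.heapify(frontier)
    visited = {}
    while frontier and len(visited) < budget:
        neg_score, entity = heapq.heappop(frontier)
        if entity in visited:
            continue  # already reached with an equal or higher score
        score = -neg_score
        visited[entity] = score
        for neighbor in graph.get(entity, []):
            if neighbor not in visited:
                heapq.heappush(frontier, (-score * decay, neighbor))
    return visited


# Toy example: seed "A" carries more weight than seed "X".
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "X": ["Y"]}
crawled = focused_crawl(toy_graph, {"A": 1.0, "X": 0.2}, budget=4)
print(crawled)
```

In this toy run the limited budget is spent on the neighborhood of the heavily weighted seed before the low-weight seed is ever visited, which is the kind of seed-dependent behavior the abstract argues a crawler should account for.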

Bibliographic details

Source
International Conference on Web Information Systems Engineering, 2015, 16 pages
Authors
Ran Yu; Ujwal Gadiraju; Besnik Fetahu; Stefan Dietze
Original format: PDF
Chinese Library Classification: TP391-532
Keywords
Focused crawling; Linked data; Relevance assessment

