Mining Taxonomies from Web Menus: Rule-Based Concepts and Algorithms

Abstract

The logical hierarchies of Web sites (i.e., Web site taxonomies) are obvious to humans, because humans can distinguish different menu levels and their relationships. But such accurate information about the logical structure is not yet available to machines. Many applications would benefit if Web site taxonomies could be mined from menus, but in the past this was an almost unsolvable problem. While a tag newly introduced in HTML5 and novel mining methods now make it possible to distinguish menus from other content, how the underlying taxonomies can be extracted from given menus has not yet been researched. In this paper we present the first detailed analysis of the problem and introduce rule-based concepts for addressing each identified subproblem. We report on a large-scale study on mining hierarchical menus of 350 randomly selected domains. Our methods allow extracting, with high precision and high recall, Web site taxonomy information that was not available before.
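
To make the problem setting concrete, the following is a minimal sketch of the simplest possible case: reading a menu hierarchy from nested list items inside an HTML5 `<nav>` element. It is not the authors' rule-based method, which the abstract does not describe. The abstract does not name the HTML5 tag, so `<nav>` is an assumption, as are the well-formed markup (every `<li>` explicitly closed) and the hypothetical names `NavMenuParser`, `extract_menu_tree`, and `SAMPLE`.

```python
from html.parser import HTMLParser


class NavMenuParser(HTMLParser):
    """Builds a tree of link labels from <li> nesting inside <nav>.

    Assumption: markup is well formed, i.e. every <li> has a closing tag.
    """

    def __init__(self):
        super().__init__()
        self.nav_depth = 0                      # > 0 while inside a <nav>
        self.root = {"label": None, "children": []}
        self.stack = [self.root]                # stack[-1] is the open node
        self.in_link = False

    def handle_starttag(self, tag, attrs):
        if tag == "nav":
            self.nav_depth += 1
        elif self.nav_depth and tag == "li":
            # A new menu entry becomes a child of the currently open entry.
            node = {"label": "", "children": []}
            self.stack[-1]["children"].append(node)
            self.stack.append(node)
        elif self.nav_depth and tag == "a":
            self.in_link = True

    def handle_endtag(self, tag):
        if tag == "nav" and self.nav_depth:
            self.nav_depth -= 1
        elif self.nav_depth and tag == "li" and len(self.stack) > 1:
            self.stack.pop()
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        # Only link text inside an open <li> within <nav> counts as a label.
        if self.nav_depth and self.in_link and len(self.stack) > 1:
            self.stack[-1]["label"] += data.strip()


def extract_menu_tree(html):
    """Returns the top-level menu entries found inside <nav> elements."""
    parser = NavMenuParser()
    parser.feed(html)
    parser.close()
    return parser.root["children"]


# Hypothetical sample page: a two-level menu plus one non-menu link.
SAMPLE = """
<nav><ul>
  <li><a href="/products">Products</a>
    <ul>
      <li><a href="/products/a">Widgets</a></li>
      <li><a href="/products/b">Gadgets</a></li>
    </ul>
  </li>
  <li><a href="/about">About</a></li>
</ul></nav>
<p><a href="/ad">Not a menu link</a></p>
"""

tree = extract_menu_tree(SAMPLE)
```

Real menus are rarely this regular. Levels are often split across separate page regions or revealed only on subpages, which is presumably why the paper treats extraction as a set of distinct subproblems rather than a single tree walk.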

Bibliographic details

Source: International Conference on Web Engineering, 2013, pp. 265-282 (18 pages)
Authors: Matthias Keller; Hannes Hartenstein
Original format: PDF
Keywords: Web site taxonomies; Web mining; Content hierarchies
