A Fuzzy Set Theoretic approach to discover user sessions from web navigational data

Abstract

As the WWW continues to grow in size and complexity, web site publishers find it increasingly difficult to attract and retain users. To design attractive web sites, designers must understand their users' needs, so analysing the navigational behaviour of users is an important part of web page design. Web Usage Mining (WUM) is the application of data mining techniques to web usage data in order to discover patterns that can be used to analyse users' navigational behaviour. Preprocessing, knowledge extraction and results analysis are the three main steps of WUM. Because web logs contain a large amount of irrelevant information, the original log file cannot be used directly in the WUM process. During the preprocessing stage of WUM, raw web log data is transformed into a set of user profiles. Each user profile captures a set of URLs representing a user session. This sessionized data can be used as the input for a variety of data mining tasks such as clustering, association rule mining and sequence mining. If the data mining task at hand is clustering, the session files are filtered to remove very small sessions in order to eliminate noise from the data. But directly removing these small sessions may result in the loss of a significant amount of information, especially when the number of small sessions is large. We propose a “Fuzzy Set Theoretic” approach to deal with this problem. Instead of directly removing all the small sessions below a specified threshold, we assign weights to all the sessions using a “Fuzzy Membership Function” based on the number of URLs accessed by each session. After assigning the weights, we apply a “Fuzzy c-Means Clustering” algorithm to discover the clusters of user profiles. In this paper, we provide a detailed review of various techniques to preprocess the web log data, including data fusion, data cleaning, user identification and session identification. We also describe our methodology to perform the feature selection (or dimensionality reduction) and session weight assignment tasks. Finally, we compare our soft computing based approach of session weight assignment with the traditional hard computing based approach of small session elimination.
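
The abstract does not give the membership function or the exact clustering update, so the following Python sketch is only an illustration of the idea, not the authors' implementation. It assumes a linear ramp as the membership function (the `low` and `high` cut-offs are hypothetical), represents each session as a binary vector over URLs, and folds the session weight into the center update of an otherwise standard fuzzy c-means loop. The names `session_weight`, `memberships` and `weighted_fcm` are made up for this example.

```python
import random


def session_weight(n_urls, low=1, high=6):
    """Fuzzy membership of a session in the set of 'useful' sessions.

    Assumed shape: a linear ramp that is 0 at or below `low` URLs and
    1 at or above `high` URLs. Both cut-offs are hypothetical.
    """
    if n_urls <= low:
        return 0.0
    if n_urls >= high:
        return 1.0
    return (n_urls - low) / (high - low)


def memberships(points, centers, m):
    """Standard fuzzy c-means membership update (each row sums to 1)."""
    rows = []
    for x in points:
        # Squared Euclidean distance to every center, clamped away from zero.
        d2 = [max(sum((a - b) ** 2 for a, b in zip(x, c)), 1e-12) for c in centers]
        inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
        total = sum(inv)
        rows.append([v / total for v in inv])
    return rows


def weighted_fcm(points, weights, c=2, m=2.0, iters=50, seed=0):
    """Fuzzy c-means where each session's pull on the centers is scaled by its weight."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, c)]
    dim = len(points[0])
    for _ in range(iters):
        u = memberships(points, centers, m)
        for j in range(c):
            coef = [w * row[j] ** m for w, row in zip(weights, u)]
            total = sum(coef)
            if total > 0:  # leave the center unchanged if nothing supports it
                centers[j] = [
                    sum(cf * x[k] for cf, x in zip(coef, points)) / total
                    for k in range(dim)
                ]
    return centers, memberships(points, centers, m)


# Each session is a binary vector over four hypothetical URLs.
sessions = [
    [1, 1, 0, 0], [1, 1, 1, 0], [1, 0, 0, 0],
    [0, 0, 1, 1], [0, 1, 1, 1], [0, 0, 0, 1],
]
weights = [session_weight(sum(s), low=0, high=3) for s in sessions]
centers, u = weighted_fcm(sessions, weights, c=2)
```

In this sketch a one-URL session is not discarded, as it would be under a hard threshold; it still receives cluster memberships but contributes only a third as much to the cluster centers as a three-URL session.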

Bibliographic details

Source: 2011 IEEE Recent Advances in Intelligent Computational Systems, 2011, pp. 879-884 (6 pages)
Authors: Ansari Zahid; Vinaya Babuy A.; Ahmed Waseem; Mohammad Fazle Azeemz
Original format: PDF
Classification (Chinese Library Classification): Computing technology, computer technology

Similar literature

1. A Conglomerate Relational Fuzzy Approach for Discovering Web User Session Clusters from Web Server Logs [J]. Dilip Singh Sisodia, Shrish Verma, Om Prakash Vyas. International Journal of Engineering and Technology. 2016, No. 3
2. A Subtractive Relational Fuzzy C-Medoids Clustering Approach To Cluster Web User Sessions from Web Server Logs [J]. Dilip Singh Sisodia, Shrish Verma, Om Prakash Vyas. International Journal of Applied Engineering Research. 2017, No. 7aPta1
3. Fuzzy c-Least Medians clustering for discovery of web access patterns from web user sessions data [J]. Ansari Zahid, Faizabadi Ahmed Rimaz, Afzal Asif. Intelligent Data Analysis. 2017, No. 3
4. A Fuzzy Set Theoretic approach to discover user sessions from web navigational data [C]. Ansari Zahid, Vinaya Babuy A., Ahmed Waseem. IEEE Recent Advances in Intelligent Computational Systems. 2011
5. A collaborative filtering approach to predict web pages of interest from navigation patterns of past users within an academic website [D]. Nkweteyim, Denis Lemongew. 2005
6. Measuring Polarization: A Fuzzy Set Theoretical Approach [O]. Juan Antonio Guevara, Daniel Gómez, José Manuel Robles
7. Using Agents for Concurrent Querying of Web-like Databases via a Hyper-Set-Theoretic Approach [O]. Vladimir Sazonov. 2001
