Method for mining path traversal patterns in a web environment by converting an original log sequence into a set of traversal sub-sequences

Abstract

An efficient computer-implemented method of mining path traversal patterns in a communications network. The method comprises two steps. First, a method called MF (for maximal forward references) converts an original sequence of log data into a set of traversal subsequences, each of which represents a maximal forward reference from the starting point of a user access. This conversion filters out the effect of backward references, which are made mainly for ease of traveling, so that mining can concentrate on meaningful user access sequences. Accordingly, when a backward reference occurs, the forward reference path terminates, and the resulting path is termed a maximal forward reference. After a maximal forward reference is obtained, we backtrack to the starting point of the forward reference and begin a new forward reference path. The occurrence of a null source node likewise indicates the termination of an ongoing forward reference path and the beginning of a new one. Second, methods are developed to determine the frequent traversal patterns, termed large reference sequences, from the maximal forward references obtained in the first step, where a large reference sequence is a reference sequence that appears in the database a sufficient number of times to exceed a predetermined threshold.
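
The two steps described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not stated in the abstract: the log is a list of (source, destination) pairs, a null source node is represented by `None`, and a backward reference is detected as a move to a node already on the current forward path. The function names, the sample log, and the brute-force counting in the second function are all hypothetical. In particular, the counting is a simple stand-in for illustration and should not be read as the patent's own procedure for finding large reference sequences.

```python
from collections import Counter


def maximal_forward_references(log):
    """Step 1 (sketch): split a traversal log into maximal forward references.

    `log` is an iterable of (source, destination) pairs; a source of None
    stands for a null source node (an assumed encoding).
    """
    references = []
    path = []               # current forward reference path
    moving_forward = True   # False while the user is backtracking

    for source, destination in log:
        if source is None:
            # Null source: close the ongoing path and start a new one.
            if path and moving_forward:
                references.append(tuple(path))
            path = [destination]
            moving_forward = True
        elif destination in path:
            # Backward reference: the forward path ends here.
            if moving_forward:
                references.append(tuple(path))
            # Backtrack to the revisited node, dropping everything after it.
            del path[path.index(destination) + 1:]
            moving_forward = False
        else:
            # Forward reference: extend the current path.
            path.append(destination)
            moving_forward = True

    if path and moving_forward:
        references.append(tuple(path))
    return references


def large_reference_sequences(references, threshold):
    """Step 2 (naive stand-in): count, for every consecutive subsequence,
    the number of maximal forward references that contain it, and keep
    those whose count exceeds `threshold`.
    """
    counts = Counter()
    for ref in references:
        contained = set()
        for start in range(len(ref)):
            for end in range(start + 1, len(ref) + 1):
                contained.add(ref[start:end])
        counts.update(contained)  # each reference contributes at most once
    return {seq: n for seq, n in counts.items() if n > threshold}


# Hypothetical traversal: A B C D, back to C and B, then E G H, back to G,
# then W, back to A, then O U, back to O, then V.
sample_log = [
    (None, "A"), ("A", "B"), ("B", "C"), ("C", "D"), ("D", "C"), ("C", "B"),
    ("B", "E"), ("E", "G"), ("G", "H"), ("H", "G"), ("G", "W"), ("W", "A"),
    ("A", "O"), ("O", "U"), ("U", "O"), ("O", "V"),
]
refs = maximal_forward_references(sample_log)
print(refs)
# [('A','B','C','D'), ('A','B','E','G','H'), ('A','B','E','G','W'),
#  ('A','O','U'), ('A','O','V')]
print(large_reference_sequences(refs, threshold=2))
# {('A',): 5, ('B',): 3, ('A', 'B'): 3}  (key order may differ)
```

The enumeration of every consecutive subsequence grows quadratically with path length, so this sketch is only suitable for small examples; a level-wise candidate generation scheme would be the more likely choice for a real log.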