
System and method for identifying web communities from seed sets of web pages


Abstract

An improved system and method are provided for identifying web communities from seed sets of web pages. A seed set of web pages may be represented as a set of seed vertices of a graph representing a collection of web pages. An initial probability distribution may be constructed on the vertices of the graph by assigning a nonzero value to the vertices belonging to the seed set. A sequence of probability distributions may then be produced on the vertices of the graph by repeatedly applying a one-step walk to the current distribution. For each probability distribution in the sequence, level sets of vertices may be generated, and the level set with minimal conductance may be selected for that distribution. The level set with the least conductance over the whole sequence may then be output as representing a community of web pages.
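The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation. Several details are assumptions that the abstract does not state: the graph is undirected and stored as an adjacency dictionary, every vertex has at least one neighbor, the one-step walk is a lazy random walk (half the mass stays in place), the initial distribution is uniform over the seeds, and level sets are formed by ranking vertices by probability divided by degree. All function names are hypothetical.

```python
from typing import Dict, Set, Tuple

Graph = Dict[int, Set[int]]  # undirected graph: vertex -> set of neighbors


def conductance(graph: Graph, subset: Set[int]) -> float:
    """Edges leaving `subset` divided by the smaller side's total degree."""
    vol_s = sum(len(graph[v]) for v in subset)
    vol_total = sum(len(nbrs) for nbrs in graph.values())
    cut = sum(1 for u in subset for w in graph[u] if w not in subset)
    denom = min(vol_s, vol_total - vol_s)
    return cut / denom if denom > 0 else float("inf")


def lazy_walk_step(graph: Graph, p: Dict[int, float]) -> Dict[int, float]:
    """One step of a lazy random walk (assumption: the walk is lazy)."""
    q = {v: 0.5 * mass for v, mass in p.items()}  # half the mass stays put
    for v, mass in p.items():
        share = 0.5 * mass / len(graph[v])  # other half spreads to neighbors
        for w in graph[v]:
            q[w] = q.get(w, 0.0) + share
    return q


def best_level_set(graph: Graph, p: Dict[int, float]) -> Tuple[Set[int], float]:
    """Scan level sets of p (ranked by p[v] / degree) for minimal conductance."""
    order = sorted(
        (v for v in p if p[v] > 0),
        key=lambda v: p[v] / len(graph[v]),
        reverse=True,
    )
    best_set: Set[int] = set()
    best_phi = float("inf")
    current: Set[int] = set()
    for v in order:
        current.add(v)  # each prefix of the ranking is one level set
        phi = conductance(graph, current)
        if phi < best_phi:
            best_set, best_phi = set(current), phi
    return best_set, best_phi


def find_community(graph: Graph, seeds: Set[int], steps: int = 10) -> Tuple[Set[int], float]:
    """Return the least-conductance level set over `steps` walk distributions."""
    p = {v: 1.0 / len(seeds) for v in seeds}  # assumption: uniform over seeds
    best_set: Set[int] = set()
    best_phi = float("inf")
    for _ in range(steps):
        p = lazy_walk_step(graph, p)
        level_set, phi = best_level_set(graph, p)
        if phi < best_phi:
            best_set, best_phi = level_set, phi
    return best_set, best_phi


# Toy graph: two triangles (0, 1, 2) and (3, 4, 5) joined by the edge 2-3.
graph: Graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
community, phi = find_community(graph, {0})
print(sorted(community), round(phi, 3))  # prints: [0, 1, 2] 0.143
```

On the toy graph, seeding the walk at vertex 0 recovers the triangle containing it, which is cut from the rest of the graph by a single edge. Recomputing conductance from scratch for every level set keeps the sketch short; a practical version would likely update the cut and volume incrementally as each vertex is added.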
