Journal: BioTechnology: An Indian Journal

Block matrix-based MapReduce PageRank algorithm: research on its applied effect in Web structure mining

Abstract

Web pages contain not only text but also hyperlinks that point from one page to another, and those hyperlinks carry implicit annotations. The large body of hyperlink information on the Web conveys the relevance, quality, and structure of page content, and it reflects containment, citation, and affiliation relations among documents. Web structure mining derives knowledge from the organizational structure of the World Wide Web and from the link relations among Web pages. In information retrieval, pages with high authority scores and hub (pivot) scores can be treated as high-quality pages and given priority when results are presented to users. Analyzing hyperlink topology also makes it possible to discover Web communities and to construct a directed graph for a set of search results or a specified set of pages. After introducing the Web structure graph, this paper analyzes the merits of applying the PageRank algorithm and then studies a block matrix-based MapReduce PageRank algorithm. The method uses block matrices to reduce the time spent in the mixed phase and the rank phase of each iteration, so that each iteration executes only one MapReduce phase. The paper compares the algorithm with two other algorithms and finds that it is superior in running time, which provides a theoretical basis for Web structure mining techniques.
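The abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of the general idea: the link matrix is partitioned into blocks, a map step multiplies each block by the matching slice of the rank vector, and a reduce step sums the partial results per row block, so one map/reduce pass completes one PageRank iteration. The damping factor 0.85, the block size, the dangling-node handling, and all function names (`build_blocks`, `map_block`, `reduce_row_block`, `pagerank`) are assumptions made for illustration, not the paper's implementation, and no real MapReduce framework is used.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed value; the abstract does not state one


def build_blocks(links, block_size):
    """Partition the column-stochastic link matrix into sparse blocks.

    blocks[(bi, bj)] holds (row, col, weight) entries whose row falls in
    row block bi and whose column falls in column block bj.
    """
    blocks = defaultdict(list)
    for src, targets in links.items():
        for dst in targets:
            key = (dst // block_size, src // block_size)
            blocks[key].append((dst, src, 1.0 / len(targets)))
    return blocks


def map_block(key, entries, rank):
    """Map step: multiply one matrix block by the current rank vector."""
    row_block, _ = key
    partial = defaultdict(float)
    for row, col, weight in entries:
        partial[row] += weight * rank[col]
    return row_block, partial


def reduce_row_block(partials, rows, n, dangling_mass):
    """Reduce step: sum the partial products of one row block, then damp."""
    result = {}
    for r in rows:
        total = sum(p.get(r, 0.0) for p in partials)
        result[r] = (1 - DAMPING) / n + DAMPING * (total + dangling_mass / n)
    return result


def pagerank(links, n, block_size=2, iterations=50):
    """Run a fixed number of iterations, one map/reduce pass per iteration."""
    blocks = build_blocks(links, block_size)
    dangling = [i for i in range(n) if not links.get(i)]
    rank = [1.0 / n] * n
    num_row_blocks = (n + block_size - 1) // block_size
    for _ in range(iterations):
        # Rank held by pages without out-links is spread evenly (assumption).
        dangling_mass = sum(rank[i] for i in dangling)
        grouped = defaultdict(list)
        for key, entries in blocks.items():
            row_block, partial = map_block(key, entries, rank)
            grouped[row_block].append(partial)
        new_rank = [0.0] * n
        for bi in range(num_row_blocks):
            rows = range(bi * block_size, min((bi + 1) * block_size, n))
            reduced = reduce_row_block(grouped.get(bi, []), rows, n, dangling_mass)
            for r, value in reduced.items():
                new_rank[r] = value
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page graph: page index -> pages it links to.
    example_links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
    print(pagerank(example_links, n=4))
```

In this sketch the rank vector is shared in memory; a distributed version would have to ship the relevant rank slice to each block's mapper, which is the kind of cost a block layout is meant to keep low.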