Internet Mathematics

A Two-Stage Algorithm for Computing PageRank and Multistage Generalizations

Chris P. Lee, Gene H. Golub, and Stefanos A. Zenios

Source: Internet Math. Volume 4, Number 4 (2007), 299-328.

Abstract

The PageRank model pioneered by Google is the most common approach for generating web search results. We present a two-stage algorithm for computing the PageRank vector that exploits the lumpability of the underlying Markov chain. We make three contributions. First, the algorithm speeds up the PageRank calculation significantly. On web graphs with millions of webpages, the speed-up is typically two- to three-fold. The algorithm can also embed other acceleration methods, such as quadratic extrapolation, the Gauss-Seidel method, or the biconjugate gradient stabilized method (BiCGSTAB), for an even greater speed-up; the cumulative speed-up is as high as 7 to 14 times. The second contribution relates to the handling of dangling nodes. Conventionally, dangling nodes are included only towards the end of the computation. While this approach works reasonably well, it can fail in extreme cases involving aggressive personalization. Using probabilistic arguments, we prove that our algorithm is the generally correct way of handling dangling nodes. Third, we discuss variants of our algorithm, including a multistage extension for calculating a generalized version of the PageRank model in which different personalization vectors are used for webpages of different classes. The ability to form class associations may be useful for building more refined models of web traffic.
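
The lumping idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' published algorithm (the full text is not reproduced here); it is a minimal illustration under stated assumptions: dangling pages jump according to the same personalization vector v used for teleportation, so all dangling rows of the Google matrix are identical and the dangling pages can be merged into one state. Stage 1 runs power iteration on that smaller lumped chain; stage 2 recovers the individual dangling scores in one step from the stationarity equation. The function names, the damping factor c = 0.85, and the toy 5-page graph are hypothetical choices made for this example.

```python
def power_iteration(P, tol=1e-12, max_iter=10_000):
    """Stationary distribution of a row-stochastic matrix P (list of rows)."""
    n = len(P)
    x = [1.0 / n] * n
    for _ in range(max_iter):
        x_new = [sum(x[i] * P[i][j] for i in range(n)) for j in range(n)]
        if sum(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x


def google_matrix(H, v, c):
    """Full Google matrix. Assumption: a dangling (all-zero) row becomes v."""
    n = len(H)
    G = []
    for row in H:
        if sum(row) == 0:
            G.append(list(v))
        else:
            G.append([c * row[j] + (1 - c) * v[j] for j in range(n)])
    return G


def two_stage_pagerank(H, v, c=0.85):
    """Sketch: lump dangling pages, solve the small chain, then expand."""
    n = len(H)
    G = google_matrix(H, v, c)  # dense only for clarity in this toy example
    nd = [i for i in range(n) if sum(H[i]) != 0]  # nondangling pages
    dg = [i for i in range(n) if sum(H[i]) == 0]  # dangling pages
    k = len(nd)

    # Stage 1: chain on the k nondangling pages plus one merged dangling state.
    L = [[G[i][j] for j in nd] + [sum(G[i][j] for j in dg)] for i in nd]
    L.append([v[j] for j in nd] + [sum(v[j] for j in dg)])
    y = power_iteration(L)  # y[:k] = nondangling scores, y[k] = dangling mass

    # Stage 2: split the dangling mass using pi_j = sum_i pi_i * G[i][j].
    pi = [0.0] * n
    for pos, i in enumerate(nd):
        pi[i] = y[pos]
    for j in dg:
        pi[j] = sum(y[pos] * G[i][j] for pos, i in enumerate(nd)) + y[k] * v[j]
    return pi


# Toy graph: pages 0-2 have out-links, pages 3 and 4 are dangling.
H = [
    [0.0, 0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5, 0.0],
    [0.5, 0.0, 0.0, 0.0, 0.5],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
v = [0.4, 0.1, 0.1, 0.3, 0.1]  # non-uniform personalization vector

pi_two = two_stage_pagerank(H, v)
pi_full = power_iteration(google_matrix(H, v, 0.85))
```

In this sketch the iterative work happens on a chain with one state per nondangling page plus a single merged state, which is where a saving would come from when a large share of pages is dangling. Comparing pi_two with pi_full on the toy graph is a quick check that the lumped computation agrees with plain power iteration on the full matrix.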

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.im/1243430809
Mathematical Reviews number (MathSciNet): MR2522947


2009 © A K Peters, Ltd.