Introduction

Nowadays the Internet consists of many millions of pages, and it is increasingly easy to create new ones, even for non-experts. With this huge number of web pages available, is it reasonable to think that all pages are equally "important"? This question is fundamental for web search engines, that is, web sites that search for web pages throughout the Internet based on some given criterion, usually the inclusion in the web page of a certain word or set of words.

In fact, in most cases the user who requested the search looks only at the first few results and ignores the remaining ones, often thousands of pages! How can we be sure that the first results listed are the most "important" ones, that is, the web pages most likely to correspond to what the user expects from the search?

The solution to this problem used by the popular web search engine Google assigns to each existing web page a numerical value, called the PageRank, that reflects its "importance". Hence, when a web search is performed, the first results shown to the user are simply the ones with the highest PageRank values.
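As a minimal sketch of this ordering step only, assuming the PageRank values are already known: the page names and numbers below are hypothetical, chosen purely for illustration, and `rank_results` is an illustrative helper, not part of any real search engine.

```python
def rank_results(matching_pages, pagerank):
    """Order the pages that match a query by decreasing PageRank."""
    return sorted(matching_pages, key=lambda page: pagerank[page], reverse=True)


# Hypothetical PageRank values (made-up numbers, for illustration only).
pagerank = {
    "a.example": 0.12,
    "b.example": 0.41,
    "c.example": 0.07,
    "d.example": 0.40,
}

# Hypothetical set of pages containing the searched word.
matches = ["a.example", "c.example", "d.example"]

for page in rank_results(matches, pagerank):
    print(page, pagerank[page])
```

Note that "b.example" has the highest value overall but does not appear in the output, since it does not match the search criterion; the PageRank only orders the pages that do match.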

The following page explains:

  • how to mathematically define the PageRank
  • how to actually compute the PageRank
  • how to interpret the PageRank as a probability

Translated for Atractor by a CMUC team, from its original version in Portuguese. Atractor is grateful for this cooperation.

(*) This work was carried out under the guidance of Professor Maria Carvalho from the University of Porto, under a grant from the Calouste Gulbenkian Foundation to develop a project for the promotion of Mathematics in Atractor.
Since many browsers now block Java, it was decided to convert the original applets of this section to JavaScript. This conversion was carried out with support from FCT - Fundação para a Ciência e a Tecnologia, through FACC (Fundo de Apoio à Comunidade Científica).


Difficulty level: University