Google could be five times faster
Researchers at Stanford University have published a paper on how to give the Google search engine a huge speed boost.
Users of the Google search engine like it because
it's fast, but a team at Stanford University has come up with
ways to make it up to five times faster.
With the extra speed, Google could be tailored to each user, according to the team. For example, a sports-loving Google user looking for "tiger" would see only pages on golfer Tiger Woods, not large felines from Asia.
At present, Google's ranking system relies on a method called PageRank, an invention of co-founder Larry Page that calculates the popularity and relevance of Web sites based on how many other sites link to them.
"Computing PageRank for a billion Web pages can
take several days. Google currently ranks and searches three
billion Web pages and each personalised or topic-sensitive
ranking would also require a separate multi-day computation,"
the university said in a statement.
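At its core, PageRank is a power iteration on the Web's link graph: each page's score is repeatedly redistributed along its outgoing links until the scores settle. A minimal sketch of that iteration (the tiny link graph and parameter values here are illustrative, not anything Google uses):

```python
import numpy as np

def pagerank(links, damping=0.85, tol=1e-8, max_iter=100):
    """Power-iteration PageRank on a small link graph.
    `links` maps each page to the list of pages it links to."""
    pages = sorted(links)
    n = len(pages)
    idx = {p: i for i, p in enumerate(pages)}
    # Column-stochastic matrix: M[j, i] = 1/outdegree(i) if i links to j.
    M = np.zeros((n, n))
    for src, outs in links.items():
        for dst in outs:
            M[idx[dst], idx[src]] = 1.0 / len(outs)
    r = np.full(n, 1.0 / n)  # start with a uniform score
    for _ in range(max_iter):
        r_new = (1 - damping) / n + damping * (M @ r)
        done = np.abs(r_new - r).sum() < tol
        r = r_new
        if done:
            break
    return dict(zip(pages, r))

# Page "c" is linked to by both "a" and "b", so it ends up ranked highest.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

Each pass costs one matrix-vector multiply over every link, which is why the full computation over billions of pages takes days.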
To speed up PageRank, Stanford researchers have
developed a trio of techniques based on a branch of
mathematics called numerical linear algebra. These methods are
described in three papers.
The first method from the Stanford team,
BlockRank, offers the most significant gain, speeding up
PageRank by three times, they claim.
The researchers make use of their discovery that
on most sites, up to 80 percent of links point to other pages
on the same site -- each site looks like a thick block of
links.
PageRank processes each link individually, but
with their more efficient BlockRank method, these same-site
links are processed as a unit, before moving on to links
outside the site.
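One way to picture the block idea: compute a small, cheap PageRank inside each site using only its intra-site links, then use those local ranks as a warm start for the global computation, which then needs far fewer passes to converge. This is a simplified sketch of that idea, not the full BlockRank algorithm (the real method also weights each block by its importance):

```python
import numpy as np

def local_ranks(M, block, damping=0.85, iters=30):
    """PageRank restricted to one site's intra-site links."""
    sub = M[np.ix_(block, block)].copy()
    col = sub.sum(axis=0)
    sub[:, col > 0] /= col[col > 0]        # renormalize surviving columns
    r = np.full(len(block), 1.0 / len(block))
    for _ in range(iters):
        r = (1 - damping) / len(block) + damping * (sub @ r)
    return r / r.sum()

def blockrank(M, blocks, damping=0.85, iters=10):
    """Warm-start the global iteration from per-site local ranks."""
    n = M.shape[0]
    r = np.zeros(n)
    for block in blocks:                    # local phase: one site at a time
        r[block] = local_ranks(M, block, damping) / len(blocks)
    for _ in range(iters):                  # global phase: few passes needed
        r = (1 - damping) / n + damping * (M @ r)
    return r

# Two sites: pages {0, 1} and {2, 3}, mostly linking within themselves.
M = np.array([[0.0, 0.5, 0.0, 0.5],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 1.0, 0.0]])
r = blockrank(M, blocks=[[0, 1], [2, 3]])
```

Because most links stay inside a block, the local phase does most of the work on small matrices, and the expensive global phase starts close to the answer.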
The second method involves the use of
extrapolation. Before scanning the Web, certain assumptions
about a site's importance are drawn up.
As the scanning continues, these assumptions are
either proven or disproved, with the accuracy increasing as
more links are processed. A site's rank is extrapolated --
guessed at -- when a reasonable amount of evidence is
acquired. Compared with PageRank, which only knows a site's
rank after exhaustively trawling the Web, extrapolation works
50 percent faster, say the researchers.
The third method, called Adaptive PageRank, relies on the fact that the ranks of lower-ranking sites tend to settle faster than those of higher-ranking ones. By dropping further processing of these quickly-converging sites, a speed boost of up to 50 percent can be won, they said.
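In code, the adaptive idea amounts to freezing pages whose ranks have stopped changing and skipping them in later iterations. A minimal sketch, assuming a column-stochastic link matrix like the ones above:

```python
import numpy as np

def adaptive_pagerank(M, damping=0.85, tol=1e-9, max_iter=200):
    """Power iteration that freezes already-converged components."""
    n = M.shape[0]
    r = np.full(n, 1.0 / n)
    frozen = np.zeros(n, dtype=bool)
    for _ in range(max_iter):
        r_new = r.copy()
        active = ~frozen
        # Update only the rows (pages) still changing.
        r_new[active] = (1 - damping) / n + damping * (M[active] @ r)
        frozen |= np.abs(r_new - r) < tol   # freeze pages that have settled
        r = r_new
        if frozen.all():
            break
    return r

# Page 0 is linked to by both other pages, so it ranks highest.
M = np.array([[0.0, 0.5, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
r = adaptive_pagerank(M)
```

Each frozen page removes a row from the matrix-vector multiply, so the per-iteration cost shrinks as the computation proceeds.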
While these methods have their individual merits,
the Stanford team believes they can offer even greater returns
when combined.
"Further speed-ups are possible when we use all
these methods," said Sepandar Kamvar, one of the members of
this project. "Our preliminary experiments show that combining
the methods will make the computation of PageRank up to a
factor of five faster.
"However, there are still several issues to be
solved. We're closer to a topic-based PageRank than to a
personalised ranking," he added.
The Stanford team's theories will remain theories
for now -- they don't appear to have any official ties to
Google itself.
"Google appreciates any contributions that further
the study of hyperlink analysis on the web," was a spokesman's
reply to CNETAsia when asked whether Google will consider
using the team's methods, or if the privately-held company was
involved in the university team's efforts.
The Stanford team presented its paper on these Google enhancements at the Twelfth Annual World Wide Web Conference in Budapest, Hungary, last week.