Search Engine Showdown

Search Engines Statistics: Database Overlap
by Greg R. Notess

Data from Sept. 10, 1999

[Pie chart] Still little overlap!

Searches Used: 5 small ones
Total Hits: 326
Unique Hits: 140

This analysis compares the results of five very small searches run on thirteen different search engines. The five searches found a total of 326 hits, 140 of which represented unique pages. Of those 140 hits, 66 were found by only one of the thirteen search engines while another 30 were found by only two.
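The tallies above (total hits, unique pages, and pages found by only one or two search engines) can be reproduced with a simple count. The following is a minimal sketch in Python, not the method actually used for this analysis: the function name `overlap_stats`, the engine names, and the page identifiers are all hypothetical, and it assumes each engine's hits have already been pooled across the searches into one set of page URLs.

```python
from collections import Counter

def overlap_stats(results):
    """Summarize overlap for a mapping of engine name -> set of page URLs.

    Hypothetical helper for illustration; assumes hits are already pooled
    across all searches and deduplicated within each engine.
    """
    # For every page, count how many engines returned it.
    engines_per_page = Counter(
        url for urls in results.values() for url in urls
    )
    # Histogram: number of engines -> number of pages found by exactly that many.
    found_by_n = Counter(engines_per_page.values())
    return {
        "total_hits": sum(len(urls) for urls in results.values()),
        "unique_pages": len(engines_per_page),
        "found_by_n_engines": dict(found_by_n),
    }

# Made-up toy data, not the figures from this analysis.
sample = {
    "EngineA": {"u1", "u2", "u3"},
    "EngineB": {"u2", "u4"},
    "EngineC": {"u2", "u3", "u5"},
}
stats = overlap_stats(sample)
print(stats["total_hits"], stats["unique_pages"], stats["found_by_n_engines"])
```

In the toy data, the entry for key 1 in `found_by_n_engines` plays the role of the 66 pages found by only one search engine, and the entry for key 2 plays the role of the 30 pages found by only two.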

Search Engines Analyzed:

  • Northern Light
  • AltaVista
  • Fast
  • Infoseek
  • Excite
  • Lycos
  • Google!
  • And the Inktomi crew:
    • MSN Ink
    • Snap
    • Yahoo! Ink
    • AOL
    • Anzwers
    • HotBot

Even with six Inktomi-based databases (Anzwers, Snap, MSN Web Search, AOL, HotBot, and Yahoo!'s Inktomi database), there was a low degree of overlap. However, the Inktomi databases have begun to look more similar, often finding the same hits. On these searches, none of the Inktomi search engines found hits that were not also retrieved by at least one of the other Inktomi partners. This is a change from the May analysis.

See the more detailed analysis of unique hits to gain a sense of how the 66 pages found by only one search engine were distributed.

Previous Comparisons:

  • May 1999: Five searches on eleven search engines. 267 hits, 122 unique pages. Over half found by only one search engine.
  • March 1999: Four searches on ten search engines. 202 hits, 97 unique pages. None found by more than five search engines.
  • Jan. 1999: Four searches on ten search engines. 176 hits, 83 unique pages. None found by more than six search engines.
  • August 1998: Four searches on five search engines. 103 hits, 70 unique pages. None found by all five search engines.
  • May 1998: Four searches on five search engines. 95 hits, 77 unique pages. None found by all five.
  • Feb. 1998: Four searches on five search engines. 103 hits, 62 unique pages. Three found by all five search engines.
  • October 1997: Four different searches on four search engines. 220 hits, 12 found by all four.
  • September 1997 and June 1997: No pages in common among four small searches on the four largest search engines at those times. (No charts available.)

While decisions about which Web search engine to use should not be based on size alone, this information is especially important when looking for very specific keywords, phrases, and areas of specialized interest. See also the following statistical analyses: