Search Engine Showdown

Size Analysis Methodology
by Greg R. Notess

The Search Engine Showdown relative size analysis and total size estimates are based on verified results from the search engines, using specific search terms chosen to avoid differences in search processing. The methodology described here and the terms listed are from the Fast Special Supplement report, which compared only the three largest search engines. These terms differ from those used in the regular analyses. The regular terms are not posted publicly, so that a search engine cannot add records to its database just to increase its score in these tests. For the same reason, the terms listed here will not be used again. Otherwise, the methodology remains the same.

Methodology

To compare the sizes of the search engine databases, the study uses 25 specific queries that meet the criteria listed below. The results of each query are verified when possible, and only the number of hits that can actually be displayed is counted. The query terms and the results for the Fast Special Supplement Study are available on the detailed results page.

Why use such a small sample that retrieves so few results? For one, there is no way to verify the larger numbers of results, and there are known problems with search engines' abilities to count results accurately. Also, the smaller search set makes it easier to check for any other processing inconsistencies. The following query criteria make getting a true random sample of terms nearly impossible, although terms are chosen from a wide variety of subject areas.

Query criteria:

  1. Only single words are used, to avoid any variation in the processing of multiple-term searches.
  2. Terms were drawn from a variety of reference books that cover different fields. See Appendix 3 for general definitions and the sources.
  3. Any term used must find fewer than 1,000 results in the AltaVista Advanced Search, since numbers higher than that cannot be verified on AltaVista.
  4. Since Northern Light automatically searches both English plural and singular forms of words, query terms were chosen that cannot generally be made plural. This was checked by pluralizing the word and running a search on AltaVista or Fast. Only those terms where the plural form found zero results were used.
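
The mechanical parts of these criteria can be sketched as a simple filter. This is only an illustration, not the procedure actually used for the study: the function name `qualifies`, its parameters, and the constant are hypothetical, and the hit counts are assumed to have been collected by hand from the search engines. Criterion 2 (drawing terms from reference books in different fields) is a selection step that is not modeled here.

```python
# Hypothetical filter for candidate query terms.
# All hit counts are supplied by the caller; nothing here queries a search engine.

MAX_VERIFIABLE_HITS = 1000  # AltaVista results at or above this cannot be verified


def qualifies(term: str, altavista_hits: int, plural_hits: int) -> bool:
    """Return True if a candidate term meets criteria 1, 3, and 4."""
    is_single_word = len(term.split()) == 1                 # criterion 1
    is_verifiable = altavista_hits < MAX_VERIFIABLE_HITS    # criterion 3
    has_no_plural = plural_hits == 0                        # criterion 4
    return is_single_word and is_verifiable and has_no_plural
```

A term would pass only if all three checks hold. For example, a single word with 250 AltaVista hits whose plural form finds nothing qualifies, while the same word with even one hit on its plural form does not.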

Study Date: All searches are run on the same day.

Search Engine Notes

  • Fast at All the Web is run in the advanced search mode with 100 results selected. The final record number is verified.
  • AltaVista is run using the Advanced Search so that the results are not clustered and up to 1,000 can be viewed. The final record number is verified, and the last displayed record number is used.
  • Northern Light is run as a regular search with the limit set to Web results only. No Special Collection records are included.

Terms Used

The list of terms used for the Fast Special Supplement Study, along with a rough, general meaning and the source for each term, covers a range of subjects. The words listed are different from those used in the regular comparison, which are not posted publicly so that a search engine cannot add records to its database just to increase its score in these tests. For that reason, the words listed here will not be used again.

Detailed Results

The results by search from the Fast Special Supplement Study along with percentage comparisons.
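
One plausible way to derive percentage comparisons from per-query counts is to total the verified hits for each engine and express each total relative to the largest. The exact calculation behind the published percentages is not stated here, so this is an assumption. The function name `relative_sizes` and the engine labels and numbers in the usage example are invented placeholders, not study results.

```python
# Hypothetical percentage comparison; requires Python 3.9+ for the type hints.
# Assumption: each engine's size is compared by its total verified hits
# across all queries, relative to the engine with the largest total.


def relative_sizes(counts_by_engine: dict[str, list[int]]) -> dict[str, float]:
    """Map each engine to its total verified hits as a percentage of the largest total."""
    totals = {engine: sum(counts) for engine, counts in counts_by_engine.items()}
    if not totals:
        return {}
    largest = max(totals.values())
    if largest == 0:
        # No engine returned any hits, so every engine is reported as 0%.
        return {engine: 0.0 for engine in totals}
    return {engine: round(100 * total / largest, 1) for engine, total in totals.items()}


# Usage with made-up counts for two placeholder engines and two queries.
example = relative_sizes({"Engine A": [10, 20], "Engine B": [5, 10]})
```

Summing before dividing weights each query by the number of records it finds. Averaging per-query percentages instead would weight every query equally and could give different figures.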