Searching for "google"

Hardin MD Notes, Dec 20, 2000

Eric Rumsey

Computers with awareness of themselves and other computers are (so far) found only in science fiction, but perhaps a step in the direction of computer self-consciousness is the ability of a search engine to search for itself and other search engines. In early December I searched in Google and 7 other search engines (named below) for "google" and examined the first 100 hits in each. The results are interesting, and they reveal a lot about how search engines work (think?).

In all of the search engines the pages retrieved generally fall into 2 categories: internal Google pages and external pages (not at the Google site) with commentary and discussion of Google. The most interesting links to me (and I think to most people doing a search for "google") are the ones in the second category.

Google excels at finding commentary sites

Actually, what triggered my interest in doing this little study was the observation that two commentary pieces that I have written about Google in the last year were highly ranked in a search for "google" on Google. What especially piqued my interest is that these two articles expressed widely different opinions on Google: one of them ("Google likes directory sites") was strongly favorable, and the other ("Searching for directory sites: Google - Yahoo! questions") was rather unfavorable. To the great credit of Google's link analysis system, however, both of them were ranked highly: the favorable article was number 6, and the unfavorable one was number 16. (The favorable one is higher, I think, partly because it was written earlier, but also because it really is, in my opinion, the more significant of the two articles.)

So, having found that my own commentary articles about Google were highly (and appropriately) ranked in Google, I suspected that other highly-linked commentary pieces would follow close behind. Looking at the first 100 hits in a search for "google" on Google, my suspicion was confirmed: most of the links were indeed to commentary pieces about Google.

Other search engines generally did find a fair share of commentary sites on Google, but they didn't find as many high-quality sites as Google did. Google found a higher proportion of the commentary sites that I've come to recognize over the last several months as I've followed the development of Google. Specifically, search engines that found a fairly high proportion of commentary sites are Northern Light, Hotbot, Alta Vista, and Excite; finding almost no commentary sites were Lycos and Fast.

Commentary sites found by Google were generally a step above those found by the other search engines. The others tended to find many titles that looked like they had only a peripheral discussion of Google, with some laughable, quirky extremes: 2 search engines (InfoSeek and Northern Light) had top-15 hits for Barney Google, the cartoon character, and in 2 others (Fast and InfoSeek) the bottom third of the top 100 hits was virtually all for the Weather Underground site, which apparently has some peripheral relationship to Google.

Other search engines find more internal Google pages

An especially interesting contrast between Google and other search engines was the proportion of internal Google pages vs. external commentary pages about Google. As mentioned, most of the search engines other than Fast and Lycos did have some commentary pages. All of them, especially Fast and Lycos, also had many internal pages that are part of the Google site. Google, however, had only a few internal pages: 4 in the first 15 links, and no others in the rest of the top 100.

Let me repeat that: in searching for "google," other search engines find many more internal Google pages than Google does. Isn't that a surprise? Well, maybe. But thinking about how Google works compared to other search engines may explain the difference. Google apparently does a better job of finding commentary pages that are ABOUT Google because its ranking is based on link popularity, and so it learns that people around the Web are more likely to link to commentary pages on Google than to internal Google pages. More traditional search engines, on the other hand, give more weight in their ranking algorithms to pages at the site that has the same name as the search term.
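
To make the contrast concrete, here is a toy sketch in Python of the two ranking ideas described above. It is only an illustration, not the actual algorithm of Google or of any other search engine: the URLs, the inbound link counts, and the function names rank_by_links and rank_by_name_match are all invented for the example.

```python
from urllib.parse import urlparse

# Hypothetical pages with made-up inbound link counts (illustration only).
PAGES = {
    "http://www.google.com/about.html": 3,
    "http://www.google.com/help.html": 2,
    "http://example.org/google-review.html": 40,
    "http://example.net/why-google-works.html": 25,
}


def rank_by_links(pages):
    """Link-popularity idea: order pages by inbound link count, highest first."""
    return sorted(pages, key=lambda url: pages[url], reverse=True)


def rank_by_name_match(pages, term):
    """Name-matching idea: pages at a site named after the search term come
    first; link count is used only to break ties within each group."""
    def sort_key(url):
        host = urlparse(url).hostname or ""
        site_matches = term.lower() in host.split(".")
        # False sorts before True, so matching sites lead the list.
        return (not site_matches, -pages[url])
    return sorted(pages, key=sort_key)


if __name__ == "__main__":
    print("By link popularity:")
    for url in rank_by_links(PAGES):
        print("  ", url)
    print("By site-name match:")
    for url in rank_by_name_match(PAGES, "google"):
        print("  ", url)
```

With these invented numbers, the link-popularity ordering puts the two external commentary pages first, while the name-matching ordering puts the two internal google.com pages first. That is the same pattern seen in the real search results, although real ranking systems surely weigh many more factors than this.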

Google's link-analysis shines

So, once again Google shows why it has captured the fancy of the net. Other search engines get bogged down by the sheer volume of pages at the Google site, and the many pages discussing Google at other sites, and so they fail to find what most people doing this search want: good commentary and discussion of Google. Google, on the other hand, finds the best sites simply by relying on its link-popularity method, as it does for every other subject.

