Hardin MD Notes, Oct 11, 2000

Eric Rumsey

[This article replaces Google falls for Yahoo!, which was partially retracted. The report here presents questions that still remain from the earlier article.]

Last November I discovered that Google has an uncanny ability to sniff out high-quality but little-known directory sites. (The report of those findings, Google likes directory sites, is important background and should be consulted when reading this article.) I've continued to monitor Google's ability to find directory sites, and I've recently begun to compare it to other search engines. In general, Google is still tops. There is wide variation among other search engines, with some doing a much better job of finding directory sites than others, but Google continues to stand out as the best.

Since November, we've continued to monitor the placement of directory sites in Google searches, repeating every 3 months the same thorough check that we first did in November. To do this check, we search for 21 words that appear in Hardin MD page titles (e.g. cardiology, dermatology, nephrology), and count the number of directory sites that are returned in the first 100 hits. Google is often able to find the majority of the high-quality directory sites that we list in the Hardin Meta Directory. As mentioned above, Google is still doing an excellent job as a "directory hound." But in the last several months we have begun to see that the rankings of Yahoo! pages in Google searches have risen rapidly, and in this article we present our findings on this.
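The check described above can be sketched in a few lines of Python. This is only an illustration of the counting step, not the procedure actually used for Hardin MD: the function names, the example URLs, and the idea of matching directory sites by host name are all assumptions made for the sketch, and collecting the list of result URLs for each search word is left out.

```python
from urllib.parse import urlparse


def count_directory_hits(result_urls, directory_hosts, top_n=100):
    """Count how many of the first top_n result URLs are on known directory hosts.

    result_urls: ranked list of URLs returned for one search word (hypothetical input).
    directory_hosts: host names of the directory sites being tracked (hypothetical input).
    """
    hosts = {h.lower() for h in directory_hosts}
    count = 0
    for url in result_urls[:top_n]:
        host = (urlparse(url).hostname or "").lower()
        if host in hosts:
            count += 1
    return count


def first_yahoo_rank(result_urls, top_n=100):
    """Return the 1-based position of the first Yahoo! page in the first top_n hits, or None."""
    for position, url in enumerate(result_urls[:top_n], start=1):
        host = (urlparse(url).hostname or "").lower()
        if host == "yahoo.com" or host.endswith(".yahoo.com"):
            return position
    return None


# Made-up results for one search word, for illustration only.
example_results = [
    "http://a.example.org/cardiology/",
    "http://dir.yahoo.com/Health/Medicine/Cardiology/",
    "http://b.example.net/unrelated.html",
]
example_directories = {"a.example.org", "dir.yahoo.com"}

hits = count_directory_hits(example_results, example_directories)
rank = first_yahoo_rank(example_results)
```

Running the same two functions for each of the 21 search words, and again at each quarterly check, would give a series of counts and Yahoo! positions that can be compared over time.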

The remarkable thing in November was that Google not only found the high-quality directories that we have found for the Hardin Meta Directory (Hardin MD), but often even ranked them in the same relative order that we did. A prime example of this was Yahoo!: while Yahoo! has moderately useful directory pages in all subjects, including the medical subjects in the Hardin MD, it's basically a generalist tool, not taken seriously by people doing specialized research. So the Hardin MD does include Yahoo! pages, but they're usually ranked well below other more comprehensive pages done by people who specialize in health and medicine. It was striking, back in November, that Google saw things much as we do: Yahoo! pages were usually included in the first 100 hits, but they were ranked well below other directory sites, in most cases exactly the same ones that we also rank higher in the Hardin MD.

Yahoo! rises in Google rankings

We began to notice in early March that Yahoo! pages seemed to be rising in Google search rankings. This was several months before Google's alliance with Yahoo! was announced on June 26, so we had no reason to think that there was any connection. But Yahoo!'s rankings kept rising in the succeeding months, and the announcement of the Google-Yahoo! alliance naturally raised questions about the connection.

A key point is the size and depth of the Google index. The claim has been made that Yahoo!'s sudden climb is explained by growth in the index: Yahoo! has risen because more pages are being indexed, or maybe because Yahoo! is being more thoroughly indexed after the alliance. In answer to these suggestions, however, the accompanying table of data shows that the rise in Yahoo! rankings in Google searches occurred well before the Google-Yahoo! alliance, and also well before the increase in size of the Google index.

Interpreting the data (See accompanying table)

In looking at the table, note a couple of things:

I'm making the assumption that "Number of total hits" is correlated fairly closely with the overall size of the Google index. This increased relatively modestly from November to May (14%), but much more by August (474%).

The first indication of Yahoo!'s sudden rise in the rankings in Google searches is in early March. This is at least 2 months before the size of the Google index grew dramatically (sometime between May 17 and August 23); it's also 3 1/2 months before the Google-Yahoo! alliance was announced on June 26.
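For readers who want to check the percentage figures against the table, the calculation is the usual percent increase over the November baseline. The sketch below is only an illustration: the function name is made up, and the hit counts are round, hypothetical numbers chosen to reproduce 14% and 474%, not values taken from the table.

```python
def percent_increase(baseline, later):
    """Percent increase of `later` over `baseline` (both are total-hit counts)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (later - baseline) * 100 / baseline


# Hypothetical counts, NOT the figures from the accompanying table.
november_hits = 1000
may_hits = 1140      # a 14% increase over November
august_hits = 5740   # a 474% increase over November

may_growth = percent_increase(november_hits, may_hits)
august_growth = percent_increase(november_hits, august_hits)
```

Note that a 474% increase means the later figure is 5.74 times the baseline, not 4.74 times.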

Unanswered questions

So the question is: Why did Yahoo! rise so rapidly in Google's rankings several months before the Google-Yahoo! alliance was announced and before the size of the Google index grew? Here are some possible explanations:

Even though the size of Google's index didn't increase significantly before May, maybe Google was indexing Yahoo! pages more thoroughly.

Maybe Yahoo! opened its files to Google's spider, allowing its pages to be more thoroughly visited and indexed by Google.

Then, of course, there's the possibility that Google used some "artificial" means to raise Yahoo! in its rankings. (Would this even be possible to do?)

Inevitably, it seems, we're left with the question: Was it "fair practice" for Google (or Yahoo!) to initiate procedures that would lead to raising Yahoo! in Google's rankings during the time the two parties were negotiating their alliance, as much as 3 1/2 months before it was made public? It may or may not be legitimate that Yahoo! would rise in Google's rankings after they made their alliance. But is it OK that this would happen well before the alliance?

Google's response

For the record, I did talk with Kimberly Vogel at Google, to see if they have an explanation for this. Her response was that "Google's index has grown significantly since January 2000 and it has indeed uncovered more Yahoo! pages." I told her that my data seems to raise questions about this, because the big leap by Yahoo! in Google rankings seems to have occurred before the explosive rise in the Google index size. She had no further explanation, saying only that what I recorded is an "anomaly."

In concluding, I want to emphasize that the purpose of reporting these findings is not to hurt Google -- I continue to use Google heavily, both for finding directory sites, and for general searching. But I do have high expectations for Google, and I present the evidence here in the hope that Google can avoid the pitfalls of new alliances, and remain what it has been since it was launched, the top dog of the search engine world.

Accompanying table

