Searching for directory sites: Google, Metacrawler, & Ixquick

Hardin MD Notes, June 21, 2000
eric-rumsey@uiowa.edu

One of the most useful, though unheralded, features of Google, discussed in a previous article (Google likes directory sites), has been its knack for finding good directory sites. To compare this ability of Google with that of other search engines, I recently ran comparison searches in Google and in two "metasearch" sites, Metacrawler and Ixquick, each of which searches several different search engines. Metasearch engines were used for the comparison because they provide a good way to gauge the "average" ability of many search engines to find directory sites, and because they indicate which search engines do a particularly good job at it.

To do the comparison, I searched for the names of 10 medical disciplines (anesthesiology, cardiology, dermatology, etc.) -- words that are frequently used in the titles of medical/health directory pages -- to see how many directory sites each tool retrieved in the first 10 hits.

Google clearly comes out best in the comparison, finding a total of 26 directory sites in the 10 searches. It finds directory sites in all 10 cases, and in 9 of them it finds 2 or more directory sites. Metacrawler finds a total of 14 directory sites and Ixquick finds 13; each of them returns zero directory sites in 3 cases. For more information on this study, see the Detailed Results page.

Patterns of retrieval

Even more interesting than the number of sites returned by the three tools is the type of sites returned. Before discussing this, it's helpful to distinguish the different types of directory sites that are retrieved in these searches:

The pattern of directories returned by the three tools is notable. Metacrawler and Ixquick have similar patterns: for both of them, about half of the directory hits are to two prominent umbrella directories (About.com, Hardin MD), and the other directory hits are to independent directories. Ixquick does especially well in finding high-quality independent directories.

Google finds specialized medical directories

Like the other search tools, Google returns several hits for the prominent umbrella directories, About.com and Hardin MD. But unlike the other tools, it also returns many hits for the specialized medical umbrella directories MedWebPlus, Martindale, and ScienceKomm Journals, essential sites for any serious medical research on the Web. These and other specialized medical umbrella sites rarely appear among the top-ranking hits of the search engines, other than Google, that are included in Metacrawler and Ixquick.

Ixquick's and Metacrawler's search engines

For a full list of the search engines included in the metasearch tools, see the details page. Though I didn't record which specific search engines found directories in the Metacrawler and Ixquick searches, my general observation was that this was quite varied, with no engine especially predominant. The one minor exception was Webcrawler, a relatively small search engine included in both of the metasearch tools, which did a good job of finding directory sites.

Metacrawler includes Google as one of its constituent search engines. It's a bit surprising, then, that in several cases Google returns directories in the first 10 hits when Metacrawler does not. This is apparently because the frequency of hits for non-directory sites in Metacrawler's other search engines outweighs the high ranking those directories receive in Google.

General observations

For now, Google is clearly the best search tool to start with in looking for specialized medical directory sites.
Metacrawler and Ixquick do return a smattering of good directory sites and bear watching in the future, but they are not as reliable as Google. Given Google's great success, though, other search engines will likely be catching up soon, and metasearch tools provide an excellent way to monitor this. For more information on this study, see the Detailed Results page.
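
The tallying behind this comparison (for each discipline search, count how many of the first 10 hits are directory sites, then total the counts per tool) could be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the procedure used for the study: the article does not describe any automation, the function names are hypothetical, the sample URLs are invented placeholders rather than study data, and treating a hit as a directory site because its host appears in a fixed set is a simplification.

```python
from urllib.parse import urlparse


def count_directory_hits(hits, directory_hosts, top_n=10):
    """Count how many of the first top_n hit URLs are on a known directory host."""
    count = 0
    for url in hits[:top_n]:
        host = urlparse(url).hostname or ""
        if host in directory_hosts:
            count += 1
    return count


def summarize(results_by_query, directory_hosts, top_n=10):
    """Tally directory hits per query and overall for one search tool."""
    per_query = {
        query: count_directory_hits(hits, directory_hosts, top_n)
        for query, hits in results_by_query.items()
    }
    counts = list(per_query.values())
    return {
        "per_query": per_query,
        "total": sum(counts),
        "queries_with_none": sum(1 for c in counts if c == 0),
        "queries_with_two_or_more": sum(1 for c in counts if c >= 2),
    }


# Placeholder data only (assumption): these hosts and URLs are made up.
directory_hosts = {"directory-a.example", "directory-b.example"}
sample_results = {
    "cardiology": [
        "http://directory-a.example/cardiology",
        "http://hospital.example/heart",
        "http://directory-b.example/cardio",
    ],
    "dermatology": [
        "http://clinic.example/skin",
    ],
}

summary = summarize(sample_results, directory_hosts)
print(summary)
```

Running `summarize` once per search tool over the same list of queries would give totals comparable to the figures reported above, though deciding which hits count as directory sites would still call for human judgment.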
Hardin Library for the Health Sciences, University of Iowa
Please send comments to hardin-webmaster@uiowa.edu
The URL for this page is http://hardinmd.lib.uiowa.edu