AIDS: Search engine tips

Hardin MD Notes, July 25, 2000
eric-rumsey@uiowa.edu

There are several complicating factors in using search engines to search for AIDS. A primary one is that the word is an acronym that is also an ordinary word (as in hearing aids, study aids, etc.). A search engine that can distinguish upper-case will generally do a better job of finding upper-case acronyms like AIDS. Another complicating factor is that the word AIDS happens to end in "s": search engines that lump singular and plural forms of words together don't distinguish "AIDS" from such things as "first aid."

To compare how well search engines find AIDS, I ran searches in several of them and looked at the first 100 hits to see how many were on AIDS the disease, as opposed to other meanings of the word. I made several interesting discoveries about the techniques search engines use to distinguish ambiguous meanings, which are useful not only in searching for AIDS but for other ambiguous subject terms as well.

Search engines that distinguish upper-case -- Search term is "AIDS"

Alta Vista - 1.7 million hits

Other search engines -- Search term is "aids"

The other search engines, although they don't distinguish upper-case terms, use other methods, with mixed results, to distinguish links that are on AIDS the disease.

A powerful technique for distinguishing hits on AIDS the disease, used by the two search engines below, is to give the user a choice of different contexts for the word being searched. For the serious searcher, this is an excellent feature that makes it quite easy to find the desired meaning of an ambiguous term.

Northern Light - Searching for "aids" gives 5.8 million hits; since it lumps singular and plural words together, this list includes a relatively low proportion of hits on AIDS the disease. But the "custom search folders" feature gives the user the option to specify particular contexts, with the top folders being Financial aid, HIV/AIDS (Acquired Immune Deficiency Syndrome), Humanitarian aid, and First aid. The HIV/AIDS folder has 158,000 hits, with all of the top 100 being on AIDS the disease.

Direct Hit - As with Northern Light, a search for "aids" gives an undifferentiated list, with a low proportion of hits on AIDS the disease. But its "related search" on "aids disease" produces a list of relevant hits.

Several of the search engines are probably using link analysis techniques to help them find links on AIDS the disease. Link analysis, as typified by Google, works by analyzing the number of links to a site and the importance of the pages making those links. Since link analysis was introduced by Google, other search engines have adopted similar methods, though without the fanfare of Google. As far as I know, the only major search engines that have announced that they are using link analysis to rank hits are Excite and Direct Hit. Preliminary findings of a survey I'm doing, however, indicate that other search engines are also experimenting with it.
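
The idea behind link analysis (count the links to a page, and weight each link by the importance of the page making it) can be illustrated with a small sketch. This is a simplified, PageRank-style iteration written for illustration only; it is not the actual ranking method of Google or of any other engine named here. The page names, the damping value of 0.85, and the iteration count are all assumptions.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Score pages by inbound links, weighting each link by the
    current score of the page that makes it (illustrative only)."""
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start with every page equally important.
    scores = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score among the pages it links to.
                share = damping * scores[page] / len(targets)
                for target in targets:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * scores[page] / n
                for other in pages:
                    new_scores[other] += share
        scores = new_scores
    return scores


# Hypothetical link graph: page -> pages it links to.
links = {
    "aids-directory": ["aids-faq"],
    "aids-faq": ["aids-directory"],
    "personal-page-1": ["aids-directory"],
    "personal-page-2": ["aids-directory"],
}

scores = link_scores(links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this made-up graph, "aids-directory" ranks first because three pages link to it. "aids-faq" has only one inbound link, but it comes from the top-ranked page, so it still scores well above the two personal pages that nobody links to.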
Some of the search engines below use an exact word search, so that when the search term is "aids" they search only for that exact word. Other search engines put singular and plural forms together, so that a search for "aids" retrieves both the words "aids" and "aid."

Google - 1.0 million hits; all of the first 100 hits on AIDS the disease; exact word search
Webcrawler - 25,000 hits; 96 of the first 100 hits on AIDS the disease; exact word search
Lycos - 1.9 million hits; 92 of the first 100 hits on AIDS the disease; exact word search
Excite - Number of hits is not given; 92 of the first 100 hits on AIDS the disease
Hotbot - 2.5 million hits; 90 of the first 100 hits on AIDS the disease; exact word search
FAST - 1.9 million hits; 85 of the first 100 hits on AIDS the disease; exact word search
AOL - 27,000 hits; 32 of the first 100 hits on AIDS the disease; singular/plural word search

A qualifier to this article: It should be obvious that searching for AIDS in general, without combining it with some other topic, is best done in a good directory, such as those contained on the Hardin MD AIDS page. Search engines are the best way to search only when AIDS is being combined with some other, more specific topic.
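
The three matching behaviours compared in this article (upper-case-sensitive search, exact word search, and singular/plural search) can be sketched as follows. This is a toy illustration, not how any of the engines above is actually implemented: the function names, the sample titles, and the crude rule of stripping a trailing "s" are all assumptions made for the example.

```python
import re


def tokens(text):
    """Split text into alphabetic words, keeping the original case."""
    return re.findall(r"[A-Za-z]+", text)


def matches_case_sensitive(text, term):
    """Match only words with exactly the same letters and case."""
    return term in tokens(text)


def matches_exact_word(text, term):
    """Match the exact word, ignoring upper/lower case."""
    return term.lower() in [t.lower() for t in tokens(text)]


def naive_singular(word):
    """Crude assumption: a plural is the singular plus a trailing 's'."""
    word = word.lower()
    return word[:-1] if word.endswith("s") and len(word) > 1 else word


def matches_singular_plural(text, term):
    """Match any word that shares the term's naive singular form."""
    target = naive_singular(term)
    return any(naive_singular(t) == target for t in tokens(text))


# Hypothetical page titles, one for each meaning of the word.
titles = [
    "AIDS prevention and treatment",
    "Hearing aids buying guide",
    "First aid for burns",
]

case_sensitive = [matches_case_sensitive(t, "AIDS") for t in titles]
exact_word = [matches_exact_word(t, "aids") for t in titles]
singular_plural = [matches_singular_plural(t, "aids") for t in titles]

print(case_sensitive)   # [True, False, False]
print(exact_word)       # [True, True, False]
print(singular_plural)  # [True, True, True]
```

Each looser matching rule lets in one more unrelated title, which is consistent with the pattern in the list above: the engine using singular/plural search returned the lowest proportion of hits on AIDS the disease.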
Hardin Library for the Health Sciences, University of Iowa
Please send comments to hardin-webmaster@uiowa.edu
The URL for this page is http://hardinmd.lib.uiowa.edu