Cynthia Vaskis

SLM521 Spring 2004

Search Engine Assignment

4/18/04

File: srengine.htm

 

Summary results from Search Engines Assignment

 

The purpose of this lesson is to evaluate different types of search engines and to pick the best and worst in each category.  I reviewed five categories: General search engines, Meta-Search engines, News search engines, Education search engines, and Medical search engines.  For the General, Meta-Search, and News engines I used the same six queries, drawn from different topical areas.  They are listed below with the topics in parentheses.

 

1.  Mars rover design specifications Spirit (current news, science, technology)

2.  Colorado River explorer Powell (historical geography)

3.  Bush moon habitat robots future 2005 (government, science, current news, politics)

4.  American Idol TV show George Huff (popular culture, Hollywood, television, personal stories)

5.  Small pox terrorists United States Maryland (medical, government defense, current events, geography)

6.  Latvia cuisine apple dessert (unusual topic, world countries and their culture, food)

 

The News search engines sometimes require you to enter a specific newspaper, magazine, radio station, or TV station to use as the source for the search.  They also asked for a specific time frame, a location in the world, and the language to search in.  This was too complicated for me, but if I were a reporter looking for a specific story, it would probably be very useful.  These engines do not do well in general topic searches because you have to specify where to search, which limits the coverage from which the results are drawn.

 

The Education search engines came in two different flavors or structures.  Some were completely organized lists of topics that you had to choose from, with no way to enter a query.  Others allowed you to enter a query, and they would return anything they had that matched it or tell you where you could go to look for it.  The second kind proved to be more useful to me.

 

The Medical search engines are listed last and were very complicated to use.  You almost needed to know what category the information was in before you started looking.  The ones that had a query entry window were the most useful to me, but they still did not return exact matches.  They made it difficult to combine words into one search phrase such as “mad cow disease”; the engine would look for a single word instead of all of the words together.
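The difference between matching any single word and matching the whole phrase can be sketched in a few lines of Python.  This is only an illustration of the idea, not how any of the reviewed engines actually work; the function names and the sample texts are made up for the example.

```python
def matches_any_word(query, text):
    """True if at least one query word appears as a word in the text."""
    words = text.lower().split()
    return any(term in words for term in query.lower().split())


def matches_phrase(query, text):
    """True only if the query words appear together, in order, in the text."""
    normalized = " ".join(text.lower().split())
    return query.lower() in normalized


# Hypothetical page texts, invented for this example.
off_topic = "Cow milk production rose this year"
on_topic = "New cases of mad cow disease were reported"

print(matches_any_word("mad cow disease", off_topic))  # one word is enough
print(matches_phrase("mad cow disease", off_topic))    # phrase is required
print(matches_phrase("mad cow disease", on_topic))
```

An engine that only does the first kind of match returns the off-topic page, which is the behavior described above.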

 

Where I Found the Search Engines to Review

 

To find the General Search Engines and Meta-Search Engines, I used those mentioned in the pages linked from the Search Engines assignment.  To find the engines in the News, Education, and Medical categories below, I looked at this URL (http://search.yahoo.com/search/dir?p=web%2Bdirectories&h=C), where the different topic directories listed the most popular search engines.

 

General Search Engines Review

 

The best two general search engines were www.AltaVista.com and www.Google.com.  They both returned a lot of results, but AltaVista also showed pictures associated with the requested topic, which were fantastic.  It also linked to any videos with text pages associated with the query, which was really helpful in the Web Quest assignment.  Google Advanced Search has options to pick a time frame to search within and where in the page the search terms should appear (in the title, the URL, the main body of the text, etc.).  It also lets you pick a language and turn on SafeSearch (filtering) if you want it.  You can build complicated queries easily with its options: “all” the words (AND between each word), the exact phrase (as if it were in quotes), at least one of the words (OR between every word), and without the words (AND NOT).  I like the way Google lists the results with the date of the entry (when it was posted to the web).  It also has extras on the side such as posters on that topic.
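The four advanced-search boxes described above can be thought of as a way of building one query string.  The sketch below is my own illustration and assumes the commonly used operator style (quotes for an exact phrase, OR between alternatives, a minus sign for excluded words); the function name and its parameters are hypothetical and are not taken from any search engine's documentation.

```python
def build_query(all_words=(), exact_phrase="", any_words=(), without_words=()):
    """Combine advanced-search style fields into a single query string."""
    parts = list(all_words)                # implied AND between these words
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')  # quotes keep the words together
    if any_words:
        parts.append(" OR ".join(any_words))
    parts.extend(f"-{word}" for word in without_words)  # excluded words
    return " ".join(parts)


query = build_query(
    all_words=["Mars", "rover"],
    exact_phrase="design specifications",
    any_words=["Spirit", "robots"],
    without_words=["habitat"],
)
print(query)
```

Filling in the boxes saves typing the operators by hand, which is why I found the advanced page easier for long queries.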

 

Close seconds are www.AOL.com, www.MSN.com and www.Lycos.com, which seem to retrieve about the same information.  AOL also allowed you to create complicated search queries in its advanced area.  You must sign on with AOL.com, but since I am already a member it was not a problem.  AOL uses search.msn.com when it can’t find what it wants.  Some others that have extras to look at are www.AlltheWeb.com, with options such as news, pictures, video, and “any language”, and the search engine at www.NPR.org, with radio broadcasts.  The NPR engine requires AND or OR between multiple words, which was too tedious for long queries.  Many engines, such as www.Netscape.com and www.Hotbot.com, list sponsored links above the list of hits, which is helpful if you want to see who is supporting those sites.  I found the sponsors helpful when I was looking for information to use in my Web Quest assignment.  Hotbot seems to use Google and AskJeeves at times.  I was disappointed that I couldn’t find where Netscape lists how many hits were obtained; it just keeps you wading through the huge list to see that there were a lot.  The search engine Northern Light Power Search, supposedly best for business applications, is not available to the general public without paying until July 2004, so I did not include it.  Also, I tried Teoma (www.Teoma.com) and its top web page came up.  When I selected one of a query’s hits, the whole screen was taken over by www.JetSeeker.com and I couldn’t get out of it because it was not in a Windows window (beware of this one).

 

The worst search engine in this category for my search requests was the Librarian’s Index, www.lii.org.  This engine only accesses a very small 13,000-record database that describes other Web sites, and it really needs one-word requests to work properly.  I often use multiple-word requests to find more specific information.  The important things to look for in a general search engine are how many pages worldwide it scans to bring you your answers and whether it lets you combine topic words without typing any functions between them (the words are assumed to be joined by AND, but the engine actually gives you the best matches at the top and lower-percentage matches after them).  Some search engines require you to enter the functions AND, OR, and AND NOT between multiple words, but I think that is too much work for my searches.
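The “best matches at the top, lower-percentage matches after” behavior can be sketched as scoring each page by the fraction of query words it contains.  This is a simplified illustration under my own assumptions; real engines use many more signals, and the function names and sample pages here are invented for the example.

```python
def match_score(query, text):
    """Fraction of the query words that appear as words in the text."""
    terms = query.lower().split()
    words = set(text.lower().split())
    if not terms:
        return 0.0
    return sum(term in words for term in terms) / len(terms)


def rank(query, pages):
    """Return page titles ordered from best match to worst, dropping non-matches."""
    scored = [(match_score(query, text), title) for title, text in pages.items()]
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps ties in input order
    return [title for score, title in scored if score > 0]


# Hypothetical pages, invented for this example.
pages = {
    "recipes": "Latvia apple dessert cuisine recipes",
    "pie": "apple pie dessert",
    "cars": "car repair",
}
print(rank("Latvia cuisine apple dessert", pages))
```

A page with every query word comes first, a page with half of them comes next, and a page with none is left out, so no AND or OR has to be typed.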


General Search Engine Table of Number of Responses per Query

 

All URLs in this category have a prefix of http://www.  I use the tilde label ~ in front of the number when the number of hits is approximate (the search engine mentions “about”).

 

| Search topic | Google (google.com) | AOL (AOL.com) | MSN (MSN.com) | AlltheWeb (AlltheWeb.com) | NLResearch Search NPR.org | Librarian’s Index (lii.org) | Alta Vista Advanced (av.com/Advanced or altavista.com) | Lycos (Lycos.com) | Netscape (Netscape.com) | Hotbot (Hotbot.com) |
|---|---|---|---|---|---|---|---|---|---|---|
| Mars rover design specifications Spirit | ~7,800 | ~361 | 361 | 346 | Only 1 good answer | Used AND, got 2 hits | ~474 | 478 | I saw >400 but no total listed | 476 |
| Colorado River explorer Powell | ~14,800 | ~5,208 | 1,516 | 4,917 | Used AND, got only 2 hits | Used AND, got 0 hits | ~5,672 | 5,668 | I saw >100 but no total listed | 5,665 |
| Bush moon habitat robots future 2005 | 334 | ~208 | 208 | 149 | Used AND, got 0 good ones | Used AND, got 0 hits | ~229 but used msnbc.com | 249 | I saw a lot listed | 248 |
| American Idol TV show George Huff | ~10,500 | ~1,290 | 1,259 | 879 | Listed 6 but only 1 good one | Used AND, got 0 hits | ~1,464 | 1,484 | I saw a lot listed | 1,485, uses tvtome.com |
| small pox terrorist United States Maryland | ~2,210 | ~781 | 781 | 569 | Used OR to get ~67, AND gave 0 | Used AND, got 0 hits | ~814 | 826 | I saw a lot listed | 824 |
| Latvia cuisine apple dessert | ~1,650 | ~97 | 97 | 75 | ~18 but not exact matches | Used just Latvia and got 4 hits | ~121 | 122 | I saw about 10 good hits | 122 |
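If the hit counts in the table above were to be compared by a program, the tilde convention would have to be read back in.  The sketch below shows one way that could be done; the function name is hypothetical, and it deliberately returns None for the cells that hold a comment instead of a plain number.

```python
def parse_hits(cell):
    """Turn a table cell into (count, is_approximate), or None for free text."""
    text = cell.strip()
    approximate = text.startswith("~")       # "~" marks an "about" count
    digits = text.lstrip("~").replace(",", "")
    if not digits.isdigit():
        return None                          # e.g. "Used AND, got 0 hits"
    return int(digits), approximate


print(parse_hits("~7,800"))
print(parse_hits("361"))
print(parse_hits("I saw a lot listed"))
```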

 

Meta-Search Engines

 

The best Meta-Search engine for my purposes was a tie between www.Vivisimo.com and www.Dogpile.com.  Vivisimo seems to have good, extensive general coverage of the subject and lists the results in nicely organized, ranked categories.  It gave you the choice of how the search would be done: by using its Clustering Engine, its Content Integrator, or its Enterprise Publisher.  It also listed some other useful links to eBay, to PubMed via ClusterMed, and to FirstGov.  Dogpile lists the results in ranked categories in the same way but seemed to come up with some unusual hits that other search engines didn’t find.  It also lists the Yellow and White pages, which can be useful in finding a business related to your topic.  I liked the links to maps, weather, public records and classifieds, and it has a joke of the day.  It also lists other related topic areas to refine your results.

 

Metacrawler doesn’t list the number of hits per category but allows you to look specifically in topical areas such as web pages, audio, multi-media, news, and shopping.  SurfWax lists the number of pages it searched to get the resulting hits and lists which search engines were used to get them.  The www.copernic.com site turned out to be a program to download (copernicagentbasic.exe), and it took too long to download, so I did not review this one.  I used www.mamma.com (the “Mother” of all search engines) instead.  It was a good one, but some of its listed hits were not accessible (a “no page available” message came up), so it is probably not as dependable for listing up-to-date pages.  The engine www.Ixquick.com listed the number of best hits out of the number of pages scanned with matching results.  I’m not sure how useful the number of pages scanned is, but someone might use it.  I would probably rate www.mamma.com as the worst, since I could not depend on all the hit pages being currently available, although it seemed to have good coverage like the others.

 

Most of these Meta-Search engines show you which General Search engines were used to get the hits.  Most of the time this information is listed at the bottom of the results page, but some Meta-Search engines list the source engine right next to the hit itself, which could be useful if you want to look further into the subject area.  You can usually select, before the search is done, which general search engines you want to include in the Meta-Search.  Advanced search options are also available with some engines, along with options such as Web, News, or Extra.  Some of the Meta-Search engines perform clustering, which organizes the search results on the fly, while others perform a Content Integrator function, which queries many sources at once.  Some search engines look for hits for a certain amount of time and return what they found instead of doing a total search no matter how long it takes.  I prefer the total search so that I don’t miss anything.  Vivisimo produces clustered results, as do Dogpile and Metacrawler, where they list the resulting number of hits per subcategory.  This is a really nice feature to help you focus on just the hit selections you want.  The important things to note are how many different general search engines a Meta-Search engine uses and how exact its matching method is (title match, general Meta word search).
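The idea of combining hits from several general engines and showing the source engine next to each hit can be sketched as follows.  This is only my illustration of the concept; the engine names and addresses are placeholders, and ranking by how many engines agree is an assumption for the example, not the method of any Meta-Search engine reviewed here.

```python
from collections import defaultdict


def merge_results(results_by_engine):
    """Merge per-engine hit lists into one list of (url, [source engines])."""
    sources = defaultdict(list)
    for engine, urls in results_by_engine.items():
        for url in urls:
            if engine not in sources[url]:
                sources[url].append(engine)
    # Hits found by more engines come first; ties are ordered by address.
    return sorted(sources.items(), key=lambda item: (-len(item[1]), item[0]))


# Placeholder results, invented for this example.
results = {
    "EngineA": ["site.example/1", "site.example/2"],
    "EngineB": ["site.example/2", "site.example/3"],
}
for url, engines in merge_results(results):
    print(url, "found by", ", ".join(engines))
```

Duplicate hits collapse into one entry, which may be why the Meta-Search totals in the table below are so much smaller than the General engine totals.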

 

Meta-Search Engine Table of Number of Responses per Query

 

| Search topic | Vivisimo (www.vivisimo.com) | Dogpile (Dogpile.com) | Metacrawler (Metacrawler.com) | SurfWax (www.surfwax.com) | Mamma (www.mamma.com) | Ixquick (www.ixquick.com) |
|---|---|---|---|---|---|---|
| Mars rover design specifications Spirit | 163 | 68 | 69 | 58 out of 3,150 pages searched | 27 | 36 listed of 1,050 matching results |
| Colorado River explorer Powell | 157 | 85 | 76 | 79 out of 5,204 pages searched | 47 | 55 listed of 7,700 matching results |
| Bush moon habitat robots future 2005 | 128 | 69 | 59 | 62 out of 210 pages searched | 32 | 39 listed of 204 matching results |
| American Idol TV show George Huff | 164 | 73 but only 2 relevant ones | 72 | 58 out of 1,430 pages searched | 34 | 44 listed of 1,304 matching results |
| small pox terrorist United States Maryland | 155 | 68 but only 9 relevant ones | 66 | 61 out of 930 pages searched | 34 | 36 listed of 792 matching results |
| Latvia cuisine apple dessert | 131 but only 16 relevant to query | 62 but only 7 relevant ones | 61 but <20 hits relevant to query | 51 out of 530 pages searched | 26 | 39 listed of 280 matching results |

 


News and Media (as listed by Yahoo directories) Search Engines

 

These News search engines list various sources from which to obtain the data, such as newspapers, magazines, radio, or television.  Some let you access the Web through a general search engine as well.  I found that in most cases these News search engines were good at listing the top stories, which is useful if you haven’t watched or heard the news in a while.  They were not very good at searching for a general topic unless that topic had been in the news lately.  They do not go back farther in time than a few years.  One had a time limit of 36 months, but you could pick a time period to search within, which would be nice if you didn’t want to get results from other time periods.  Most of these News engines search world news, but you could pick a locale (country, state, or city) in which to search.  I think all of them had a language selection, and sometimes a language had to be selected before any searching could be done.

 

I did not include the News Link (www.newslink.org) search engine because it required you to select a specific newspaper, radio or TV station, magazine, or other unique source to search.  That makes the scope of the search very limited, which did not fit this assignment’s goal of testing how well an engine scans all of the news out there.  It did allow you to select the general search engines you wanted it to use (AltaVista, Big Hub, Excite, Go which is Infoseek, Google, Infohiway, Lycos, Magellan, Webcrawler, and Yahoo).

 

The Voice of America search engine listed the top broadcast and press program stories, which was good for current affairs, but it required you to select the language.  I could not get any good results out of it using multiple-word queries because it works best with single-word queries.  The www.news.google.com engine seemed to search only recent news stories and did not look back a few months or more.  The Publishers List (www.publist.com) engine requires you to search with the specific title of an article or book.  Since I did not have one, this search engine produced no results for me.  It also needed cookies to be enabled, and I don’t think they are enabled on my computer yet.  The News Directory (www.newsdirectory.com) also required a specific title to search, and I didn’t have one, so I didn’t use it.  The only ones that I could use effectively were www.CNN.com, www.news.google.com, and www.news.yahoo.com.  I think www.CNN.com was the best of all of them because it let you search the Weather, the World News, and All Politics.  It also listed the general search engines that it uses, such as Google, LookSmart, Gigablast, MSN, Business.com, Open Directory, and Teoma.  It also had some extra features that I had not seen with other News engines.

 

My top pick for best News search engine by far was www.CNN.com.  It was easy to use and produced a lot of results.  It listed the top current stories and let you access many other topics such as Science & Space, Law, Health, Weather, U.S. or World news, Business, and Sports, which helped by narrowing the search area in advance.  It also had radio and video of current news subjects.  My second-best choices would be either news.yahoo.com or news.google.com, and I would need to use shorter queries with them.  They both had nice picture clips next to summaries of top stories in different categories such as world news, entertainment, sports, technology, politics, science, and health.  News.Yahoo.com had a video and audio section of key stories.  News.Google.com listed the sources where it got the stories or where you could find more on the story.  It also let you pick the news in a different language at the bottom of the page.

 

Many of the News search engines did not return results for my long queries.  I think I would have gotten more information back with one- or two-word queries, but I was trying to compare these with the General search engines, so I kept the same queries.  I didn’t really have a worst News search engine, but there were some I would probably never use, such as Publishers List or News Directory, because they required too much input from the person before the search could be performed.


News Search Engine Table of Number of Responses per Query

 

| Search topic | CNN (www.CNN.com) | News yahoo (news.yahoo.com) | News Directory (www.newsdirectory.com) | Publishers List (www.publist.com) | Voice of America (www.voa.news.com) | News Google (news.google.com) |
|---|---|---|---|---|---|---|
| Mars rover design specifications Spirit | 3170 | One good hit from Anchorage, Alaska | No results with this query | Need cookies enabled | 40 results under topic Science & Technology | 1 of 1 hit |
| Colorado River explorer Powell | 5960 | None | No results with this query | Need cookies enabled | No relevant results but some on Colin Powell | No match |
| Bush moon habitat robots future 2005 | 263 | None | No results with this query | Need cookies enabled | No results using all of the query words | No match |
| American Idol TV show George Huff | 4200 | 53 hits because it’s a current news topic | No results with this query | Need cookies enabled | No results using all of the query words | 52 because it’s been in the news recently |
| small pox terrorist United States Maryland | 2300 | One related story about a Westminster, MD HS | No results with this query | Need cookies enabled | No results using all of the query words | No match |
| Latvia cuisine apple dessert | 179 | None | No results with this query | Need cookies enabled | No results using all of the query words | No match |

 


Education Search Engines

 

The Global School House (www.globalschoolhousenet.org) search engine and the Federal Resources For Excellence (www.educationindex.com) search engine did not have a query area but had everything listed in categories.  The Global School House looked like a good resource for teachers but is not simple enough for a younger student to browse.  The Federal Resources For Excellence search engine targets college students, with information on how to get loans, jobs, research groups to join, and tutoring help for college classes, plus a coffee shop for chatting.  I found some good education articles under this one.  I think if I were just starting out in college it would be very helpful.  The other search engines allowed you to enter a query.

 

The best two were Yahooligans and Ask Jeeves for Kids.  Yahooligans first gave good coverage of the topic searched and then gave the option to look at more related sites, which was very useful.  It also had a place to ask Earl, a reference section, and any other information that Yahooligans had on the topic.  I think Ask Jeeves for Kids (www.ajkids.com) was good because it had related questions about the topic such as “Would you like to see a movie on this, or a definition of the search word, or a timeline, or a report?”  It also listed where it got its information (the source) with a link to go there for more details.  I don’t think there was any worst engine, because the engines had different approaches to how they listed the information and how users would access it, which made it hard to compare them on equal ground.  Each of the Education search engine URLs below should be preceded by “http://www.” to get to the site.


Education Search Engine Table of Number of Responses per Query

 

| Search topic | Global School House (globalschoolhousenet.org) | Federal Resources For Excellence (educationindex.com) | Yahooligans (yahooligans.com) | Ask Jeeves for kids (ajkids.com) | Academic Info (academicinfo.net/index.html) | Education planet (educationplanet.com) | INFOMINE (ucr.edu) |
|---|---|---|---|---|---|---|---|
| Internet learning | No search entry area | No search entry area | 34 | 19 total | See note below | 9799 | See note below |
| computer learning | No search entry area | No search entry area | 13 | 31 total | See note below | 5533 | See note below |
| math games | No search entry area | No search entry area | 75 | 61 total | See note below | 2387 | See note below |
| calculus | No search entry area | No search entry area | 4 | 318 total | See note below | 98 | See note below |
| electronic learning | No search entry area | No search entry area | 3 | 18 total | See note below | 5208 | See note below |
| math tutor | No search entry area | No search entry area | 6 | 28 total | See note below | 909 | See note below |

Academic Info (all topics): Sponsored by University of Phoenix.  It had no query area but lists degrees available and has many topics to browse through.

INFOMINE (all topics): The University of CA at Riverside’s web home page.  It responds by identifying an area on campus where the information can be found.  It had 3 math game hits, calculus practice sites, actual math tutors & Learning or Computer centers.

 


Medical Search Engines

 

One of the best Medical search engines was Healthfinder at www.Healthfinder.gov/default.htm because it narrowed the query responses to only good matches.  It had a Health library to select from and a place to find healthcare organizations as well.  It listed recent health news and current health topics, which were easy to browse.  The other best search engine was Achoo at www.achoo.com/main.asp, which was also good at returning many responses to look through.  The Achoo engine called up another engine named NCBI PubMed (a widely used and respected medical search engine) and used it for the queries.  The top window gave you many choices about where to find information, such as reference sources or a Human health directory, instead of depending on searching the web with a topic.  It also listed business companies and health products directories, and it gave you several different medical search engines to pick from, such as Medline (NLM) and the Merck manual, as well as the Internet.

 

I found that the Nurse Friendly web site had a lot of choices, and it was difficult to find its web search engine link in the middle of the page on the left.  It was also case sensitive, meaning “west nile” returned no hits but “West Nile” did return some hits.  Overall, I think you need to know a specific topic to select from one of their organized menu lists rather than doing a general search to find the information.  It was hard to locate the query lines in some engines.  The Medline Nurse Friendly search engine page called up www.kfinder.com/newweb/ to look in journals, books, newspapers, encyclopedias and magazines for related articles.  It also used other web search engines to look up an answer, such as www.fastsearch.com/med and www.update-software.com/cochrane.  I don’t know how much I would use these sites since it is hard to know how relevant the results are to the query words.  It would be faster to ask a nurse or a doctor directly, and you certainly shouldn’t depend on the search results to diagnose your problems.  It is interesting to read about some of the technical terms and symptoms of certain diseases though.  If I had to pick a worst, it would be the Nurse Friendly web site, because its top page looked too confusing and had so much stuff to weed through.  Add the prefix of “http://www.” to all URLs in the table.
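The case-sensitivity problem described above, where “west nile” found nothing but “West Nile” did, can be illustrated with a short sketch.  It only shows the general idea; the function names and the sample titles are made up, and this is not how the Nurse Friendly site is actually built.

```python
def case_sensitive_hits(query, titles):
    """Match only when upper and lower case letters agree exactly."""
    return [title for title in titles if query in title]


def case_insensitive_hits(query, titles):
    """Ignore letter case on both sides before comparing."""
    folded = query.casefold()
    return [title for title in titles if folded in title.casefold()]


# Hypothetical article titles, invented for this example.
titles = ["West Nile virus facts", "Lyme disease overview"]

print(case_sensitive_hits("west nile", titles))
print(case_insensitive_hits("west nile", titles))
```

An engine that folds case before comparing would spare the user from guessing how a disease name was capitalized in its index.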

 

Medical Search Engine Table of Number of Responses per Query

 

| Search topic | Healthfinder (healthfinder.gov/default.htm) | Health Web (healthweb.org) | Achoo (achoo.com/main.asp) | MEDLINES – The Nurse Friendly (nursefriendly.com/medicine/) | UK Health Centre (healthcentre.org.uk/hc/index.html) |
|---|---|---|---|---|---|
| Small pox terrorist | 26 | 3 good hits | 6798, used questia.com | 5816 | 1 hit |
| Mad cow disease | 6 | No hits | 1521 hits | 3 but not all on topic | 30 irrelevant hits |
| Heart murmur | 183 | 49 hits | 2063 hits | 4 | 9 good hits |
| Varicose veins | 2 | No hits | 11525 hits | 17 | 2 good hits |
| West Nile virus | 3 | 23 but not all West Nile virus | 1342 hits | 8 | No hits |
| Lyme’s disease | 5 | 229 for disease, only some for Lyme’s disease | 5798 hits | 3 | 30 hits on neurological disorders but not only on Lyme’s disease |

 

The worst search engine in each category was listed above.  It is not practical to compare the Education and Medical search engines with the General, Meta, and News ones since they have different purposes.  I looked at one more search engine that others thought was not so useful, called www.WiseNut.com.  This web search engine is rather simple, but it has a nice feature in that you can take a sneak-a-peek at the hit listings.  I found some other interesting material when I followed it through, like a Wired News (www.wired.com/news/) animation page and some articles about one of the Internet’s newer search engines, named Scirus, which looks through 167 million scientific Web pages and uses linguistic analysis to rank the results.  It also looks through unpublished research and information from university and corporate websites as well as internal office communications (conference minutes with agendas and mailing list archives).  If I were to pick a worst overall search engine, I would pick the Librarian’s Index (www.lii.org) because it has a small database to look through and only indirectly accesses the hit information.  The Internet is changing so rapidly that I think the Librarian’s Index will become outdated easily, and there are better search engines out there now that go directly for the data.