Module 3: Finding and Evaluating Information on the World Wide Web

Tips for Evaluating Search Engine Hit Lists

Search engines provide a fast and easy--perhaps too easy--way to find materials on the web. But no search engine is able to evaluate and make judgments about the materials it finds in the same way that human users can.

The following questions and discussions can help you to understand what you see in a search engine hit list and to make some critical judgments from what you see. The questions are arranged roughly from general to specific: the first ones apply to the whole hit list or to groups of items, and the later ones must be applied to each item individually.

Why are these items listed in the hit list?

If all the words or phrases in your search query appear in the Title of a web document--and, in particular, if they appear there exactly as you typed them--then that document is very likely to be listed.

Unfortunately, the Title of a web document is not necessarily what most users expect it to be. For instance, the Title of this document--that is, the Title that a search engine would "see" if it looked at this document--is "CAD Center for Academic Development | Al Akhawayn University," and nearly all of the documents on the SSK 1203 web site (and throughout the CAD web site) have this same Title. This Title appears across the top of the browser window when users go to this web page, but it does not appear on the document itself. Also, this Title is set by the author/creator of the web document (in the title element of the page's HTML code, a "hidden" area that is separate from the text displayed on the page), and it can affect whether a document shows up in a hit list.
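
To make the difference between the "hidden" Title and the visible heading concrete, here is a minimal Python sketch of how a program might read the Title out of a page's HTML. The sample page is a hypothetical, simplified stand-in for this document's code, and real search engines are certainly more elaborate than this.

```python
from html.parser import HTMLParser


class TitleFinder(HTMLParser):
    """Collects the text found inside the page's <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        # Only keep text that sits between <title> and </title>.
        if self.in_title:
            self.title += data


def extract_title(html):
    """Return the Title a program would 'see' for this HTML text."""
    finder = TitleFinder()
    finder.feed(html)
    return finder.title.strip()


# Hypothetical, simplified page: the Title and the visible heading differ.
SAMPLE_PAGE = """<html>
<head><title>CAD Center for Academic Development | Al Akhawayn University</title></head>
<body><h1>Tips for Evaluating Search Engine Hit Lists</h1></body>
</html>"""

print(extract_title(SAMPLE_PAGE))
```

In this sketch the program reports the Title from the page's head section, not the heading a reader sees on the page, which is why the two can disagree in a hit list.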

Documents may also be listed because the words or phrases in your search query appear somewhere else within them. Here are just a few places where search engines may find search query words or phrases. You need to judge how each of these locations affects the usefulness of an item:

  • Actual title (as it appears on the document itself), subtitle, heading, subheading of a document
  • First paragraph or introduction of a document
  • Anywhere in the body of a document
  • Last paragraph or conclusion of a document
  • Meta tags (like the Title, these are "hidden" areas of a document where the document's creator can store key words for search engines to find)
  • Links to other web documents (which may or may not be on the same web site)
  • The document's URL

Also, depending on the way in which you have phrased your query, the search engine may list documents that have only one, or some, of the words or phrases you have included. And, of course, these words or phrases may be in very different places--and widely separated--within the document.
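
The difference between matching all of the query words and matching only some of them can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name, the sample documents, and the word-splitting rule are all hypothetical, and real search engines interpret queries in more sophisticated ways.

```python
import re


def matches(document_text, query_words, require_all=True):
    """Check whether a document contains all (or any) of the query words."""
    # Split the text into lowercase words, ignoring punctuation.
    words_in_doc = set(re.findall(r"[a-z0-9']+", document_text.lower()))
    found = [word.lower() in words_in_doc for word in query_words]
    return all(found) if require_all else any(found)


# Hypothetical documents and query, for illustration only.
documents = {
    "doc_a": "Search engines rank web pages.",
    "doc_b": "Engines for cars and trucks.",
}
query = ["search", "engines"]

for name, text in documents.items():
    print(name,
          "all words:", matches(text, query, require_all=True),
          "any word:", matches(text, query, require_all=False))
```

Here doc_b contains only one of the two query words, so it is listed under "any word" matching but not under "all words" matching. Note that the sketch pays no attention to where in the document the words occur or how far apart they are.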

Finally, remember that search engines cannot distinguish among the different meanings of words that are spelled the same. For example, "right" may refer to the opposite of left, a political stance, a legal entitlement, something correct, and so forth. Usually, only a human user can make this distinction from the context in which the word appears. Read carefully to decide.

NB: If you cannot immediately tell from the hit list why an item is there, that does not mean it will be either useful or useless to you. You will need to investigate further before you are sure.

Why are these items listed in this order?

The order in which items appear in a search engine hit list depends on how that search engine ranks items or establishes their relevance. In general, the closer an item comes to accurately matching the search query you put in, the higher the item will be in the hit list.

However, some search engines also use other factors in their ranking. For instance, some search engines note how many other web pages link to a specific document, and use this in calculating their rankings. (The logic is that, if many pages link to a specific document, it must be of higher value than a document that is not linked to.)
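
The link-counting idea can be sketched as follows. This is a deliberately simplified illustration, not the method of any particular search engine: the page names and the link table are hypothetical, and actual ranking formulas combine many more factors.

```python
from collections import Counter


def inbound_counts(links):
    """Count how many different pages link to each page."""
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):      # count each linking page once
            if target != source:         # ignore a page linking to itself
                counts[target] += 1
    return counts


def rank_by_inbound(candidates, links):
    """Order candidate pages so the most-linked-to pages come first."""
    counts = inbound_counts(links)
    return sorted(candidates, key=lambda page: counts[page], reverse=True)


# Hypothetical link table: each page maps to the pages it links to.
LINKS = {
    "page_a": ["page_c", "page_b"],
    "page_b": ["page_c"],
    "page_d": ["page_c", "page_b"],
}

print(rank_by_inbound(["page_a", "page_b", "page_c"], LINKS))
```

In this sample, page_c is linked to by three pages, page_b by two, and page_a by none, so page_c is ranked first. A high link count says that other authors pointed to a page; it does not by itself say that the page is accurate or useful to you.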

Finally, some search engines return results that include "sponsored links." These are links to materials or sites that may be useful. However, the owners of the linked materials have usually paid to get their materials listed--either in a higher location or even a special window--in the hit list.

NB: If you cannot immediately tell from the hit list why an item is listed in the order in which it appears, that does not mean it will be either useful or useless to you. You will need to investigate further before you are sure.

What alternative words, phrases, or ways of searching do these items suggest?

Look carefully at hit lists to find synonyms, related terms, more specific--or more general--terms or phrases that you could search for.

Also, in some cases, the best way to identify words or phrases that you want to avoid or exclude in a search is by analyzing a hit list. These may be words or phrases that occur with one meaning of a word--a meaning that you are not searching for.

What does the linked text tell me?

Most search engine results pages identify individual items with some text, which may be the document's title, that creates a hyperlink to the document. Remember: This linked text may be a descriptive title of the document's contents, or it may not be.

What does the "snippet" tell me?

Most hit lists give a short piece of text (called a snippet) from the document. The snippet may come from anywhere within the document, and it may tell you a great deal or nothing at all. Here are just a few situations to consider in making your judgment based upon a snippet.

If the search terms or phrases appear in the linked text (the document's title), but they do not appear anywhere in the document itself, the snippet you see might be a part or all of the first sentence (something beginning with a capital letter and ending with a period) of the document. The search engine's logic is that this sentence is likely to tell users something about the content of the document. Only a user can judge if this is true, and if the document is helpful.

If the search terms or phrases appear in the linked text (the document's title) and they appear elsewhere in the document itself, the snippet you see might be a part--or parts--of one or more sentences where the search terms are found. The search engine's logic is that showing users the words in their context will help them determine if the document is worthwhile.

If the search terms or phrases appear in the linked text and also appear, in exactly the same form, in several other locations throughout the document, then the snippet may show you several examples of this "exact match" so that you will be aware of the frequency of the matches. (This situation may not, however, increase the ranking of this item in the hit list! And it may simply show you that something you are not interested in appears many times in the document!)

Finally, remember that a snippet can only show a very small context. Before you can decide that a document is exactly what you are looking for, you will need to examine it in more detail.
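
The snippet situations described above can be imitated with a small Python sketch. It assumes a very simple rule (show the text around the first match, or fall back to the first sentence when there is no match); the function name, the sample text, and the 30-character window are hypothetical choices, and real search engines build snippets in their own ways.

```python
import re


def make_snippet(body, term, width=30):
    """Return a short piece of the body text to display in a hit list."""
    position = body.lower().find(term.lower())
    if position == -1:
        # The term is not in the body: fall back to the first sentence.
        return re.split(r"(?<=[.!?])\s+", body.strip(), maxsplit=1)[0]
    # Otherwise show the term with a little context on each side.
    start = max(0, position - width)
    end = min(len(body), position + len(term) + width)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(body) else ""
    return prefix + body[start:end].strip() + suffix


# Hypothetical document body, for illustration only.
BODY = ("Search engines are fast. "
        "They cannot judge quality the way a human reader can.")

print(make_snippet(BODY, "quality"))   # term found: shown in context
print(make_snippet(BODY, "zebra"))     # term not found: first sentence
```

Either way, the snippet shows only a small window of text, which is why the document itself still has to be examined before you decide it is what you need.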

What does the URL tell me?

The URL or address of a web document can reveal some information about the source of the document, which, in turn, can help users make some predictions about the usefulness of the document. But you must know how to read a URL to get this information. Here is an explanation, based on the URL for this web document:

http://mail.alakhawayn.ma/~A.Cads/1203/READINGS/M3READ/M3_hit_lists.htm

Element: http://
What it is: Type of transfer protocol; that is, the way the document is "shipped" to users.
What you may predict from it: Not much; most documents are retrieved using http:// or ftp://, so there is not much to tell from this.

Element: mail.alakhawayn.ma
What it is: Domain name of the location where the document is stored. Domain names are read from right to left, with the right-most part being the top-level domain.
What you may predict from it: The top-level domain, .ma, is the country code for Morocco, which suggests that this document is stored in Morocco. (Some top-level domain names indicate the type of activity that the owner of the domain engages in, or the physical location, or both.) The alakhawayn indicates it is at AUI. (Users must know, however, that AUI "owns" the domain name alakhawayn.ma as well as the domain name aui.ma to predict this.) The mail indicates it is on a server named "mail," and that there are likely to be other servers at AUI with other names. (Yes, there are others, but not all of them contain web documents.)

Element: ~A.Cads
What it is: Username of the document owner; that is, the userid of the person/group that controls the space on the server in which this document is stored. (Usernames on the mail.alakhawayn.ma server contain a "first initial" followed by a period and a "last name." In this case, the user is not an individual, but the username was created to fit these rules for usernames.)
What you may predict from it: The document resides within a web site "owned" by a user named A.Cads who is granted space on this server. Only some URLs reveal this information, but the appearance of the tilde (~) is a very strong hint that what follows identifies an individual user on a server. In this case, the "user" is actually the Center for Academic Development.

Element: /1203/READINGS/M3READ/M3_hit_lists.htm
What it is: Path to the specific document.
What you may predict from it: The document is stored inside a series of folders in the CAD area of the server. The file extension .htm (or .html) indicates a document saved in hypertext markup language--the standard for web documents. (An experienced user of the web would correctly predict that there might be many other documents on this site because it has so many folders and subfolders to organize them. But the number of documents does not necessarily give any hints about usefulness.)
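
The same breakdown can be done mechanically. The following Python sketch uses the standard library's urllib.parse module to split this document's URL into the elements explained above. The function name and the names of the result fields are hypothetical, and treating a leading tilde (~) as a username is the convention described above, not a rule that every server follows.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the elements a reader would examine by eye."""
    parts = urlsplit(url)
    host_labels = parts.hostname.split(".")
    segments = [s for s in parts.path.split("/") if s]

    # Assumption: a first path segment starting with "~" names a user.
    owner = None
    if segments and segments[0].startswith("~"):
        owner = segments[0][1:]
        segments = segments[1:]

    return {
        "protocol": parts.scheme,
        "domain_name": parts.hostname,
        "top_level_domain": host_labels[-1],   # right-most part
        "server_name": host_labels[0],         # left-most part
        "owner": owner,
        "folders": segments[:-1],
        "file": segments[-1] if segments else None,
    }


URL = "http://mail.alakhawayn.ma/~A.Cads/1203/READINGS/M3READ/M3_hit_lists.htm"

for element, value in describe_url(URL).items():
    print(element, "=", value)
```

The program can only separate the pieces; deciding what each piece suggests about the source, such as knowing that alakhawayn.ma belongs to AUI, still depends on the reader's own knowledge.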

Even very skilled and experienced users, those who have spent hundreds of hours working with search engines and web sites, may not be able to accurately predict the usefulness of any given document by taking apart its URL. However, this information does help them to create their own human ranking of the results delivered by a search engine.