SEARCHING THE INTERNET



You do a search.
You get 60,000 matches.
Sites 1-20 are shown.
Are those search sites really helping?




Contents:
1. Ignore the "Experts"
2. Use the Best Tool
3. Searching Newsgroups
4. Using Yahoo
5. Fun With Spiders
6. General Searching Tips
7. Customize Your Search Forms

- Deja News Search
- 3-in-one Search
"So, you want to find something on Internet. Join the club."

PLEASE NOTE: I will be updating this page when I get the opportunity. Google is currently the best search engine for the web by a wide margin. Also, Google recently purchased Deja News and the Deja search interface no longer functions. When I figure out what is going to happen with that, I will update the search forms and this page. Sorry for any inconvenience.

But you also know that there is more information available out there than you can possibly handle, and most of it is just junk. Of course, there are tons of companies out there trying to make millions by offering you sites that will search all the sites on the Web and hopefully give you something useful back. But do they really help you? Is this really the best way to find the information you're looking for?

Maybe not. I do a lot of searching on Internet: for specific answers, for general web sites, for people, for software updates, and just about anything else I might want to know. Over the last 5 years, I've found what works and what doesn't, and how to get what I want as quickly as possible. Hopefully I can help you a bit and turn your hours of 'browsing' into results. And there will be no sales pitches.

1. IGNORE THE "EXPERTS"

If you read the trade magazines, or read all the articles available on Internet, you probably think that the best way to find what you want is to load up excite.com, or webcrawler.com, or lycos.com, or altavista.digital.com, or all of the above. Or use some fancy search engine that tries to combine those results to give you even more to look through.

This usually doesn't make sense. Don't waste your time wading through 5,000 results to find one you really want. If all else fails, use these sites. But at least know how to get what you want, which is covered below. The people telling you that these are the best ways to find information are probably just as frustrated as you are.

2. USE THE BEST TOOL

The first step to finding what you need is to know what you're looking for and which tool helps find that information. There are three basic types of search tools on Internet: discussion group archives, human-sorted lists, and automated web spiders.

Discussion Group Archives
The advantage that 'discussion groups' in one form or another have is that they are people talking about a specific topic, asking specific questions, answering those questions, and giving a variety of opinions. Usually, when you have a question you aren't the first to ask it. It's probably been asked many, many times before. And probably answered at least once. So, take advantage of that fact and search old Usenet postings. For those who don't know, Usenet is the biggest compilation of 'discussion groups' in the world. The groups are distributed throughout the Internet and are accessible to most people, whether they know it or not. There are separate groups for almost every topic you can imagine, and the discussions have been going on for many years.

Why search usenet?

  • It's specific: There are newsgroups dedicated to specific topics. By limiting your search to a specific set of newsgroups, you can guarantee that your search will only return results that are likely to be of interest to you.
  • It's interactive: It consists of people talking back and forth. One person asks a specific question, and another person answers that specific question. And you can join in on the discussions as well.
  • It's big: A lot of information is contained on usenet. Much more, IMO, than on the web. There are thousands of posts a day in thousands of newsgroups by thousands of people. The content changes daily and it's always up-to-date. No fancy graphics or interactive pages - just a lot of raw information.

What Usenet will usually not get you is general overviews or in-depth coverage of a certain topic. That is, if you're looking for anything and everything about sea turtles, searching usenet will probably not find you a nice, long document explaining it all. But there is a good chance that someone has made a post asking where to find a good page like that, and someone else has posted the URL of a great site. So you can get the site you're looking for easily, since someone told you exactly where to look. And if you want a specific question answered, chances are better that you'll find it on usenet than anywhere else.

Human-sorted Lists
An example of a human-sorted list is Yahoo - one of the most popular sites on Internet. While these sites still just give you a list of other sites out on Internet that may or may not have what you're looking for, the chances are better that you'll find something useful and find it fast.

Because every site in Yahoo is viewed by a human and put into the specific categories where it belongs, you won't usually find 10,000 matches that have nothing to do with what you're looking for. The biggest advantage of a site like Yahoo, however, is that the listings have been divided into hundreds of categories in a hierarchical format, so you can start with a wide topic and narrow it down to sites that are in the area you are looking for.

Use Yahoo when you are looking for a specific company, or want to find a few good pages about a specific topic that is likely to be in their listing. If you're looking for web pages that have general information or summarize a particular topic area, it's your best bet. But, because it does not index by the actual words on the pages themselves, your searches are limited to the descriptions of the items entered in their database.

Automated Web Spiders
These are the big guys. AltaVista. Lycos. Webcrawler. Excite - just to name a few. They start at a page, grab all the text, and store it in their databases. They then follow all the links on that page to more pages, then those links, etc. Eventually they find their way around most of the Internet by following every link they can find.
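The crawl described above can be sketched in a few lines of Javascript. This is only a toy illustration, not how any of the sites named above actually work: the "web" is a small in-memory object, and every URL and page in it is made up.

```javascript
// A toy "web": each URL maps to its page text and outgoing links.
// All URLs and contents are invented for illustration.
const toyWeb = {
  "a.example/home": { text: "Welcome to my home page", links: ["a.example/turtles", "b.example/"] },
  "a.example/turtles": { text: "Sea turtles nest on beaches", links: ["a.example/home"] },
  "b.example/": { text: "Elvis fan club", links: ["c.example/missing"] },
};

// Breadth-first crawl: store each page's text, then queue its links.
function crawl(web, startUrl) {
  const store = new Map();          // url -> text (the spider's "database")
  const queue = [startUrl];
  const seen = new Set([startUrl]); // never queue the same page twice
  while (queue.length > 0) {
    const url = queue.shift();
    const page = web[url];
    if (!page) continue;            // dead link: nothing to store
    store.set(url, page.text);
    for (const link of page.links) {
      if (!seen.has(link)) {
        seen.add(link);
        queue.push(link);
      }
    }
  }
  return store;
}

const crawled = crawl(toyWeb, "a.example/home");
console.log([...crawled.keys()]);
```

The 'seen' set is what keeps the spider from looping forever when two pages link to each other, as the home page and the turtle page do here.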

The advantage that spiders have is that they index by words on the pages themselves. If I want to find every reference on Internet to sea turtles, a spider will give me the best results (although the list will probably be quite overwhelming). But I may also get pages about topics that have nothing to do with what I'm looking for, but that happen to have the words 'sea' and 'turtles' in there somewhere.
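To see why word indexing pulls in unrelated pages, here is a rough sketch of a word index in Javascript. The page names and text are invented, and real spiders are certainly more elaborate than this; the point is only that a page matches as soon as each word appears somewhere in it.

```javascript
// Hypothetical pages; the second one only mentions the words in passing.
const pages = {
  "turtles.html": "Sea turtles are reptiles that live in the sea",
  "vacation.html": "We swam in the sea and the kids ate chocolate turtles",
  "elvis.html": "Elvis lives",
};

// Build a map of word -> set of page names containing that word.
function buildIndex(docs) {
  const index = new Map();
  for (const [name, text] of Object.entries(docs)) {
    for (const word of text.toLowerCase().match(/[a-z0-9]+/g) || []) {
      if (!index.has(word)) index.set(word, new Set());
      index.get(word).add(name);
    }
  }
  return index;
}

// Return the pages that contain every query word, anywhere in the text.
function searchAll(index, words) {
  const sets = words.map((w) => index.get(w.toLowerCase()) || new Set());
  if (sets.length === 0) return [];
  return [...sets[0]].filter((name) => sets.every((s) => s.has(name))).sort();
}

console.log(searchAll(buildIndex(pages), ["sea", "turtles"]));
// -> [ 'turtles.html', 'vacation.html' ]
```

The vacation page has nothing useful to say about sea turtles, but it contains both words, so it matches just the same.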

Although spiders cover the most ground and can help you find a 'needle in a haystack' on Internet, they are usually not as helpful because it's harder to find sites that are specific to what you're looking for. Spiders fall prey to 'tricks' that put a person's web page higher on search results when they insert lots of keywords, for example. An automated spider isn't very good at knowing whether a page actually has content or if it's just a fake page to attract search results.

Use spiders when other methods have failed, or if you're interested in a topic and just want to find lots of pages where it might be mentioned. But be prepared to spend a lot of time looking at pages that do not help you at all.

3. SEARCHING NEWSGROUPS

Since Usenet is perhaps the biggest compilation of opinions and answers on Internet, it's fortunate that a few sites have been archiving these messages for a few years - and you can search them, of course. The only one that deserves mention is Deja News because it is clearly superior. It provides a lot of options, and lets you narrow down the search enough that you can get the posts you are looking for and the answers when they are there.

Usenet is often the most helpful in finding specific sites as well. If you want to find a web site with the best midi sound files, Deja News will help the most. Why? Because you're not the only one who is looking for a site like that. Other people have asked, and people have definitely answered, giving the URLs to the sites they think are the absolute best. Why search hundreds of web pages through a spider for the best one, when there are people who have already answered your question exactly?

Typing in a simple keyword to search Usenet may bring back hundreds or thousands of posts - not much more helpful than web spiders. So you need to narrow down your search to get what you're looking for. The first way to do this is to search on the subject line.

The subject line should summarize the post, or at least give a hint to what it's about. You can safely assume that a post with 'turtle' in its subject is a lot more likely to have real information on turtles than a post with the word 'turtle' somewhere in its text. So, the subject line helps to weed out which posts are really about a topic and which ones might not be. Try to insert at least one keyword in the subject search to get only very relevant posts first.
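The difference between a subject search and a full-text search is easy to show on a handful of made-up posts. This Javascript sketch is just an illustration of the idea, not of how Deja News does it.

```javascript
// Made-up posts for illustration only.
const posts = [
  { subject: "Turtle nesting season?", body: "When do loggerheads nest?" },
  { subject: "Trip report: Cozumel", body: "Saw a turtle on the second dive." },
  { subject: "Re: Turtle nesting season?", body: "Check the FAQ." },
];

// Keep posts whose subject contains the keyword (case-insensitive).
function subjectMatches(list, keyword) {
  const k = keyword.toLowerCase();
  return list.filter((p) => p.subject.toLowerCase().includes(k));
}

// Keep posts where the keyword appears in the subject or the body.
function anywhereMatches(list, keyword) {
  const k = keyword.toLowerCase();
  return list.filter((p) => (p.subject + " " + p.body).toLowerCase().includes(k));
}

console.log(subjectMatches(posts, "turtle").length);  // -> 2
console.log(anywhereMatches(posts, "turtle").length); // -> 3
```

The trip report drops out of the subject search, which is what you want: it mentions a turtle but is not about turtles.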

The best way to just get posts about the topic you are looking for, though, is to only search the appropriate newsgroups. Since there are groups devoted to almost every topic, finding the right one (or set of newsgroups) will help narrow your search very quickly. Fortunately, Deja News has a feature that allows you to put in some keywords and find which newsgroups have the most posts mentioning those words. If you aren't sure where people talk about 'sea turtles', a quick search on Deja News will tell you rec.scuba has a lot to say about it.
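The idea behind that feature, as I understand it, is simply to count matching posts per newsgroup and sort the totals. Here is a small Javascript sketch over an invented archive; the posts are made up and the matching is a crude substring test, so treat it as an illustration only.

```javascript
// Invented archive entries: which group a post was in, and its text.
const archive = [
  { group: "rec.scuba", text: "Saw two sea turtles at the reef" },
  { group: "rec.scuba", text: "Best sites for sea turtles?" },
  { group: "rec.pets", text: "My turtles love lettuce" },
  { group: "alt.elvis", text: "Elvis sighting at sea" },
];

// Count, per newsgroup, the posts that mention every keyword,
// and return [group, count] pairs with the busiest group first.
function rankGroups(messages, keywords) {
  const wanted = keywords.map((k) => k.toLowerCase());
  const counts = new Map();
  for (const { group, text } of messages) {
    const lower = text.toLowerCase();
    // Simple substring test; a real index would match whole words.
    if (wanted.every((k) => lower.includes(k))) {
      counts.set(group, (counts.get(group) || 0) + 1);
    }
  }
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

console.log(rankGroups(archive, ["sea", "turtles"]));
// -> [ [ 'rec.scuba', 2 ] ]
```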

Searching Deja News can be a bit complicated, though. The syntax takes a while to get used to, and there are plenty of options that you usually don't need to use. Since I do many searches on Deja News every day, I've created a custom search form that has my favorite options built in and makes building a query much easier. I cover this a little more below.

So, lesson #1 is: search usenet first. Learn to use Deja News to find specific answers and you'll save yourself a lot of time.

4. USING YAHOO

Using Yahoo is easy, and most people have done it many times before. There isn't a whole lot to learn.

I only have two suggestions when searching Yahoo:

  • Make it general: Since sites on Yahoo do not have extensive descriptions, be pretty general when searching so you can find the categories that match closest. Then go to those categories and work your way down.
  • Search sub-categories: When you know you're looking for something about the WWW, don't search all of Yahoo. Instead, go into the WWW sub-category and search from there. Enter your words in the form at the top of the WWW page, and be sure to click on the 'Search only in the World Wide Web' button to limit your search to only sites listed below your current position. This will help narrow your search quite a bit.
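Here is why searching from a sub-category narrows things down so much, sketched in Javascript. The category tree and site descriptions below are invented and are not Yahoo's real listings; the sketch only shows that a search started lower in the tree cannot see anything outside that branch.

```javascript
// A tiny invented directory: categories hold site descriptions and sub-categories.
const directory = {
  name: "Top",
  sites: [],
  children: [
    {
      name: "Computers",
      sites: [],
      children: [
        { name: "World Wide Web", sites: ["HTML tutorial", "Search tips"], children: [] },
      ],
    },
    { name: "Science", sites: ["Search for sea turtles"], children: [] },
  ],
};

// Find a category by name anywhere in the tree (null if absent).
function findCategory(node, name) {
  if (node.name === name) return node;
  for (const child of node.children) {
    const hit = findCategory(child, name);
    if (hit) return hit;
  }
  return null;
}

// Search site descriptions at this category and everything below it.
function searchBelow(node, keyword) {
  const k = keyword.toLowerCase();
  const here = node.sites.filter((s) => s.toLowerCase().includes(k));
  return here.concat(...node.children.map((c) => searchBelow(c, keyword)));
}

console.log(searchBelow(directory, "search").length);                               // -> 2
console.log(searchBelow(findCategory(directory, "World Wide Web"), "search").length); // -> 1
```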

Searching Yahoo shouldn't take long, which makes it an easy place to start. If you don't find what you need, move on.

5. FUN WITH SPIDERS

If you're ready to accept the challenge of playing with spiders, you need to pick a search engine you're comfortable with. I've been using Alta Vista since it first emerged, although I rarely find a need for spider searches anymore. There may be better search sites available, but I've been happy with Alta Vista.

When searching with a spider, start with a very specific search. Just like the subject line of a Usenet post helps identify which posts are really serious about a topic, the title of a web page can act in the same way. If a web page is really about sea turtles, it should at least have 'turtle' in its title. So only look for those pages to begin with. Alta Vista lets you do this by using a command like TITLE:"turtle". Learn to use this command and cut your search results down by hundreds or thousands.

In addition to using the TITLE tag, begin with a search of at least 4 or 5 keywords, and require that they all match. Since there are millions of pages on Internet, it's best to find one that fits your needs exactly if it is available. If you come up with too few matches, take out a keyword and search again to get some more options.
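The 'start with every keyword, then drop one at a time' routine can be written down as a small loop. This Javascript sketch runs against three made-up page texts rather than a real search engine, and it assumes you list your keywords with the most important first so that the last one is the one to drop.

```javascript
// Made-up page texts standing in for a search engine's index.
const pageTexts = [
  "sea turtle nesting beach florida guide",
  "sea turtle nesting season",
  "sea turtle facts",
];

// Return the texts that contain every keyword as a whole word.
function matchAll(texts, keywords) {
  return texts.filter((text) => {
    const words = new Set(text.toLowerCase().split(/\s+/));
    return keywords.every((k) => words.has(k.toLowerCase()));
  });
}

// Require all keywords; if there are too few hits, drop the last
// (least important) keyword and try again.
function searchWithRelaxation(texts, keywords, minResults) {
  const remaining = [...keywords];
  while (remaining.length > 1) {
    const hits = matchAll(texts, remaining);
    if (hits.length >= minResults) return { keywords: remaining, hits };
    remaining.pop();
  }
  return { keywords: remaining, hits: matchAll(texts, remaining) };
}

const relaxed = searchWithRelaxation(
  pageTexts,
  ["sea", "turtle", "nesting", "florida", "volunteer"],
  2
);
console.log(relaxed.keywords.join(" ")); // -> sea turtle nesting
```

Five keywords find nothing and four find only one page, so the loop settles on three keywords and two pages.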

6. GENERAL SEARCHING TIPS

There are some searching tips that can be helpful no matter how you search or what you're looking for:

Choose Good Keywords
This is the most important part. It's not enough to just know what you're searching for - you need to know how people publish their information. If you're searching for information on Windows 95, entering 'windows 95' into a search engine won't be nearly as helpful as entering 'win95'. You need to know how people write, which is usually obvious but sometimes not. Be observant when you read Usenet or visit web sites and notice the words you find used most often, and which ones really help identify the specific topic.

If you're searching for a web site through Deja News, it's usually best to put "(web|url|site)" in your search, since this will limit results to posts where people are probably talking about specific web sites. Using "http*" will often find a web site as well, but usually the matches are in people's signatures at the bottom of their posts, so it is not very helpful. Try not to match keywords that are common in signatures.

Be sure to enter different ways to find the same information. I was recently looking for a piece of equipment whose official name was "AVP-2030". So I did a search for "(avp2030|avp-2030|2030)" and found lots of matches - some from each keyword. Especially on model numbers and types of equipment, there are a lot of different ways to write the same thing, and using only one form will limit your results considerably.
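If you search for model numbers a lot, you can let a small function write out the variants for you. This Javascript sketch is one possible approach, and the function name is my own invention for this example; it produces the lowercase name with and without hyphens, plus the digits alone, joined in the "(a|b|c)" style shown above.

```javascript
// Build an alternation group of common spellings of a model name.
// modelVariants is a hypothetical helper, not part of any search site.
function modelVariants(model) {
  const lower = model.toLowerCase();                    // e.g. "avp-2030"
  const noHyphen = lower.replace(/-/g, "");             // e.g. "avp2030"
  const digits = (lower.match(/\d+/g) || []).join("");  // e.g. "2030"
  // Drop empty strings and duplicates, keeping the order above.
  const unique = [...new Set([noHyphen, lower, digits].filter(Boolean))];
  return "(" + unique.join("|") + ")";
}

console.log(modelVariants("AVP-2030")); // -> (avp2030|avp-2030|2030)
```

Be careful with the digits-only variant on short numbers, since something like "95" will match far more than you want.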

Use wildcards when they are available. If you're looking for information about Windows 95 or NT, you could search for "win*" and that would match win95, winNT, windows95, windowsnt, windows, etc. This type of search in combination with a specific newsgroup will help narrow your search to something about windows.
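To get a feel for what a wildcard will and won't match, you can try the pattern on a word list first. The Javascript below is a sketch that treats "*" as 'any run of characters' and ignores case; that is an assumption on my part, and each search site may define its wildcards a little differently.

```javascript
// Convert a simple wildcard pattern ("*" = any characters) to a RegExp.
function wildcardToRegExp(pattern) {
  // Escape every regex metacharacter except "*".
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$", "i");
}

const sampleWords = ["win95", "winNT", "windows95", "windowsnt", "windows", "winter", "darwin"];
const winPattern = wildcardToRegExp("win*");
console.log(sampleWords.filter((w) => winPattern.test(w)));
// -> [ 'win95', 'winNT', 'windows95', 'windowsnt', 'windows', 'winter' ]
```

Notice that 'winter' sneaks in. That is exactly why the wildcard works best together with a specific newsgroup.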

If you want to find a really great site about a topic, try putting "best" in your search. Most people ask things like "What is the best site about xyz?" and the replies you get are often high-quality sites.

If you want to find a lot of information about a topic, look for the FAQ on the specific newsgroup. Just put "FAQ" in your subject line and you'll find some pointers to the Frequently Asked Questions on that topic. This may answer the questions you have now, plus the questions you'll have in the future.

Start Specific
When you start looking, go for exactly what you're looking for. If you find it, you're lucky. If you don't, just widen your search a bit by taking out the least-important keywords and seeing which new sites pop up. If you find that your specific search still has too many hits, narrow it down a bit further. One suggestion for Alta Vista is to use "AND NOT" to take out sites that are not what you're looking for. If lots of your matches are about Elvis and you really want Sea Turtles, put "AND NOT elvis" into your search string to take them out. Knowing what you don't want is as important as knowing what you do want.
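Here is a small Javascript sketch of the 'what I want AND NOT what I don't want' idea. The first function just glues keywords into a query string in the style described above; the exact grammar a given search engine accepts is an assumption here, so check its help page. The second function applies the same rule to a piece of text so you can see the effect. Both function names are made up for this example.

```javascript
// Join required words with AND and tack on AND NOT for each excluded word.
function buildQuery(required, excluded) {
  const must = required.join(" AND ");
  const mustNot = excluded.map((w) => "AND NOT " + w).join(" ");
  return [must, mustNot].filter(Boolean).join(" ");
}

// True if the text has every required word and none of the excluded ones.
function passesFilter(text, required, excluded) {
  const words = new Set(text.toLowerCase().match(/[a-z0-9]+/g) || []);
  return (
    required.every((w) => words.has(w.toLowerCase())) &&
    !excluded.some((w) => words.has(w.toLowerCase()))
  );
}

console.log(buildQuery(["sea", "turtles"], ["elvis"]));
// -> sea AND turtles AND NOT elvis
console.log(passesFilter("Elvis loved sea turtles", ["sea", "turtles"], ["elvis"])); // -> false
```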

Know What is Possible
Before doing a search to find out what the mating habits of the African mule are, ask yourself if this is a question that is likely to be answered on Internet. Perhaps people are talking about that on Usenet - if they are, you are lucky. But searching for a web page to answer the question will probably not give good results. After all, who would want to spend their time making a page describing that? Is there really an audience for it? Is there anyone really excited about it? I hope not.

People may refer to the Internet as the "Information Superhighway" and people may think the information available is limitless, but it may be disappointing. The Internet still cannot replace the Encyclopedia or other sources of in-depth information, so don't be disappointed if you don't find your answer (and don't spend hours trying to find it).

Use all the Tools Available
Use various search engines, and know the benefits of each one. Pick the tool that fits the job. Find all the answers you can from one site, then try another to see if there is anything different. Don't give up because you didn't find what you wanted at Alta Vista. Try a few other sites to see if they've found something Alta Vista didn't.

7. CUSTOMIZE YOUR SEARCH FORMS

One thing I believe in strongly is to automate anything that is repetitive. If you do a lot of searches and find yourself entering the same information each time, or restricting the search in the same way, you're wasting time. You need to create a search form that works well for you and automates the things you normally do manually.

Every search engine that I know of allows you to create a form on your own site or your own hard drive that will submit its information to the search engine. Maybe you want the boxes arranged differently, or don't like the graphics, or whatever. The point is that you can control exactly what interface you have to the search engine, and if it's stored locally you don't have to wait for it to load across Internet each time you want to use it.
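What a local form really does is send your keywords, plus whatever options you've fixed, to the search engine's address. The Javascript sketch below shows that idea with the standard URL object. The address 'search.example' and the option names 'results' and 'format' are placeholders I made up; a real search engine has its own address and field names, which you can find by looking at the source of its search page.

```javascript
// Options I never change, baked in so I don't have to retype them.
// These names and values are hypothetical.
const FIXED_OPTIONS = { results: "20", format: "compact" };

// Combine the fields typed by the user with the fixed options
// and return the full address the form would submit to.
function buildSearchUrl(endpoint, userFields) {
  const url = new URL(endpoint);
  // Fixed options are applied last so they always win.
  for (const [key, value] of Object.entries({ ...userFields, ...FIXED_OPTIONS })) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

console.log(buildSearchUrl("https://search.example/query", { q: "sea turtles" }));
```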

I've created two forms that help me dramatically when it comes to search speed. The first is specific to Deja News, and the second is 3 search forms in one - Alta Vista, Deja News, and Yahoo.

Form 1
For my Deja News search form I took the options that I never change and turned them into hidden variables whose values could not be changed. Then I don't have to worry about them or see them on the screen. I also wrote Javascript functions that actually create the complex query for me. I don't need to worry about using ~s, ~a, ~g, etc. anymore. I just type in what I want and the form generates the query.
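A query-building function of this kind might look roughly like the Javascript below. This is a simplified sketch, not the code from my form: it assumes ~s limits the subject, ~a the author, and ~g the newsgroup, and it simply strings the pieces together with spaces. The field names are made up for the example, and the exact syntax the service accepted may have differed.

```javascript
// Build a query string from separate form fields.
// Tag meanings (~s subject, ~a author, ~g group) are assumptions.
function buildDejaQuery({ keywords = "", subject = "", author = "", group = "" }) {
  const parts = [];
  if (keywords.trim()) parts.push(keywords.trim());
  if (subject.trim()) parts.push("~s " + subject.trim());
  if (author.trim()) parts.push("~a " + author.trim());
  if (group.trim()) parts.push("~g " + group.trim());
  return parts.join(" ");
}

console.log(buildDejaQuery({ keywords: "sea turtles", subject: "nesting", group: "rec.scuba" }));
// -> sea turtles ~s nesting ~g rec.scuba
```

Empty boxes are skipped, so the query only contains the tags you actually filled in.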

It also stores the names of the last 10 newsgroups I've specifically told Deja News to search. Since I often find myself searching the same group or groups, now all I need to do is click it from the list and it gets added to the query. If I search a new group, it gets added to the list. These are stored in cookies, so each time I load the page they are ready for me.
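The 'last 10 newsgroups' list is a most-recently-used list capped at ten entries. Here is a sketch of the bookkeeping in Javascript, kept separate from the browser so it can be tried anywhere. The cookie name 'recentGroups' and the comma-separated format are choices made for this example, not necessarily what my form uses.

```javascript
const MAX_GROUPS = 10;

// Put the group at the front, remove any older copy, keep at most ten.
function rememberGroup(list, group) {
  const without = list.filter((g) => g !== group);
  return [group, ...without].slice(0, MAX_GROUPS);
}

// Turn the list into a "name=value" cookie string.
function toCookie(list) {
  return "recentGroups=" + encodeURIComponent(list.join(","));
}

// Read the list back out of a cookie string such as document.cookie.
function fromCookie(cookieText) {
  const match = cookieText.match(/(?:^|;\s*)recentGroups=([^;]*)/);
  if (!match || match[1] === "") return [];
  return decodeURIComponent(match[1]).split(",");
}

let recent = [];
recent = rememberGroup(recent, "rec.scuba");
recent = rememberGroup(recent, "rec.pets");
recent = rememberGroup(recent, "rec.scuba"); // searching it again moves it to the front
console.log(recent); // -> [ 'rec.scuba', 'rec.pets' ]
```

In a browser you would assign the result of toCookie to document.cookie (usually with an expiry date added) and pass document.cookie to fromCookie when the page loads.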

Finally, I have a 'browse' button for the newsgroups. If I'm not sure which group talks about what I'm looking for, I can just browse and submit the special query to Deja News to get my answer. Very simple.

I put the Deja News command syntax at the bottom, as well as an option to 'preview' the query that will be generated and sent for people who want to use the form but aren't familiar with how it works. I've also created Javascript alerts that pop up when each tag is clicked, with an explanation of what that entry box does and how it is used.

This form has increased my search speed considerably. I have it stored on my local hard drive and it is set to my 'home' button, so I can bring it up quickly and do a search with no wait time.

Form 2
The second form is the 3-in-one form that combines three search engines on one page and also removes the things that I never change. I also store this on my local hard drive for fast access.

Between those two pages, I've found that I very rarely need to go to a site on Internet to do a search. I have everything I need at my fingertips, and the forms fit my needs exactly. I've taken more control of the searches by organizing the forms how I actually use them.

Feel free to use my forms as an example for your own, or copy them to your own hard drive to use yourself. All I ask is that you give credit to me as the author of the Deja News form, since it took a number of hours to create and I think it's rather unique. Of course, you can take the same approach to any search engine on Internet - embed your favorite options, create a new look, and simplify to fit your needs. You'll find that it increases your productivity quite a bit.

THAT'S ALL FOLKS

Well, that's all there is to it. Go out and find what you need. Create your own search forms. Have fun!




April, 1997
Copyright 1997, Matt Kruse <matt@mattkruse.com>
This document may not be reproduced in any way without the permission of the author