[] The Internet is a many-headed beast: it enables users all over the world to communicate with each other, to play games online, to surf the Web or plug in to music and movies. It also happens to be the world's largest and most-dynamic database, containing information on virtually everything under the sun. The World Wide Web today consists of something between 200 and 250 million pages, with hundreds more springing up every day !
Now, while this is all fine and dandy, an interesting and important question arises: how does one cut through this information overload, filter out irrelevant material and identify relevant and useful data ? And the answer to this question [ because, like princesses and frogs, questions and answers *must* go together ] is a search engine, which will scour the Internet for the information you require and present it to you, neatly sorted and tabulated !
So this week, we're going to tell you a little about how these magical creatures work, and also take them for a little test-drive to see if they are really worthy of the glorious send-up we gave them two sentences ago :-)
[] First of all, let's talk a little about the various types of search engines :
Q. How many lawyers does it take to change a light bulb?
A: You won't find a lawyer who can change a light bulb.
Now, if you're looking for a lawyer to screw a light bulb...
<>---------------- i n t e r r u p t u s ------------------<>
[] Directories: These are not search engines per se, but rather databases of Web sites, classified into different categories. Human intervention is usually required for the classification part of the process. Yahoo! is an example of a directory. The human role here can sometimes result in more focused results, as compared to a spider.
[] Metasearch engines: These are engines that process queries on more than a single engine at a time. For example, a query for "Italian recipes" on a metasearch engine like Dogpile would display results from AltaVista, Infoseek, Webcrawler, Lycos and other engines. The advantage of a metasearch engine is that the user does not need to remember the specific search syntax for each and every engine he uses; all he needs to know is the syntax for the metasearch engine, and the metasearch engine automatically sends a correctly-formatted query to the others.
[] Hybrids: Many search engines also include their own directories; for example, Infoseek has its own directory of Web sites, indexed and catalogued into different categories. Such engines are called "hybrids" as they display the features of both spiders and directories.
[] Accurate search engines: The last type of engine, this one is yet to be found anywhere in cyberspace; and it seems that we will have to wait a few more years for it to evolve ;-)
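For the curious, here is a rough sketch of the metasearch idea described above: take one query, rewrite it in each engine's own syntax, and merge whatever comes back. The engine names ("engine-a" and friends) and their three query styles are made up for illustration and are not the real syntax of Dogpile, AltaVista or anyone else; the "results" are just lists you hand in, so nothing here touches the network.

```python
# Hypothetical query formatters: each pretend engine wants a different syntax.
def format_plus(terms):
    # Style A: every term is marked as required with a leading '+'
    return " ".join("+" + t for t in terms)

def format_and(terms):
    # Style B: terms are joined with the keyword AND
    return " AND ".join(terms)

def format_phrase(terms):
    # Style C: the whole query is sent as one quoted phrase
    return '"' + " ".join(terms) + '"'

# Made-up engine names mapped to the formatter each one expects.
ENGINES = {
    "engine-a": format_plus,
    "engine-b": format_and,
    "engine-c": format_phrase,
}

def build_queries(user_query):
    """Turn one user query into a correctly formatted query per engine."""
    terms = user_query.split()
    return {name: fmt(terms) for name, fmt in ENGINES.items()}

def merge_results(results_by_engine):
    """Combine per-engine URL lists, dropping duplicates, keeping first-seen order."""
    seen = set()
    merged = []
    for engine, urls in results_by_engine.items():
        for url in urls:
            if url not in seen:
                seen.add(url)
                merged.append((url, engine))
    return merged

if __name__ == "__main__":
    for engine, query in build_queries("Italian recipes").items():
        print(engine, "->", query)
```

The user only ever types "Italian recipes"; the per-engine fiddling stays hidden inside the formatters, which is the whole point of a metasearch engine.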
I have every sympathy with the American who was so
horrified by what he had
read about the effects of
smoking that he gave up reading. -Henry G. Strauss
<>------------------- u n q u o t e ----------------------<>
Search engines consist of the following 3 components:
[] The spider: The spider is a program designed to look for Web pages, follow each and every link on those pages and all subsequent links and send this information back to the engine's index. Spiders "crawl" Web sites at periodic intervals, perhaps every two or three weeks.
[] The index: The index is a database maintained by each engine, and contains the information sent back by the spider on its winding journey through the Web. This index is updated as the information on the Web changes. If a page exists on the 'Net but is not part of the index, it will not show up in a search on that particular engine.
[] The search processor: This is the software which actually converts your query into a suitable format for processing, looks it up in the index and ranks and presents the results generated in HTML format on a Web page.
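The three components above can be sketched in a few lines of Python. This is a toy under loud assumptions: the "Web" is a small made-up dictionary (TOY_WEB) rather than real pages, and the ranking rule (count how many query words a page contains) is our own simplification, not how any real engine ranks results.

```python
from collections import defaultdict, deque

# A pretend Web: page name -> (page text, outgoing links). All invented.
TOY_WEB = {
    "a.html": ("italian recipes and pasta", ["b.html", "c.html"]),
    "b.html": ("pasta sauce recipes", ["a.html"]),
    "c.html": ("digital art gallery", []),
    "orphan.html": ("secret italian recipes", []),  # no page links here
}

def spider(start):
    """The spider: follow links breadth-first from start, yielding (url, text)."""
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        text, links = TOY_WEB[url]
        yield url, text
        for link in links:
            if link in TOY_WEB and link not in seen:
                seen.add(link)
                queue.append(link)

def build_index(pages):
    """The index: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in pages:
        for word in text.split():
            index[word].add(url)
    return index

def search(index, query):
    """The search processor: rank pages by number of matching query words."""
    scores = defaultdict(int)
    for word in query.lower().split():
        for url in index.get(word, ()):
            scores[url] += 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda u: (-scores[u], u))

index = build_index(spider("a.html"))
print(search(index, "italian recipes"))  # ['a.html', 'b.html']
print(search(index, "secret"))           # []
```

Note that orphan.html exists on our pretend Web, but since the spider never finds a link to it, it never reaches the index and never shows up in a search.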
By contrast, the procedure at a directory is much simpler: Web sites are either submitted to the directory owner, or the administrators themselves review sites and classify them appropriately in the database. This lack of automation has both advantages and disadvantages: on the upside, queries are likely to return far better results, but at the same time, it takes much longer to get your site entered in the database.
Hacker intrusions into government computers
are detected
only 10% of the time
<>--------------------- s t a t s ------------------------<>
[] No search engine can index the entire World Wide Web. Show us an engine which claims to do that, and we'll show you a liar! ;-)
[] Every search engine has its own peculiar syntax, so you need to be familiar with the syntax of your favourite search engine. Metasearch engines eliminate the need for this, and many of them even have natural-language search capabilities, which makes things much simpler !
In our travels across cyberspace, we have come to rely on AltaVista, Infoseek, Lycos, Yahoo! and Dogpile for all our needs. A search carried out across all these five engines invariably gives us what we need; and, in particular, we like AltaVista, which has to be one of the most powerful and flexible engines in existence !
monadnock (muh-NAD-nok) noun
A mountain or rocky mass that has resisted erosion
and stands isolated in an essentially level area.
<>----------------------- p l a y -------------------------<>
[] Search engines:
AltaVista [ http://www.altavista.com ]
Lycos [ http://www.lycos.com ]
Hotbot [ http://www.hotbot.com ]
Infoseek [ http://www.infoseek.com ]
Webcrawler [ http://www.webcrawler.com ]
Excite [ http://www.excite.com ]
[] Directories:
Yahoo! [ http://www.yahoo.com ]
Netcenter [ http://www.netscape.com ]
Infoseek [ http://www.infoseek.com ]
Lycos [ http://www.lycos.com ]
[] Metasearch engines:
Dogpile [ http://www.dogpile.com ]
Metacrawler [ http://www.metacrawler.com ]
Ask Jeeves! [ http://www.askjeeves.com ]
[] And now for our test:
We suddenly needed to find information on something called DQPSK or differential quadrature phase-shift keying, a modulation scheme used in cellular technology. So we conducted a search on AltaVista, Infoseek, Yahoo!, Hotbot and Lycos for the terms "DQPSK digital communications FAQ".
And then we thought we'd see if we could find some good-looking digital art on the Net...so we looked for "abstract digital artwork royalty free". And we measured relevance on a scale of 1 to 5, 1 being the least relevant.
We found that AltaVista and Infoseek provided us with fairly good links in both cases, while the rest failed quite miserably ;-)
Engine    | Digital art (hits) | DQPSK (hits) | Relevance score (1-5)
AltaVista | 345,321            | 170,364      | 4
Infoseek  | 22,294,939         | 9,627,580    | 4
Yahoo!    | 586,660            | 134,564      | 2
Hotbot    | 179                | 9            | 2
Lycos     | 98                 | 2            | 1
[] In case you're curious and would like to know a little more about how search engines work, we can recommend a great site at http://www.searchenginewatch.com. Search Engine Watch contains a large amount of information useful to both the webmaster and the novice about how search engines rank pages, how best to promote your site and much more !
[] And if you have voyeuristic tendencies, then you *must* visit Web Voyeur [ http://www.webcrawler.com/SearchTicker.html ], a site which allows you to watch what others are searching for online. We guarantee that some of the queries you will see will leave you scratching your head in amazement :-)
Hating people is like burning down your own house to get rid of a rat. - Henry Emerson Fosdick
<>------------------- u n q u o t e ----------------------<>
The search term was quite simple; the three letters "sex"...and the results quite remarkable !
AltaVista found 14,303,380 matches
Infoseek found 3,773,881 matches
Yahoo got 234 matches
Hotbot found 1,729,284 matches
Webcrawler found 35,201 matches
and Excite, 584,942 !
[] And while we're quite sure that this says something important about the human race, we're not too sure just what it means...:-)
Till next time, stay healthy !
- V & H
A helium-filled balloon is tied to the floor of a car that makes a sharp right turn. Does the balloon tilt while the turn is made? If so, which way? The windows are closed so there is no connection with the outside air.
Write in and tell us what *you* think the solution is. We will provide you with a correct solution in our next issue, together with the names of everyone who got it right. When you write in, do tell us how much information you would like us to disclose about you.
And the solution to last time's puzzle :
Almost all of it. Tie the ropes together. Climb up one of them. Tie a loop in it as close as possible to the ceiling. Cut it below the loop. Run the rope through the loop and tie it to your waist. Climb the other rope (this may involve some swinging action). Pull the rope going through the loop tight and cut the other rope as close as possible to the ceiling. You will swing down on the rope through the loop. Lower yourself to the ground by letting out rope. Pull the rope through the loop. You will have nearly all the rope.
----------> L S D f o r t h e b r a i n <--------------
You can subscribe to (or unsubscribe from) this newsletter in one of two ways :
The preferred method is to send us email at hitg@usa.net, asking us to add/remove you to/from the list.
Alternatively, you could use our website at http://www.psynet.net/hitg or http://geocities.datacellar.net/SoHo/Museum/3621. If the server is busy, don't lose heart...it probably isn't because of your looks :-)
If you decide not to subscribe, you will not hear from us again...and what a loss that would be, wouldn't it ?!
We know our website sucks. If you have some ideas as to how we could improve it, we want to hear them ! Send us some mail.
Comments, flames and suggestions are welcome !
Our primary email address for all communication is hitg@usa.net. However, on rainy days, it may not work :-)...in such a situation, you could also contact us at hitg@psynet.net
Archives of previous issues are available at http://geocities.datacellar.net/SoHo/Museum/3621/archive.html
The content in this newsletter is copyright Vikram Vaswani and Harish Kamath, collectively known as HITG, Inc. !
All trademarks are the property of their respective owners.
For WordPlay, we thank http://www.wordsmith.org
For netStats, we thank http://www.emergeonline.com
Logic puzzles, quotes, and other goodies are freely available off the Internet.
© 1998 HITG Inc.
A seminar on Time Travel will be held two weeks ago.