Registration with Search Engines

by Anshuman Sharma

What is a Search Engine?
Search engines are one of the primary ways that Internet users find web sites.
Search engines create their listings automatically.
AltaVista, HotBot and WebCrawler are popular search engines.
Directory listings are created by humans. Yahoo and DMOZ are directories with a lot of listings.

Some search engines, such as Mamma and askjeeves.com, do not have their own database; instead they bring in the top rankings from other search engines.

Robots : For search engines on the web, the "calling up" is done by indexing software agents, appropriately called robots, spiders or crawlers. These agents are programmed to constantly crawl the Web in search of new or updated pages. They essentially go from URL to URL, aiming to visit every Web site on the Internet. The robot or spider records the full text of every page within the site it is spidering. Robots keep coming back to your site to update their databases with the changes in your web pages.
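The URL-to-URL crawl described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SITE dictionary stands in for the web (there is no network access), and the names LinkExtractor and crawl are hypothetical, not part of any real search engine.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML text (stands in for real fetches).
SITE = {
    "/index.html": '<a href="/about.html">About</a> <a href="/news.html">News</a>',
    "/about.html": '<a href="/index.html">Home</a>',
    "/news.html": '<a href="/about.html">About</a> <a href="/missing.html">Old</a>',
}


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start, pages):
    """Breadth-first walk from URL to URL; returns the text of each page found."""
    index = {}
    queue = deque([start])
    seen = {start}
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # dead link: nothing to record
            continue
        index[url] = html  # the spider records the full text of the page
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index
```

Calling crawl("/index.html", SITE) visits each reachable page once and skips the dead link, which is roughly how a spider avoids looping on pages that link to each other.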

Directories : Humans do it better because their evaluation process is much better than that of a spider. Most "tricks" that work with spiders do not work with humans. A directory will not list your URL if you do not register it with the directory. Changing your web pages has no effect on your listing. The only exception is that a good site, with good content, might be more likely to get reviewed than a poor site.

Hybrid Search Engines: Some search engines maintain an associated directory. Being included in a search engine's directory is usually a combination of luck and quality. Sometimes you can "submit" your site for review, but there is no guarantee that it will be included. Reviewers often keep an eye on sites submitted to announcement places, then choose to add those that look appealing.

Why register with a Search Engine?
The more search-engine friendly your web site is, the better your rank, and the more visitors you get.


What do the Search Engines look for?
Content determines your site rank.

The HEAD
The Title Tag :
<title>The Title</title>

The Meta Tags :
<meta HTTP-EQUIV="" content=""> and
<meta NAME="" content="">
When HTTP-EQUIV is used, the information is treated like an HTTP header, which is sent before the actual HTML file is transmitted.

Meta Description Tag
<meta name="description" content="description of web page">

Meta Keywords Tag
<meta name="keywords" content="keywords of web page">

Meta Robots Tag
<meta name="robots" content="all|none|index|noindex|follow|nofollow">
INDEX means that robots are welcome to include this page in search services. FOLLOW means that robots are welcome to follow links from this page to find other pages.
A value of "NOINDEX" therefore allows the subsidiary links to be explored even though the page itself is not indexed. A value of "NOFOLLOW" allows the page to be indexed, but no links from the page are explored (this may be useful if the page is a free entry point into pay-per-view content, for example). A value of "NONE" tells the robot to ignore the page.
e.g.,

<META NAME="ROBOTS" CONTENT="ALL">
<META NAME="ROBOTS" CONTENT="index,follow">
<META NAME="ROBOTS" CONTENT="noindex,follow">
<META NAME="ROBOTS" CONTENT="index,nofollow">
<META NAME="ROBOTS" CONTENT="noindex,nofollow">
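To make the combinations above concrete, here is a minimal sketch of how a robot might turn a ROBOTS content value into two yes/no decisions. The function name parse_robots is hypothetical, and the sketch assumes that a missing value means "index, follow".

```python
def parse_robots(content):
    """Turn a robots CONTENT value into (index, follow) booleans.

    Assumes the defaults are index and follow when a value is not stated.
    """
    index, follow = True, True
    for token in content.lower().split(","):
        token = token.strip()
        if token == "all":
            index, follow = True, True
        elif token == "none":
            index, follow = False, False
        elif token == "index":
            index = True
        elif token == "noindex":
            index = False
        elif token == "follow":
            follow = True
        elif token == "nofollow":
            follow = False
    return index, follow
```

For example, parse_robots("noindex,follow") gives (False, True): the page is left out of the index but its links are still explored.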

Copyright
Used to identify the copyright holders of the web page.
<META NAME="copyright" CONTENT="This page and its contents are copyright 1999-2000 by Anshuman Sharma. All Rights Reserved">

Revisit
Instructs proxy servers to re-cache your page after the specified time. Useful if your page content changes regularly. These meta tags do not tell search engine spiders to come back; spiders do that when they are good and ready.
<META NAME="REVISIT-AFTER" CONTENT="15 days">

Pragma
This is another way to control browser caching. To use this tag, the value must be "no-cache". When this is included in a document, it prevents Netscape Navigator from caching a page locally.
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">

Meta Refresh Tag
<meta http-equiv="refresh" content="time;URL=address">
eg., <META HTTP-EQUIV="Refresh" CONTENT="0;URL=http://www.newurl.com">
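The content value of a Refresh tag packs two things into one string: a delay in seconds and the address to load next. A small sketch of splitting it apart follows; parse_refresh is a hypothetical helper written for this illustration only.

```python
def parse_refresh(content):
    """Split a Refresh CONTENT value such as "0;URL=http://www.newurl.com".

    Returns (delay_in_seconds, url); url is None when only a delay is given.
    """
    delay_part, _, rest = content.partition(";")
    delay = int(delay_part.strip())
    rest = rest.strip()
    if rest.lower().startswith("url="):
        rest = rest[4:].strip()
    return delay, (rest or None)
```

A delay of 0 with a URL is the redirector pattern: the visitor is sent to the new address immediately.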

Expires
This tag tells the visitor's cache when the page expires, given as a date and time in GMT; after that the cache fetches a fresh copy. It can be set to "0", which is treated as already expired, if you have regularly updated content and want each visitor to be presented with the most up-to-date content.
<META HTTP-EQUIV="expires" CONTENT="0">
eg., <META HTTP-EQUIV="expires" CONTENT="Wed, 26 Feb 1997 08:21:57 GMT">
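For a dated value like the one in the example above, a cache only has to compare that date with the current time. The sketch below shows the comparison using Python's standard library; is_expired is a hypothetical name, and treating unparseable values such as "0" as already expired is an assumption based on common HTTP practice.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_expired(content, now):
    """True if an Expires CONTENT value lies in the past relative to `now`.

    Values that are not valid dates (such as "0") count as already expired.
    """
    try:
        expires = parsedate_to_datetime(content)
    except (TypeError, ValueError):
        return True
    if expires.tzinfo is None:
        # No zone information: assume the date was given in GMT.
        expires = expires.replace(tzinfo=timezone.utc)
    return expires <= now
```

Here `now` is passed in as a timezone-aware datetime so that the check is easy to repeat with a fixed date.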

Set-Cookie
<META HTTP-EQUIV="Set-Cookie" CONTENT="cookievalue=xxx;expires=Wednesday, 21-Oct-98 16:14:21 GMT; path=/">

Rating
This tag assigns a content rating to the page and can be set to general, 14 years, restricted or mature.
<META NAME="rating" CONTENT="general">

Window-target
<META HTTP-EQUIV="Window-target" CONTENT="_top">

Example
<head>
<title>Anshuman Sharma</title>
<META NAME="KEYWORDS" CONTENT="Anshuman, Sharma, NIIT, Web designer, graphics">
<META NAME="DESCRIPTION" CONTENT="Anshuman Sharma is working as a web page designer">
</head>
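To see what a spider can pick up from a head section like the one above, here is a minimal sketch using Python's built-in HTML parser. The class HeadReader and the function read_head are hypothetical names for this illustration; real search engines use their own parsers.

```python
from html.parser import HTMLParser


class HeadReader(HTMLParser):
    """Collects the <title> text and the NAME/CONTENT pairs of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # the parser lowercases tag and attribute names
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def read_head(html):
    """Return (title, meta) where meta maps lowercased names to content."""
    reader = HeadReader()
    reader.feed(html)
    return reader.title.strip(), reader.meta
```

Feeding it the example page returns the title "Anshuman Sharma" plus a dictionary with the "keywords" and "description" entries, whatever case the tags were written in.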

The Body
The web is a textually indexed medium
General content - ASCII text information
<frameset> and <noframes>
The NAME attribute of various tags, mainly the <a> tag and the <img> tag.

The ALT attribute of the <img> tag gives alternate text for the image.

<h1> tags may be used.
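Since the ALT attribute is the only text a robot can read for an image, it can help to check a page for images that lack it. The following is a rough sketch, assuming a simple page; AltChecker and images_without_alt are hypothetical names.

```python
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Records the SRC of every <img> that has no (or an empty) ALT text."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attrs = dict(attrs)
            if not attrs.get("alt"):
                self.missing.append(attrs.get("src") or "")


def images_without_alt(html):
    """Return the SRC values of images that give a robot no text to index."""
    checker = AltChecker()
    checker.feed(html)
    return checker.missing
```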


What makes a Search Engine Angry?

Spamming of search engines
Repeating keywords in META TAGS
Repeating Keywords in BODY
Redirector Page - META REFRESH TAG
Bait Page
Repeated Registration of a Web page
Tiny Text
Invisible Text
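Repeated keywords are easy to detect mechanically, which is one reason they tend to backfire. The sketch below counts repeats in a keywords value; the function name repeated_keywords and the limit of 3 are hypothetical choices for illustration, not a threshold used by any particular search engine.

```python
from collections import Counter


def repeated_keywords(content, limit=3):
    """Return keywords that appear more than `limit` times in a keywords value.

    `limit` is an arbitrary, hypothetical threshold chosen for illustration.
    """
    words = [w.strip().lower() for w in content.split(",") if w.strip()]
    counts = Counter(words)
    return {word: n for word, n in counts.items() if n > limit}
```

For instance, repeated_keywords("niit, NIIT, niit, niit, web") reports that "niit" appears 4 times, ignoring case.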

Frames and Robots
It is a bad practice to have framesets as welcome pages.
Use the <noframes> tag.
Dynamic pages annoy robots, for example when the welcome page is welcome.cgi or default.asp.


Search Engine Ranking

Directory (dmoz.org) - a simple system of alphabetical ranking in its directory.
Search engines - "relevant" and "comprehensive" content.

How does <HEAD> influence the ranking?
The Title Tag - <title>..</title>
Meta Description Tag
Meta Keywords Tag
Plural of a keyword, in lower case
Directories - the Title, Description and Keywords you submit should be the same as those on the site

How does <BODY> influence the ranking?
General content - a few lines must be straightforward and say what the site is about
LINKS - "about us" vs "NIIT-the place to be"
Image links - keywords in the place of the alt text
Image Maps - give text links, give "alt" text to each of the map
Alt text for text links

How does your URL influence your ranking?
"www.niit.com" vs "www.whoareyou.com/whereareyou/niit/"
If your URL changes, you will need to resubmit. Some search engines have Dead Link forms for you to fill out. Those that do not will drop the old URL from their records the next time they try to visit your site at the old address and are unable to find it.
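The two addresses compared above differ in how deeply the page is buried below the host name. One rough way to measure that is to count the directory levels in the path. This is only a sketch: path_depth is a hypothetical helper, and no search engine is known to use exactly this measure.

```python
from urllib.parse import urlparse


def path_depth(url):
    """Count the non-empty path segments of a URL (0 for a bare host name)."""
    if "://" not in url:
        url = "http://" + url  # the examples in the text omit the scheme
    path = urlparse(url).path
    return len([segment for segment in path.split("/") if segment])
```

With this measure "www.niit.com" has depth 0, while "www.whoareyou.com/whereareyou/niit/" has depth 2.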

How does popularity influence ranking?
"Direct Hit" for a particular keyword.
What is there in: Title, Description, URL

How does Content influence your ranking?
Comprehensive content


Search Engine Features

Here is a list of some of the main aspects of the search engines, as a ready reference.

Feature               Web Crawler   AltaVista     Infoseek      Hot Bot       Lycos         Excite
Size (approx. Urls)   < 10 Million  125 Million   60 Million    95 Million    38 Million    30 Million
Spider Class          Shallow       Deep          Shallow       Deep          Shallow       Shallow
Meta Tag support      Yes           Yes           Yes           Yes           No            Partial
Frame support         No            Yes           No            No            No            No
Image Map support     No            Yes           Yes           No            No            No
Alt Text support      No            Yes           Yes           No            Yes           No
HTML Comments         No            No            No            Yes           No            No
Url Searching         Yes           Yes           Yes           Yes           Yes           No
Embedded Directory    Yes           Yes           Yes           Yes           Yes           Yes
Submission URL        Submit        Submit        Submit        Submit        Submit        Submit

 

Crawling
Deep Crawl - Yes: All but Go. No: Go.
Instant Indexing - Yes: AltaVista (pages appear within days). No: Others.
Frames Support - Yes: AltaVista, FAST, Google, NLight. No: Excite, Inktomi, Go, Lycos.
Image Maps - Yes: AltaVista, Go, NLight. No: Excite, FAST, Google, Inktomi, Lycos.
robots.txt - Yes: All. No: n/a.
Meta Robots Tag - Yes: All but Excite. No: n/a. Notes: Google may not support (checking).
Link Popularity Helps Deep Crawl - Yes: Inktomi, Lycos. No: AltaVista, Excite, FAST, Go, NLight.
Learns Frequency - Yes: AltaVista, Go, Inktomi. No: Excite, FAST, Google, Lycos, NLight.
URL Status Check - (no entries)

Indexing
Full Body Text - Yes: All. No: n/a. Notes: Some stop words may not be indexed.
Stop Words - Yes: AltaVista, Excite, Inktomi, Lycos, Google. No: FAST, Go, NLight.
Meta Description - Yes: All but those listed under No. No: FAST, Google, Lycos, NLight.
Meta Keywords - Yes: All but those listed under No. No: Excite, FAST, Google, Lycos, NLight.
ALT text - Yes: AltaVista, Go, Google, Lycos. No: Excite, FAST, Inktomi, NLight.
Comments - Yes: Inktomi. No: Others.
Stemming - (no entries)

Ranking
Meta Tags Boost Ranking - Yes: Go, Inktomi. No: AltaVista, Excite, FAST, Google, Lycos, NLight.
Reviewed Status Boosts Ranking - Yes: Go. No: AltaVista, Excite, FAST, Google, Inktomi, Lycos, NLight.
Link Popularity Boosts Ranking - Yes: AltaVista, Excite, FAST, Google, Go, Inktomi, NLight. No: Lycos. Notes: Very important at Google.
Direct Hit Boosts Ranking - Yes: HotBot, Lycos. No: Others.

Spam
Meta Refresh - Yes: AltaVista, Go, Lycos. No: Excite, FAST, Google, Inktomi, NLight.
Invisible Text - Yes: Others. No: Excite, FAST, Google.
Tiny Text - Yes: AltaVista, Inktomi, Lycos. No: Excite, FAST, Google, Go, NLight.

Search Engine Sizes & details


KEY: GG=Google, WT=WebTop.com, AV=AltaVista, FAST=FAST, NL=Northern Light, EX=Excite, INK=Inktomi, Go=Go (Infoseek).
Sizes are as reported by each search engine and as of June 6, 2000.

 



How long does it take a Search Engine to list my site?
Many of the search engines take time to list a site. The approximate time it may take a search engine to list your site is:
1-2 weeks: Altavista, Infoseek
2-4 weeks: Excite, HotBot, Lycos, Webcrawler
6-8 weeks: Yahoo

 
