Lesson 1 -- Printer Friendly Version

INSTRUCTIONS:

IMPORTANT: This version of your lesson is for saving or printing only. All links and images have been disabled to decrease download time and help you avoid printer difficulties.


Chapter 1

The Internet has been built upon two fundamental protocols (rules) known as the Transmission Control Protocol (TCP) and the Internet Protocol (IP).

The Internet Protocol gives each machine connected to the network its own unique address. In version 4 of the protocol, that address consists of four numbers, each from 0 to 255, separated by periods (like this: 123.45.6.78). Internet users can then specify this IP address in order to route information to a particular destination.
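To make the four-number format concrete, here is a minimal sketch in Python (my choice of language for illustration; it is not part of the lesson). It uses the standard ipaddress module to check the sample address and pull out its four numbers. The helper name is_valid_address is made up for this example.

```python
import ipaddress

# The sample address from the paragraph above.
address = ipaddress.IPv4Address("123.45.6.78")

# The four numbers, one per byte of the address.
octets = list(address.packed)
print(octets)

def is_valid_address(text):
    """Return True if text is a well-formed four-number (IPv4) address."""
    try:
        ipaddress.IPv4Address(text)
        return True
    except ValueError:
        # Raised when a part is missing, non-numeric, or larger than 255.
        return False

print(is_valid_address("123.45.6.78"))
print(is_valid_address("123.45.6.300"))
```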

The Internet also offers the very popular Domain Name System (DNS), which can assign an easy-to-remember name (such as amazon.com or whitehouse.gov) to an IP address. A domain name is just a nickname--the IP address is the true address. Name servers on the Internet contain lookup tables that convert the domain names you type back to IP addresses so that the server you seek can be located.
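The lookup-table idea can be pictured with a toy sketch in Python. The names and addresses below are placeholders I made up from ranges reserved for documentation, and the resolve function is hypothetical. A real program would ask the operating system's resolver instead (for example through socket.gethostbyname) rather than keep its own table.

```python
# A toy lookup table: nickname (domain name) -> true address (IP address).
# These entries are invented placeholders, not real registrations.
NAME_TABLE = {
    "www.example.com": "192.0.2.10",
    "mail.example.com": "192.0.2.25",
}

def resolve(name):
    """Return the IP address for a domain name, or None if it is unknown."""
    # Domain names are not case-sensitive, so normalize before looking up.
    return NAME_TABLE.get(name.lower())

print(resolve("WWW.Example.com"))
print(resolve("nowhere.example.com"))
```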

The Transmission Control Protocol is a set of rules designed to ensure that the streams of data emanating from your computer are broken down into small, numbered packets. These packets then flood across the Internet, each seeking its own path, until they are reassembled, in order, at their destination.
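Here is a rough Python sketch of the break-down-and-reassemble idea. It is a simplification under my own assumptions: real TCP also handles lost packets, acknowledgments, and much more. The function names and the packet size of 8 bytes are invented for the example.

```python
import random

def to_packets(data, size):
    """Split data into (sequence_number, chunk) pairs of at most size bytes."""
    return [
        (number, data[start:start + size])
        for number, start in enumerate(range(0, len(data), size))
    ]

def reassemble(packets):
    """Put packets back in sequence-number order and join their chunks."""
    ordered = sorted(packets, key=lambda packet: packet[0])
    return b"".join(chunk for _, chunk in ordered)

message = b"Each packet is free to seek its own route."
packets = to_packets(message, 8)

# Simulate packets arriving out of order after taking different paths.
random.shuffle(packets)

print(reassemble(packets) == message)
```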

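The Internet Protocol is said to be connectionless, and both TCP and IP are said to be open.

IP is connectionless in the sense that no dedicated path is reserved between source and destination. Data packets traveling across the Internet have no predetermined path. Each data packet is free to seek the best route to its destination. (TCP does keep track of a logical connection between the two machines, but the packets that carry it still travel independently.)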

The protocols are open in the sense that no one commercial vendor holds the right to support or extend the protocols. Instead, volunteers across the Internet draft proposals called Requests for Comments (RFCs). The RFCs are then debated and accepted or rejected in an online forum at the following address:

www.cis.ohio-state.edu/hypertext/information/rfc.html

Chapter 2

Additional protocols have been layered atop TCP/IP to extend the Internet's capabilities and/or enhance the Internet experience.

The Simple Mail Transfer Protocol (SMTP) and the Post Office Protocol (POP) are sets of rules that allow you to send and receive e-mail.

The File Transfer Protocol (FTP) allows you to upload and download both text and binary files across the Internet. The Network News Transfer Protocol (NNTP) allows you to access Internet discussion areas known as the Newsgroups.

HTTP
The most exciting development in the history of the Internet came in 1991 when physicist Tim Berners-Lee implemented the Hypertext Transfer Protocol (HTTP). HTTP is designed to allow simple and swift transfers of multiple forms of media: text, images, animation, sound, and video. An HTTP session involves two computers:

1. An HTTP server, which is responsible for storing and delivering the text, images, sound, and video files; and

2. A client, which is the name we assign to any computer that requests and later receives those files from the server.

When you surf the web, your computer is the client. The machines you request information from (such as http://www.yahoo.com/ or http://www.sba.gov/) are the servers. As a web developer, you probably have an HTTP server of your very own. You are probably renting this server from a hosting service and are using it to store all of your web pages, images, and other files.

You are not alone. The Internet's vast collection of intricately interconnected HTTP servers, which number in the hundreds of thousands, is collectively known as the World Wide Web.

There are two important points you should remember about HTTP:

1. It is stateless, which means that once a server has responded to a client's request, it can drop the connection to the first client and serve another.

The server deals with each request independently; it has no memory of previous requests. This leads to a very rapid and efficient system for distributing information.

Here's how it works: the client sends a request, the server responds, the transaction is ended, and the server is free to deal with another client.

This stateless mode of communication ensures that your server is not tied up while the client's operator slowly types in another request.

2. Like TCP/IP, HTTP is open. An organization called the World Wide Web Consortium (W3C) debates proposals for changes to the protocol in forums that are open to the public at the following address:

www.w3.org/pub/WWW/
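The first point above, statelessness, can be pictured with a small Python sketch. This is only an illustration under my own assumptions: the DOCUMENTS table and the respond function are invented, and a real server does far more. The thing to notice is that the function keeps nothing between calls, so each request is answered on its own.

```python
# Hypothetical documents stored on the server, keyed by the requested path.
DOCUMENTS = {
    "/index.html": "<html><body>Home</body></html>",
    "/about.html": "<html><body>About</body></html>",
}

def respond(path):
    """Answer one request. Nothing is remembered once the function returns."""
    return DOCUMENTS.get(path, "404 Not Found")

# Three independent transactions; the server is free after each one.
first = respond("/index.html")
second = respond("/about.html")
third = respond("/index.html")

# The same request always gets the same answer, whatever came before it.
print(first == third)
```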

Chapter 3

Although the Hypertext Transfer Protocol is a relatively rapid and efficient means of distributing requested information to a client, it is a relatively static protocol.

What I mean by static is unvarying. An HTTP server waits for a document to be requested and sends it. The document it sends is always the same, regardless of who requests it or when the request is made.

The server cannot vary the appearance of the document in any way. It cannot keep track of how many times a document has been requested, who requested it, or when it was requested.

The server cannot keep track of which documents a client has already requested, nor can it build custom documents based on a client's needs.

An HTTP server has no capabilities that would allow a client to search for a particular document or manipulate a database, and it cannot accept any information from the client, other than the name of the document the client needs.

Enter CGI, which was developed to make the World Wide Web dynamic. In short, CGI allows an HTTP server to run programs which, in turn, extend the server's ability to interact with the client.

Such programs might:

-count the number of requests for a page

-collect and store information from a visitor

-allow a user to search a database

-generate custom web pages on the fly based on the user's stated preferences

-display a table or graph

-automatically send an e-mail message to each visitor

The tasks these CGI programs can perform for a server are limited only by our imagination.
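To give you a feel for what such a program produces, here is a minimal sketch in Python. A CGI program writes a header line naming the type of content, then a blank line, then the page itself, and the server relays that output to the client. The function name and the visitor number are invented for this example; a real counter would keep its count in a file or database on the server.

```python
def build_cgi_output(visitor_number):
    """Return the text a CGI program would print: header, blank line, page."""
    header = "Content-Type: text/html\n"
    blank_line = "\n"  # separates the header from the page that follows
    page = (
        "<html><body>"
        "<p>You are visitor number {}.</p>"
        "</body></html>\n"
    ).format(visitor_number)
    return header + blank_line + page

# A CGI program simply prints its output; the server sends it to the client.
print(build_cgi_output(42), end="")
```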

-What does CGI stand for-

CGI is an acronym that stands for Common Gateway Interface. Please allow me to briefly define the three words from which the acronym is derived:

Common
This word means that CGI is platform-independent. CGI programs don't care whether your client has an Apple IIc or a Macintosh, a PC running DOS, a Pentium running Windows 95 or NT, a workstation running UNIX, or a 486 running OS/2. CGI programs are common to all computer platforms and operating systems. Any client computer that can access your server is going to be able to trigger your CGI programs.

Gateway
This word means that CGI programs run through a middleman, or gateway. Your server is the gateway to your programs.

Under CGI, your clients do not have direct control over your CGI programs. All they can do is ask your server to run a program that you have stored on the server. The server injects itself between your client and your program.

Once the client's request has been accepted, all communications between the client and the server are suspended temporarily. The server runs the program independently and only returns the results to the client when it is finished.

Your client has absolutely no control over the program while it is running, and they cannot run your program on their computers. The server stands resolutely between client and program.

Interface
This word simply means that CGI programs can act as an interface (communication channel) between your clients and your server.

Chapter 4

Under CGI, the flow of information between client and server involves up to four steps:

Step One
The client sends a request to the server.

Step Two
The server parses (examines) the request. If a web page is requested, the server will send the file(s) that make up that page and drop the connection with the client.

If a CGI program is requested, the server starts and runs the program.

Step Three
If, with its request, the client passed any additional information to the server (see data passing methods below), the program processes this additional information, generates a customized response, and delivers it to the server.

Step Four
The server sends the response to the client and drops the connection with the client.
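The four steps can be sketched in Python as follows. Please treat this as a toy model built on my own assumptions, not as how any particular server is written: the handle function, the DOCUMENTS and PROGRAMS tables, and the tiny stand-in program are all invented. The one standard detail is that, for a request made with a URL, the server hands the extra information to the program in an environment variable named QUERY_STRING.

```python
import os
import subprocess
import sys

# Hypothetical web pages stored on the server.
DOCUMENTS = {"/index.html": "<html><body>Welcome</body></html>"}

# A hypothetical CGI program, kept as source text so the sketch is
# self-contained. It reports whatever additional information it was given.
NAMEGAME_SOURCE = (
    "import os\n"
    "print('Content-Type: text/plain')\n"
    "print()\n"
    "print('Received: ' + os.environ.get('QUERY_STRING', ''))\n"
)
PROGRAMS = {"/namegame.cgi": NAMEGAME_SOURCE}

def handle(request_target):
    """Step One has happened: the client's request has arrived."""
    # Step Two: parse the request into a path and any additional information.
    path, _, query = request_target.partition("?")
    if path in DOCUMENTS:
        return DOCUMENTS[path]  # a plain web page is simply sent back
    if path in PROGRAMS:
        # Step Three: run the program, passing the information along.
        environment = dict(os.environ, QUERY_STRING=query, REQUEST_METHOD="GET")
        finished = subprocess.run(
            [sys.executable, "-c", PROGRAMS[path]],
            env=environment, capture_output=True, text=True, check=True,
        )
        # Step Four: the program's output becomes the response.
        return finished.stdout
    return "404 Not Found"

print(handle("/namegame.cgi?Name=Craig"))
```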

Chapter 5

One of CGI's most exciting features is that it brings true interactivity to the web. While it is true that a client cannot influence a program once it has started running, it is possible for a client to initially exert some influence over the program's future course of action.

A client does this by passing data to your program through the server. The client must do this before the program is run. The data must be passed from client to server at the same time the request is made to run a CGI program.

There are two methods a client can use to pass data to a program through the server:

-The GET method-

The GET method is used to pass information to the program as part of the URL.

A URL, also known as a Uniform Resource Locator, is that long string of letters, numbers, periods, colons, and slashes you have to type in the little address or location box at the top of your browser. You type a URL whenever you want to request one or more files from a server. It usually looks something like this:

http://www.bigcompany.com/filename.ext

Suppose, for example, that you wanted to send your instructor's first name (Craig) to a program called namegame.cgi housed on a server named www.whoops.com so that the program could store that information in a field called Name.

A field is simply a description you assign to a type of information. If you've ever set up an Excel spreadsheet or Access database, you probably typed a descriptive name above each column of words or numbers. That descriptive name is called a field.

If you wanted to use the GET method to force a CGI program on a server to store my name in a field, you would need to fire up a web browser and type something like the following into the little 'URL' box at the top of the browser:

http://www.whoops.com/namegame.cgi?Name=Craig

Now, I realize that asking your average web user to type http://www.whoops.com/namegame.cgi?Name=Craig correctly is a lot to ask. You might want to take some of the onus off the user by hard-coding a link to the program on a web page:

<A HREF="http://www.whoops.com/namegame.cgi?Name=Craig">
Submit
</A>

If a user clicks the word 'Submit', their web browser will automatically type the desired URL into the browser's location box. With little more than the click of a mouse, the client can send the word Craig to a program named namegame.cgi on a server named www.whoops.com to be stored in a field called Name.
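To show how the pieces of that URL come apart, here is a short Python sketch that uses the standard urllib.parse module. With the GET method, the server passes everything after the question mark to the program in an environment variable named QUERY_STRING. The read_get_fields function is a made-up helper, and I hand it an ordinary dictionary in place of the real environment so the example can run anywhere.

```python
from urllib.parse import parse_qs, urlsplit

url = "http://www.whoops.com/namegame.cgi?Name=Craig"

# Split the URL into the server name, the program, and the data being passed.
parts = urlsplit(url)
server_name = parts.netloc
program_path = parts.path
query_string = parts.query

def read_get_fields(environ):
    """Return the fields passed with GET as a dict of name -> list of values."""
    return parse_qs(environ.get("QUERY_STRING", ""))

# Stand-in for the environment the server would set up for the program.
fields = read_get_fields({"QUERY_STRING": query_string})

print(server_name, program_path, fields["Name"][0])
```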

But what if the user wants to pass a name other than Craig to the program? Although this technique for passing data to the CGI program is easy for the user, all interactivity has been lost. There is no way for the user to change the value that the program will store in the Name field.

On a web page, you can use forms to pass data from a client to a CGI program with the same point and click simplicity that we had with the <A> tag, but none of the restrictions.

If you correctly use the <form> tag on your web page, the user should have every opportunity to pass whatever data he or she desires to the CGI program.

You will learn how to create forms that interact with your CGI programs in lesson 5.
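In the meantime, here is a small Python sketch of what a browser does with a form that uses the GET method: it takes whatever the user typed, encodes it, and attaches it to the URL as name=value pairs. The build_get_url function and the second sample name are invented for illustration; urlencode comes from Python's standard urllib.parse module.

```python
from urllib.parse import urlencode

PROGRAM_URL = "http://www.whoops.com/namegame.cgi"

def build_get_url(name):
    """Attach the user's entry to the program's URL as a Name=value pair."""
    # urlencode makes the value safe for a URL (a space becomes a plus sign).
    return PROGRAM_URL + "?" + urlencode({"Name": name})

print(build_get_url("Craig"))
print(build_get_url("Mary Ann"))
```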

-The POST method-

There is a big problem with the GET method of passing data from a client to a program through a server: the data travels in the URL, and many servers limit the length of the URLs they will accept. Some cannot handle URLs longer than 256 characters.

If a client using your form tries to enter too many characters, they could overwhelm the server's ability to absorb information. The server will then be unable to pass all of the data your client has typed to your program. This could cause your program to misbehave or even crash.

Don't think it won't happen: the Internet is filled with hackers, who try to crash your program intentionally; and clueless newbies, who simply don't have any idea what they are doing. Both can bring you serious grief.

Another method we can use to pass information from a client to our CGI program through a server is called the POST method. Data that reaches the server using this method is not sent in the URL. Instead, it travels in the body of the request, and the server hands it to your program through a channel known as standard input, or stdin.

The POST method is the preferred method for transferring information from a client to a CGI program because the method itself places no limit on the number of characters that can be sent.
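Here is a minimal Python sketch of how a program might collect data sent with the POST method. Under CGI, the server tells the program how many bytes to expect in an environment variable named CONTENT_LENGTH, and the program reads that many bytes from standard input. The read_post_fields function is a made-up helper, and I feed it a stand-in environment and input stream so the example runs without a server. A real program would pass os.environ and sys.stdin.buffer instead.

```python
import io
from urllib.parse import parse_qs

def read_post_fields(environ, stdin):
    """Read exactly CONTENT_LENGTH bytes from stdin and decode the fields."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = stdin.read(length).decode("utf-8")
    return parse_qs(body)

# Stand-ins for what the server would provide to the program.
posted_data = b"Name=Craig"
fake_environ = {
    "REQUEST_METHOD": "POST",
    "CONTENT_LENGTH": str(len(posted_data)),
}
fake_stdin = io.BytesIO(posted_data)

fields = read_post_fields(fake_environ, fake_stdin)
print(fields["Name"][0])
```

Reading only the number of bytes the server announces, rather than reading until the input ends, is the safer habit, because the server is not required to close the input when the data runs out.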

Chapter 6

When you feel you have a grasp on all of the concepts taught within this lesson, I would like you to take a short, multiple choice quiz. To get to the quiz, click on 'Quizzes' on the menu bar. When the form comes up, input your last name, e-mail address, and password, and make sure you have 'Quiz 1' selected. Good luck!


Copyright 1999 by ed2go.com. All rights reserved.
No reproduction or redistribution without written permission.
