The syntax of the CREATE statements can range from the extremely simple to the complex, depending on the options your database management system (DBMS) supports and how detailed you want to be when building a database.
NOTE: The examples used today were generated using Personal Oracle7. Please see the documentation for your specific SQL implementation for any minor differences in syntax.
Through personal experience, we have learned the importance of creating a good SQL install script. This script file contains the necessary SQL code to completely rebuild a database or databases; the script often includes database objects such as indexes, stored procedures, and triggers. You will see the value of this script during development as you continually make changes to the underlying database and on occasion want to completely rebuild the database with all the latest changes. Using the graphical tools each time you need to perform a rebuild can become extremely time-consuming. In addition, knowing the SQL syntax for this procedure enables you to apply your knowledge to other database systems.
The syntax for the typical CREATE DATABASE statement looks like this:
CREATE DATABASE database_nameBecause the syntax varies so widely from system to system, we will not expand on the CREATE DATABASE statement's syntax. Many systems do not even support an SQL CREATE DATABASE command. However, all the popular, more powerful, relational database management systems (RDBMSs) do provide it. Instead of focusing on its syntax, we will spend some time discussing the options to consider when creating a database.
Most RDBMSs also allow you to specify a default database size, usually in terms of hard disk space (such as megabytes). You will need to understand how your database system stores and locates data on the disk to accurately estimate the size you need. The responsibility for managing this space falls primarily to system administrators, and possibly at your location a database administrator will build you a test database.
Don't let the CREATE DATABASE statement intimidate you. At its simplest, you can create a database named PAYMENTS with the following statement:
SQL> CREATE DATABASE PAYMENTS;
NOTE: Again, be sure to consult your database management system's documentation to learn the specifics of building a database, as the CREATE DATABASE statement can and does vary for the different implementations. Each implementation also has some unique options.
Normalization is the process of breaking your data into separate components to reduce the repetition of data. Each level of normalization reduces the repetition of data. Normalizing your data can be an extremely complex process, and numerous database design tools enable you to plan this process in a logical fashion.
Many factors can influence the design of your database, including the following:
Often the opposite problem can occur. You may have sought to completely normalize your tables' design with the database and in doing so created many tables. Although you may have approached database-design nirvana, any query operations done against this database may take a very long time to execute. Databases designed in this manner are sometimes difficult to maintain because the table structure might obscure the designer's intent. This problem underlines the importance of always documenting your code or design so that others can come in after you (or work with you) and have some idea of what you were thinking at the time you created your database structure. In database designer's terms, this documentation is known as a data dictionary.
NOTE: Most of the major RDBMS packages come with either the data dic-tionary installed or scripts to install it.
Assume you have three tables: BILLS, BANK_ACCOUNTS, and COMPANY. Table 9.1 shows the format of these three tables.
Bills | Bank_Accounts | Company |
NAME, CHAR(30) | ACCOUNT_ID, NUMBER | NAME, CHAR(30) |
AMOUNT, NUMBER | TYPE, CHAR(30) | ADDRESS, CHAR(50) |
ACCOUNT_ID, NUMBER | BALANCE, NUMBER | CITY, CHAR(20) |
BANK, CHAR(30) | STATE, CHAR(2) |
The primary key in the BILLS table is the NAME field. This field should not be duplicated because you have only one bill with this amount. (In reality, you would probably have a check number or a date to make this record truly unique, but assume for now that the NAME field works.) The ACCOUNT_ID field in the BANK_ACCOUNTS table is the primary key for that table. The NAME field is the primary key for the COMPANY table.
The foreign keys in this example are probably easy to spot. The ACCOUNT_ID field in the BILLS table joins the BILLS table with the BANK_ACCOUNTS table. The NAME field in the BILLS table joins the BILLS table with the COMPANY table. If this were a full-fledged database design, you would have many more tables and data breakdowns. For instance, the BANK field in the BANK_ACCOUNTS table could point to a BANK table containing bank information such as addresses and phone numbers. The COMPANY table could be linked with another table (or database for that matter) containing information about the company and its products.
Column Names | Comments |
NAME, CHAR(30) | Name of company that bill is owed to |
AMOUNT, NUMBER | Amount of bill in dollars |
ACCOUNT_ID, NUMBER | Bank account number of bill (linked to BANK_ACCOUNTS table) |
ADDRESS, CHAR(30) | Address of company that bill is owed to |
CITY, CHAR(15) | City of company that bill is owed to |
STATE, CHAR(2) | State of company that bill is owed to |
Before you actually fill these tables with data, you will need to know how to create a table.
CREATE TABLE table_name ( field1 datatype [ NOT NULL ], field2 datatype [ NOT NULL ], field3 datatype [ NOT NULL ]...)A simple example of a CREATE TABLE statement follows.
SQL> CREATE TABLE BILLS ( 2 NAME CHAR(30), 3 AMOUNT NUMBER, 4 ACCOUNT_ID NUMBER); Table created.
The following section examines components of the CREATE TABLE command.
NOTE: You can have duplicate table names as long as the owner or schema is different. Table names in the same schema must be unique.
Data Type | Comments |
CHAR | Alphanumeric data with a length between 1 and 255 characters. Spaces are padded to the right of the value to supplement the total allocated length of the column. |
DATE | Included as part of the date are century, year, month, day, hour, minute, and second. |
LONG | Variable-length alphanumeric strings up to 2 gigabytes. (See the following note.) |
LONG RAW | Binary data up to 2 gigabytes. (See the following note.) |
NUMBER | Numeric 0, positive or negative fixed or floating-point data. |
RAW | Binary data up to 255 bytes. |
ROWID | Hexadecimal string representing the unique address of a row in a table. (See the following note.) |
VARCHAR2 | Alphanumeric data that is variable length; this field must be between 1 and 2,000 characters long. |
NOTE: The LONG data type is often called a MEMO data type in other database management systems. It is primarily used to store large amounts of text for retrieval at some later time.The LONG RAW data type is often called a binary large object (BLOB) in other database management systems. It is typically used to store graphics, sound, or video data. Although relational database management systems were not originally designed to serve this type of data, many multimedia systems today store their data in LONG RAW, or BLOB, fields.
The ROWID field type is used to give each record within your table a unique, nonduplicating value. Many other database systems support this concept with a COUNTER field (Microsoft Access) or an IDENTITY field (SQL Server).
NOTE: Check your implementation for supported data types as they may vary.
When building a table, most database systems enable you to denote a column with the NOT NULL keywords. NOT NULL means the column cannot contain any NULL values for any records in the table. Conversely, NOT NULL means that every record must have an actual value in this column. The following example illustrates the use of the NOT NULL keywords.
SQL> CREATE TABLE BILLS ( 2 NAME CHAR(30) NOT NULL, 3 AMOUNT NUMBER, 4 ACCOUNT_ID NOT NULL);
The first statement in the next example inserts a valid record containing data for a bill to be sent to Joe's Computer Service for $25.
SQL> INSERT INTO BILLS VALUES("Joe's Computer Service", 25, 1); 1 row inserted.
SQL> INSERT INTO BILLS VALUES("", 25000, 1); 1 row inserted.
A good rule of thumb is that the primary key field and all foreign key fields should never contain NULL values.
You should notice several things when choosing a key field. As we mentioned, Oracle provides a ROWID field that is incremented for each row that is added, which makes this field by default always a unique key. ROWID fields make excellent key fields for several reasons. First, it is much faster to join on an integer value than on an 80-character string. Such joins result in smaller database sizes over time if you store an integer value in every primary and foreign key as opposed to a long CHAR value. Another advantage is that you can use ROWID fields to see how a table is organized. Also, using CHAR values leaves you open to a number of data entry problems. For instance, what would happen if one person entered 111 First Street, another entered 111 1st Street, and yet another entered 111 First St.? With today's graphical user environments, the correct string could be entered into a list box. When a user makes a selection from the list box, the code would convert this string to a unique ID and save this ID to the database.
Now you can create the tables you used earlier today. You will use these tables for the rest of today, so you will want to fill them with some data. Use the INSERT command covered yesterday to load the tables with the data in Tables 9.3, 9.4, and 9.5.
SQL> create database PAYMENTS; Statement processed. SQL> create table BILLS ( 2 NAME CHAR(30) NOT NULL, 3 AMOUNT NUMBER, 4 ACCOUNT_ID NUMBER NOT NULL); Table created. SQL> create table BANK_ACCOUNTS ( 2 ACCOUNT_ID NUMBER NOT NULL, 3 TYPE CHAR(30), 4 BALANCE NUMBER, 5 BANK CHAR(30)); Table created. SQL> create table COMPANY ( 2 NAME CHAR(30) NOT NULL, 3 ADDRESS CHAR(50), 4 CITY CHAR(30), 5 STATE CHAR(2)); Table created.
Name | Amount | Account_ID |
Phone Company | 125 | 1 |
Power Company | 75 | 1 |
Record Club | 25 | 2 |
Software Company | 250 | 1 |
Cable TV Company | 35 | 3 |
Account_ID | Type | Balance | Band |
1 | Checking | 500 | First Federal |
2 | Money Market | 1200 | First Investor's |
3 | Checking | 90 | Credit Union |
Name | Address | City | State |
Phone Company | 111 1st Street | Atlanta | GA |
Power Company | 222 2nd Street | Jacksonville | FL |
Record Club | 333 3rd Avenue | Los Angeles | CA |
Software Company | 444 4th Drive | San Francisco | CA |
Cable TV Company | 555 5th Drive | Austin | TX |
SQL> CREATE TABLE TABLENAME 2 (COLUMN1 CHAR NOT NULL, 3 COLUMN2 NUMBER, 4 COLUMN3 DATE) 5 TABLESPACE TABLESPACE NAME 6 STORAGE 7 INITIAL SIZE, 8 NEXT SIZE, 9 MINEXTENTS value, 10 MAXEXTENTS value, 11 PCTINCREASE value);
Table created.
CREATE TABLE NEW_TABLE(FIELD1, FIELD2, FIELD3) AS (SELECT FIELD1, FIELD2, FIELD3 FROM OLD_TABLE <WHERE...>This syntax allows you to create a new table with the same data types as those of the fields that are selected from the old table. It also allows you to rename the fields in the new table by giving them new names.
SQL> CREATE TABLE NEW_BILLS(NAME, AMOUNT, ACCOUNT_ID) 2 AS (SELECT * FROM BILLS WHERE AMOUNT < 50); Table created.
Some database systems also allow you to use the following syntax:
INSERT NEW_TABLE SELECT <field1, field2... | *> from OLD_TABLE <WHERE...>The preceding syntax would create a new table with the exact field structure and data found in the old table. Using SQL Server's Transact-SQL language in the following example illustrates this technique.
INSERT NEW_BILLS 1> select * from BILLS where AMOUNT < 50 2> go(The GO statement in SQL Server processes the SQL statements in the command buffer. It is equivalent to the semicolon (;) used in Oracle7.)
The ALTER TABLE command enables you to do two things:
ALTER TABLE table_name <ADD column_name data_type; | MODIFY column_name data_type;>The following command changes the NAME field of the BILLS table to hold 40 characters:
SQL> ALTER TABLE BILLS 2 MODIFY NAME CHAR(40); Table altered.
Here's a statement to add a new column to the NEW_BILLS table:
NOTE: You can increase or decrease the length of columns; however, you can not decrease a column's length if the current size of one of its values is greater than the value you want to assign to the column length.
SQL> ALTER TABLE NEW_BILLS 2 ADD COMMENTS CHAR(80); Table altered.
Several restrictions apply to using the ALTER TABLE statement. You cannot use it to add or delete fields from a database. It can change a column from NOT NULL to NULL, but not necessarily the other way around. A column specification can be changed from NULL to NOT NULL only if the column does not contain any NULL values. To change a column from NOT NULL to NULL, use the following syntax:
ALTER TABLE table_name MODIFY (column_name data_type NULL);To change a column from NULL to NOT NULL, you might have to take several steps:
2. Deal with any NULL values that you find. (Delete those records, update the column's value, and so on.)
3. Issue the ALTER TABLE command.
NOTE: Some database management systems allow the use of the MODIFY clause; others do not. Still others have added other clauses to the ALTER TABLE statement. In Oracle, you can even alter the table's storage parameters. Check the documentation of the system you are using to determine the implementation of the ALTER TABLE statement.
DROP TABLE table_name;Here's how to drop the NEW_BILLS table:
SQL> DROP TABLE NEW_BILLS; Table dropped.
WARNING: If you issueSQL> DROP TABLE NEW_BILLS;you could be dropping the incorrect table. When dropping tables, you should always use the owner or schema name. The recommended syntax isSQL> DROP TABLE OWNER.NEW_BILLS;We are stressing this syntax because we once had to repair a production database from which the wrong table had been dropped. The table was not properly identified with the schema name. Restoring the database was an eight-hour job, and we had to work until well past midnight.
DROP DATABASE database_nameDon't drop the BILLS database now because you will use it for the rest of today, as well as on Day 10.
NOTE: The various relational database implementations require you to take diff-erent steps to drop a database. After the database is dropped, you will need to clean up the operating system files that compose the database.
A data dictionary is one of the most important pieces of documentation you can create when designing a database. This dictionary should include a complete description of all objects in the database: tables, fields, views, indexes, stored procedures, triggers, and so forth. A complete data dictionary also contains a brief comment explaining the purpose behind each item in the database. You should update the data dictionary whenever you make changes to the database.
Before using any of the data manipulation statements, it is also important to create a good database design. Break down the required information into logical groups and try to identify a primary key field that other groups (or tables) can use to reference this logical group. Use foreign key fields to point to the primary or foreign key fields in other tables.
You learned that the CREATE DATABASE statement is not a standard element within database systems. This variation is primarily due to the many different ways vendors store their databases on disk. Each implementation enables a different set of features and options, which results in a completely different CREATE DATABASE statement. Simply issuing CREATE DATABASE database_name creates a default database with a default size on most systems. The DROP DATABASE statement permanently removes that database.
The CREATE TABLE statement creates a new table. With this command, you can create the fields you need and identify their data types. Some database management systems also allow you to specify other attributes for the field, such as whether it can allow NULL values or whether that field should be unique throughout the table. The ALTER TABLE statement can alter the structure of an existing table. The DROP TABLE statement can delete a table from a database.
A CREATE DATABASE varies because the actual process of creating a database varies from one database system to another. Small PC-based databases usually rely on files that are created within some type of application program. To distribute the database on a large server, related database files are simply distributed over several disk drives. When your code accesses these databases, there is no database process running on the computer, just your application accessing the files directly. More powerful database systems must take into account disk space management as well as support features such as security, transaction control, and stored procedures embedded within the database itself. When your application program accesses a database, a database server manages your requests (along with many others' requests) and returns data to you through a sometimes complex layer of middleware. These topics are discussed more in Week 3. For now, learn all you can about how your particular database management system creates and manages databases.
Q Can I create a table temporarily and then automatically drop it when I am done with it?
A Yes. Many database management systems support the concept of a temporary table. This type of table is created for temporary usage and is automatically deleted when your user's process ends or when you issue the DROP TABLE command. The use of temporary tables is discussed on Day 14, "Dynamic Uses of SQL."
Q Can I remove columns with the ALTER TABLE statement?
A No. The ALTER TABLE command can be used only to add or modify columns within a table. To remove columns, create a new table with the desired format and then select the records from the old table into the new table.
2. True or False: The DROP TABLE command is functionally equivalent to the DELETE FROM <table_name> command.
3. True or False: To add a new table to a database, use the CREATE TABLE command.
4. What is wrong with the following statement?
CREATE TABLE new_table ( ID NUMBER, FIELD1 char(40), FIELD2 char(80), ID char(40);
ALTER DATABASE BILLS ( COMPANY char(80));
7. If data in a character column has varying lengths, what is the best choice for the data type?
8. Can you have duplicate table names?
2. With the five tables that you have created--BILLS, BANK_ACCOUNTS, COMPANY, BANK, and ACCOUNT_TYPE--change the table structure so that instead of using CHAR fields as keys, you use integer ID fields as keys.
3. Using your knowledge of SQL joins (see Day 6, "Joining Tables"), write several queries to join the tables in the BILLS database.