Dean's Den

A chapter I wrote on "System Software" in 1984

In 1984 I wrote a chapter called "System Software" for a new
Prentice-Hall textbook entitled Using Computers.
All of the contributors were told to use examples referring to
a fictitious "Mr. Williams" and "Ms. Jones."

Mainframe System Software

Mr. Williams found learning about system software to be
a rather trying experience, but he was much better off than
his sister-in-law, Ms. Jones. System software in the
mainframe world is much more complicated than in the
microcomputer world. Fortunately, Ms. Jones needed little or
no expert knowledge of it in order to deal successfully with
her data processing department. Many layers of support
personnel acted as a buffer between her and this aspect of
the computer system.
The reasons for this complexity are various. When a
computer is running, many things are going on. At the center
is the control program or supervisor. This program, which is
packaged with other standard programs by the computer
manufacturer to make up an "operating system", has direct
control of the computer. It decides which other programs to
start and stop. These other programs include general purpose
programs that might be useful to anybody using the computer,
as well as programs that apply to a particular business or
situation. These latter are termed "application programs" to
distinguish them from "system programs", and are usually
developed specifically for the company that has purchased
the computer system.
Application programs may themselves run programs that
are part of the operating system. Every time a program reads
information out of a disk file, it must run a special system
program called an "access method". If it communicates with a
user through a terminal it will use telecommunications
software that probably comes with the system.
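
By way of analogy, the short C program below (the file name
is invented for illustration) reads a file through the
system-supplied library rather than driving the disk
directly, much as a mainframe program reads a disk file
through an access method supplied with the operating system.

    #include <stdio.h>

    int main(void)
    {
        char record[81];
        /* the system-supplied library, not our program, handles
           the actual device I/O, much as an access method would */
        FILE *f = fopen("payroll.dat", "r");   /* hypothetical file */
        if (f == NULL)
            return 1;
        while (fgets(record, sizeof record, f))
            fputs(record, stdout);
        fclose(f);
        return 0;
    }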

Multi-Programming, Multi-Processing

Probably the biggest complicating factor in system
software is simply the large number of computing tasks going
on at the same time. If a program running in a microcomputer
is like a researcher reading a book in a library, the
program mix churning in a mainframe resembles the floor of
the New York Stock Exchange.
All computers require a control program, utilities, and
application software. What makes the job of a mainframe
operating system more complex is that there are always many
programs going on at the same time. The computer doesn't
perform separate operations simultaneously, but interleaves
the instructions from many programs in the most efficient
way. It is like a one-man band played with one hand, but so
rapidly that it sounds like a symphony. The capability to
run more than one program at the same time in this way is
called multi-programming.
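
A minimal C sketch of the idea, with invented instruction
counts and slice length: three imaginary programs take short
turns, and all of them make progress "at the same time".

    #include <stdio.h>

    #define NPROGS 3
    #define SLICE  2   /* instructions per turn (hypothetical) */

    int main(void)
    {
        int remaining[NPROGS] = { 5, 3, 7 };  /* work left per program */
        int busy = NPROGS;

        while (busy > 0) {
            for (int p = 0; p < NPROGS; p++) {
                if (remaining[p] == 0)
                    continue;              /* this program has finished */
                /* give program p a short time slice, then move on */
                for (int i = 0; i < SLICE && remaining[p] > 0; i++)
                    if (--remaining[p] == 0)
                        busy--;
                printf("program %d has %d instructions left\n",
                       p, remaining[p]);
            }
        }
        return 0;
    }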
Another technique for increasing the work flow is
called multi-processing. "Processing" here refers to the
fact that a computer consists of a central processor
connected to main storage. In large mainframe systems there
is often so much main storage that most of it lies idle most
of the time. By attaching a second "co-processor", the
computer can simultaneously run different programs in the
same main storage as long as they don't interfere with one
another.
Mainframes must support a large number of users
simultaneously, either through multi-programming,
multi-processing or both. Many of these users communicate
and share data with other users in the system. Many require
special peripherals. Some may communicate with other
computer systems over telecommunications lines. All of these
functions require expensive hardware and complicated system
software.
Monitoring the use of these resources becomes a
constant preoccupation of the system operators overseeing
the work flow. It was natural for system software to evolve
to simplify and partly automate this job.
An important area of system software links the computer
with terminals at which users and programmers can work. In
MVS, the most powerful operating system, terminals usually
connect using either the Time Sharing Option (TSO), or the
Customer Information Control System (CICS). In Ms. Jones'
company, TSO is used by programmers to write and test
programs, and CICS serves the users in the various
departments who need online information.
The operating system must not only provide numerous
services to the programs running under it, but it must also
see that the programs and sub-systems "get along" with one
another. It must keep programs from locking each other out
of specific records in shared files, and from deadlocking
over tape or disk drives. It must make sure that unique
peripherals such as optical character readers and laser
printers are available to top priority programs.
Computer peripherals in a modern data processing
installation are likely also to use a greater variety of
media and employ a wider range of storage strategies than in
a microcomputer environment. In addition, whereas Mr.
Williams' system was manufactured by a single company, the
computer room in Ms. Jones' company includes peripherals
from over a dozen separate manufacturers, and fills an
entire floor of a large downtown building. The software
required to control sophisticated peripherals such as the
Mass Storage System (MSS), for example, may be several times
as complicated as MS/DOS in its entirety.

Online Systems, Real-Time Systems, and Time Sharing

Mr. Williams would probably be amused to learn that he
can do many things on his small computer that were
considered advanced functions for large computers just a few
years ago. For example, in the days when most processing
occurred on batches of data without human intervention, it
was difficult for someone to walk up to a computer and "ask
it a question". Online systems quickly evolved, however, to
give users instant access to current data, in just the same
way that Mr. Williams can turn his computer on and get to
any information he wants.
Real-time systems were even more prompt. A system was
only considered to operate in "real-time" if it could answer
a question fast enough to make an immediate difference. An
autopilot in a fighter jet, for instance, is actually a
real-time computer system. In the mainframe world, a typical
real-time system might be the software that examines a
charge card before dispensing money from a cash machine.
When more and more people needed access to online
systems, system software had to be developed which would
allow the sharing of valuable computer time. Time sharing
became an important capability of a large computer. It had
to give users of equal rank equal time, and users of first
rank top priority.
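
A sketch of such a dispatcher in C, with invented users and
request counts: the highest-ranking user with work waiting
is always served first, and users of equal rank take turns.

    #include <stdio.h>

    struct user { const char *name; int rank; int requests; };

    int main(void)
    {
        struct user users[] = {
            { "executive",    1, 2 },   /* rank 1 = first rank */
            { "programmer A", 2, 2 },
            { "programmer B", 2, 2 },
        };
        int n = 3, last = 0;

        for (;;) {
            int pick = -1;
            /* scan starting just past the last user served, so that
               users of equal rank rotate fairly */
            for (int k = 1; k <= n; k++) {
                int i = (last + k) % n;
                if (users[i].requests > 0 &&
                    (pick < 0 || users[i].rank < users[pick].rank))
                    pick = i;
            }
            if (pick < 0)
                break;                   /* no requests left to serve */
            printf("serving %s\n", users[pick].name);
            users[pick].requests--;
            last = pick;
        }
        return 0;
    }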

System Generation

As Manager of Personnel, Ms. Jones was on the committee
to select a new security system to restrict employee access
to computer areas. In the system that was selected, an
employee would insert a special identification badge into a
mechanical card reader. This reader, which was attached to
the mainframe computer, would transmit the employee's
identification number to software which would check to see
if the employee was still authorized to enter that area.
Since this verification operated in real-time, the program
that did the checking operated at very high priority.
On the morning the new system was to go into effect,
however, Ms. Jones was surprised to find that it was not
working. That afternoon she learned that the badge reader,
although properly attached to the hardware, was not yet
known to the operating system. She also found out that to
properly support the new device a new "system generation"
would have to be performed that weekend.
To understand what it means for a mainframe operating
system to be "generated", let's remember Mr. Williams'
situation. When he unpacked his microcomputer, he found a
diskette containing the operating system called MS/DOS. He
put the diskette in the diskette drive and turned the
computer on. The computer automatically read in the software
and began running MS/DOS. This initial program load, or IPL,
occurred each time the computer was turned on. Even though
Mr. Williams later made a copy of the MS/DOS diskette and
used the copy rather than the original, he never had to make
any modifications to the diskette in order to use it.
By contrast, when the mainframe computer system was
first delivered to Ms. Jones' company seven years earlier,
its operating system could not be run immediately. To get
the computer running, the initial program load was done
using a "starter system" with only a minimum number of
functions. Then the system generation, or SYSGEN, had to be
performed, which would optimize the system to the company's
hardware configuration. By omitting from the generated
system software to control devices that were not included at
her installation, valuable main storage was saved. System
programmers who work for Ms. Jones' company repeat the
system generation procedure whenever new peripherals are
attached or removed.
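
The following C fragment is only an analogy (these are not
real SYSGEN macros), but it shows the principle: support for
a device is built into the system only when its option is
selected, so no storage is spent on devices the installation
does not have.

    #include <stdio.h>

    #define HAVE_TAPE_DRIVE       /* selected at this "generation" */
    #define HAVE_BADGE_READER     /* added by the weekend SYSGEN   */

    int main(void)
    {
    #ifdef HAVE_TAPE_DRIVE
        puts("tape drive support generated");
    #endif
    #ifdef HAVE_BADGE_READER
        puts("badge reader support generated");
    #endif
    #ifdef HAVE_OPTICAL_READER    /* not defined, so this support is
                                     omitted and its storage saved  */
        puts("optical reader support generated");
    #endif
        return 0;
    }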
That Saturday night, when no users were on the system,
a new SYSGEN was performed, which included support for the
new badge reader. By Monday, the security system worked
perfectly.

Program Development

On microcomputers, many users find that it is easy to
learn the BASIC programming language and to write simple
programs. On mainframes, users almost never get involved in
programming. Instead, they rely on teams of application
programmers from the Data Processing department. These teams
usually work on implementing one project at a time.
Later they can add small enhancements from time to time
as requested by a user. For instance, someone may ask that a
report print subtotals in addition to totals. One programmer
might do this in a week or less, since little testing would
be necessary.
When Ms. Jones wanted the zip code in the personnel
file to be increased from five to nine digits, she discussed
the matter with the DP department. After outlining the
specifics in a memo, she was told that three application
programmers had been assigned to the task, and that the
change would take eight weeks.
This seemed like a long time to her. She was told that
much care would go into ensuring that no changes affected
any other program or system that might also be using the same
files. The plan for enlarging the zip code was broken into
a series of steps. The last of these had to be done when no
online systems were using the personnel file, probably on a
weekend. After
ten weeks, the change was made to Ms. Jones' satisfaction.
She was surprised to learn that over 150 compilations had
been required, and more than 50 test runs.
From then on, all applicants for employment could enter
a nine digit zip code on their questionnaires. If hired,
the more precise code would be used in reporting to
government agencies.
The compilers involved in changing the programs, and
several other programming tools, all constitute system
software. In this case the COBOL programming language was
used. Almost all application programs are written in COBOL
because it is not as hard to understand as some other
languages.
PL/I was used for several years at Ms. Jones' company,
but it was more complicated to learn than COBOL and became
unpopular. RPG, which stands for Report Program Generator,
is still used by some departments. The programmer fills in
simple tables to indicate what the program should do. This
is useful for simple reports, but it is not easy to write
programs in RPG that can make complex decisions.
Perhaps the most powerful, and definitely the most
difficult, language used at Ms. Jones' company was
Assembler. Basic Assembler Language, or BAL, allowed the
programmer to write out the very machine language
instructions that the computer would execute. These hardware
instructions would refer to architectural features of the
computer such as "masks", "halfwords", and "floating point
registers". Assembler was used by the system programmers to
make special modifications to the operating system.

Batch Processing

Another factor complicating mainframe system software
is historical. Mr. Williams is used to a computer that
interacts with a user whenever it is doing anything. But
this instant attention to a user is a luxury that only
became affordable in recent years.
As recently as ten years ago, interacting with a
computer system often meant asking an applications
programmer to prepare some keypunched control cards. A
keypunch clerk had to key them up from the programmer's
prepared forms. A job entry clerk in the operations
department had to feed them into the card reader. The
computer operator had to start the job. Somebody in the
printer pool finally had to tear the listing off the
printer. The computer itself was strictly hands off to
users. It might take a week to run such a job, only to find
that a keypunch error made the results useless.
Though error-prone, this hands-off mode of operation
actually makes sense when the computer is operating on
enormous amounts of data. Modern system software still
supports the same batch processing facilities it provided
twenty years ago. Batch jobs such as a three-hour
file sort are started by control cards written in a special
language called Job Control Language. JCL cards are fed into
the system and are acted upon one at a time until the entire
job is finished.
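
A representative deck for such a sort might look like the
following (the job name, data set names, and accounting
field are hypothetical); the control card asks the SORT
program for an ascending sort on a nine-character field:

    //SORTJOB  JOB  (ACCT),'NIGHTLY FILE SORT',CLASS=A
    //STEP1    EXEC PGM=SORT
    //SYSOUT   DD  SYSOUT=A
    //SORTIN   DD  DSN=PERSONNEL.MASTER,DISP=SHR
    //SORTOUT  DD  DSN=PERSONNEL.SORTED,DISP=(NEW,CATLG,DELETE),
    //             UNIT=SYSDA,SPACE=(CYL,(10,5))
    //SYSIN    DD  *
      SORT FIELDS=(1,9,CH,A)
    /*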

Job Entry

Modern JCL processing consumes so much computer time
that entire sections of the operating system have been
broken off into "job entry subsystems". The one used in Ms.
Jones' installation, called JES3, handles the reading in and
interpreting of JCL, scheduling of jobs, allocation of
files, mounting of tapes, and printing of output.
JES3 even has its own kind of "super JCL", through
which dependencies between different jobs can be
established. It's possible, for example, to ask that if the
Personnel Master File Update job completes successfully, the
Personnel Master File List job be made eligible for
initiation; otherwise, the Personnel Master File Restore job
is released at a low priority that is raised periodically,
so that it is sure to start executing by midnight.
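
The rule just described can be sketched in a few lines of C
(the job names are from the example above; real JES3
expresses this in its own control statements, not in C):

    #include <stdio.h>

    /* stand-in for the update job: 1 means it completed successfully */
    int update_job_completed_ok(void) { return 0; }

    int main(void)
    {
        if (update_job_completed_ok()) {
            puts("release the Personnel Master File List job");
        } else {
            int priority = 1;                       /* start the restore low */
            puts("release the Personnel Master File Restore job");
            for (int hour = 18; hour < 24; hour++)  /* raise it each hour so */
                priority++;                         /* it runs by midnight   */
            printf("restore job priority raised to %d\n", priority);
        }
        return 0;
    }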

Compatibility

A final reason for the complexity of mainframe system
software is the fact that every new release of the operating
system must be upward-compatible. That means that every
program written to work with the old version must work as
well with the new. This places great burdens on the process
of enhancing the system. Imagine trying to design a building
that not only had to meet current building codes, but all
codes ever written, even when contradictory in their aims.
System programmers who have to upgrade a sophisticated
mainframe operating system do just that.
System software must also be compatible with the latest
hardware enhancements. New computer models usually require
significant software modifications. When IBM released its
Extended Architecture (XA) hardware feature, which increased
the maximum size of main memory from less than 17 million
characters to more than 4 billion, many thousands of
instructions were added to several of its operating systems
in order to see that it was properly supported.

Utilities

Large operating systems provide a set of general
purpose utility programs just as MS/DOS does. There are
utility programs to initialize disks, to copy selected files
or entire disks, to print files in various standard formats,
and to diagnose hardware faults.
Some utilities are supplied by third-party hardware and
software vendors. The database management system (DBMS) used
in Ms. Jones' company, for example, has a program to change
the organization of records in the database. This is
occasionally necessary, for example, to insert a new field
between two other fields in all the records of a file. The
authority to use this utility is restricted.
One of the most important utilities is the SORT
program. Hundreds of jobs that run every day use this
program to rearrange the order of records in files so as to
derive counts, totals, subtotals, averages, maximums,
minimums and other statistics on various fields within each
record.
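
Sorting is what makes those statistics easy to derive. Once
the records are in order on some field, a single sequential
pass produces subtotals and totals, as in this C sketch with
invented records:

    #include <stdio.h>
    #include <string.h>

    struct rec { const char *dept; int hours; };

    int main(void)
    {
        /* already sorted by department, as SORT would leave them */
        struct rec recs[] = {
            { "ACCTG", 40 }, { "ACCTG", 35 },
            { "SALES", 38 }, { "SALES", 42 }, { "SALES", 40 },
        };
        int n = 5, subtotal = 0, total = 0;

        for (int i = 0; i < n; i++) {
            subtotal += recs[i].hours;
            total    += recs[i].hours;
            /* a "control break": the department is about to change */
            if (i == n - 1 || strcmp(recs[i].dept, recs[i + 1].dept) != 0) {
                printf("%s subtotal: %d\n", recs[i].dept, subtotal);
                subtotal = 0;
            }
        }
        printf("grand total: %d\n", total);
        return 0;
    }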

Performance Monitoring

Other utility programs constitute special high-level
tools for the system programmers who monitor and optimize
the system. Mainframes run a mix of jobs whose resource
utilization varies greatly from moment to moment. One of the
things that is always on the mind of data processing
managers is whether or not the system is optimally
configured. Are the tape drives usually idle? Are programs
which need disk access often put in a wait state? Is the
time-sharing system so overloaded when people come back from
lunch that by 3:00 PM everything more or less grinds to a
halt?
To aid the data processing manager in fine tuning the
system, a number of performance monitoring tools are
available. Some of these are built into the mainframe
operating system and keep a running log of all important
events that occur within the system. The kinds of events
logged are selected when the system programmer performs a
SYSGEN to generate the system. This history can later be
analyzed, just like the data in a spreadsheet, to uncover
trends, peaks and bottlenecks.
On one occasion, Ms. Jones noticed that her Employee
Overtime Workload Report took longer to run on the computer
than usual. A system programmer from the technical unit was
asked to extract data from the historical performance
records and analyze the problem. It was determined that
another job that used the same files was accidentally
scheduled to run at the same time as hers. Rescheduling
these jobs solved the problem.
Other performance tools allow a system operator to
watch the system load factors on a display, much as a
nuclear power engineer observes a battery of gauges. The
operator can then jump in and alter the priorities of
various jobs.
For example, if the disk drive containing the company
master files is so busy that branch offices around the
country are not getting responses within five seconds, the
operator can tell the system to suspend, or even cancel,
other jobs using that disk drive. Ms. Jones learned about
this capability once when her Quarterly Payroll Summary
wasn't delivered on time. The system operator had cancelled
her job in deference to a special report requested by the
comptroller, and had forgotten to reschedule it.
If the performance monitoring software is more
sophisticated, it may be possible for the system operator to
define a contention threshold. If exceeded, the software
will automatically suspend the lowest priority competing
batch job.
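
In outline, such a rule might look like the following C
sketch, where the busy figure, threshold, and job names are
all invented:

    #include <stdio.h>

    struct job { const char *name; int priority; };

    int main(void)
    {
        int disk_busy_pct = 92;    /* measured contention on the drive */
        int threshold     = 85;    /* operator-defined threshold       */
        struct job jobs[] = {
            { "QUARTERLY PAYROLL SUMMARY", 5 },
            { "AD HOC OVERTIME REPORT",    2 },   /* lowest priority */
        };
        int n = 2;

        if (disk_busy_pct > threshold) {
            int low = 0;           /* find the lowest-priority batch job */
            for (int i = 1; i < n; i++)
                if (jobs[i].priority < jobs[low].priority)
                    low = i;
            printf("suspending %s (priority %d)\n",
                   jobs[low].name, jobs[low].priority);
        }
        return 0;
    }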
So-called "expert systems" which attempt to automate
partially the role of the system operator, such as the
Yorktown Expert System (YES/MVS), are in development at
IBM's Thomas J. Watson Research Center and elsewhere.

Virtual Storage

Since the early 70's and the introduction of the IBM
370 mainframe computer, there have been several milestones
in system software for mainframes. First there was the
introduction of virtual storage, which allows programs to
function as if they have much more real storage than is
actually available on the machine.
By managing a set of "pages" of real memory, each of
which contains 4096 characters, the operating system
periodically writes to disk those pages that haven't been
referred to by any program recently. Special hardware tables
record where these least recently used (LRU) pages are.
When a program subsequently refers to one of these pages, a
hardware interrupt occurs indicating "invalid address". A
special system software program called the "virtual storage
manager" is activated. It "pages in" the storage need by the
program. The virtual page brought in can reside in any
available page of real memory and the user's program doesn't
know the difference. The computer's "address space" is now
divided into real memory and virtual memory.
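
The replacement policy itself is easy to sketch. In the
following C program, real memory is imagined as just three
page frames, and the reference string is invented; on each
page fault the least recently used frame is reclaimed:

    #include <stdio.h>

    #define FRAMES 3

    int main(void)
    {
        int frame[FRAMES], stamp[FRAMES];   /* resident page, last use */
        int refs[] = { 1, 2, 3, 1, 4, 2 };  /* pages a program touches */
        int clock = 0;

        for (int i = 0; i < FRAMES; i++) { frame[i] = -1; stamp[i] = 0; }

        for (int r = 0; r < 6; r++) {
            int page = refs[r], hit = -1;
            for (int i = 0; i < FRAMES; i++)
                if (frame[i] == page)
                    hit = i;
            if (hit < 0) {                  /* page fault: evict the LRU */
                int victim = 0;
                for (int i = 1; i < FRAMES; i++)
                    if (stamp[i] < stamp[victim])
                        victim = i;
                printf("fault: page %d brought into frame %d\n",
                       page, victim);
                frame[victim] = page;
                hit = victim;
            }
            stamp[hit] = ++clock;           /* record the reference time */
        }
        return 0;
    }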
If the system load is high, and paging is frequent,
some pages may be paged out and back again incessantly. Such
behavior, called "thrashing", can be a problem when not
enough real storage underlies virtual storage, and is an
undesirable complication in any virtual storage system.

Virtual Machines

Another development was the invention of the "virtual
machine". Since software is a kind of data processed by the
computer, it seems as if hardware is always in control.
Hardware does manipulate software according to its
principles of operation, but software also manipulates
hardware to the extent that peripherals are designed to be
so controlled. In the case of a "virtual machine", the
operating system controls computer hardware whose existence
is simulated by yet another operating system.
VM, which is the name of the "super operating system"
invented by IBM to implement the virtual machine concept,
grew originally out of operating system developers' need
for more access to stand-alone mainframes.
When testing a possibly faulty version of an operating
system, the initial program load (IPL) has to be performed
on a computer that isn't being used for anything else for
the simple reason that it is very likely not to work right.
The larger the computer, the more expensive this process is.
In VM, the computer is simulated through software. The
supervisor program that thinks it is controlling all aspects
of the computer system is actually monitored and isolated
from the real hardware by a still higher supervisor, a
"hypervisor".
Actually, it turns out to be simple in principle to
program this hypervisor. The operating system being tested
runs in "problem program mode". This means that it is not
allowed to issue instructions that are so powerful that only
supervisory programs should use them. Whenever this
operating system attempts to reconfigure a peripheral or
alter tables in the nucleus or shared part of memory, a
hardware interrupt is triggered. Normally this interrupt
causes the cancellation of the faulty program, but in VM the
hypervisor first checks to see if the issuer of the
instruction is an operating system which might legitimately
want to perform such a function. If it is, the function is
simulated within the hypervisor, and control returned to the
operating system. The operating system never knows the
difference.
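
The heart of the scheme can be suggested in a few lines of C
(the instruction name and the enumeration are invented; a
real hypervisor works from hardware interrupt codes):

    #include <stdio.h>

    enum issuer { ORDINARY_PROGRAM, GUEST_OPERATING_SYSTEM };

    /* called when the hardware traps a privileged instruction */
    void privileged_op_trap(enum issuer who, const char *op)
    {
        if (who == GUEST_OPERATING_SYSTEM) {
            /* simulate the operation on the guest's virtual hardware,
               then resume it; the guest never knows the difference */
            printf("hypervisor simulates '%s' and resumes guest\n", op);
        } else {
            /* an ordinary program may not issue this instruction */
            printf("program cancelled for issuing '%s'\n", op);
        }
    }

    int main(void)
    {
        privileged_op_trap(GUEST_OPERATING_SYSTEM, "set storage key");
        privileged_op_trap(ORDINARY_PROGRAM,       "set storage key");
        return 0;
    }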
Although at first a tool for developers of operating
systems, VM has surprised everyone by becoming a popular
adjunct to more traditional mainframe operating systems. For
one thing, it allows a single mainframe to run several
different operating systems at the same time. This is a
tremendous help to small installations that want to convert
from an operating system like VS/1 to a more flexible
operating system like MVS since both systems can run on a
single computer at the same time. Another reason is that the
Conversational Monitor System (CMS), a single-user time-sharing
operating system running under VM, is an inexpensive
alternative to the more powerful Time Sharing Option
developed for MVS.