Dean's Den

A chapter I wrote on "System Software" in 1984

In 1984 I wrote a chapter called "System Software" for a new Prentice-Hall textbook entitled Using Computers. All of the contributors were told to use examples referring to a fictitious "Mr. Williams" and "Ms. Jones."

Mainframe System Software

Mr. Williams found learning about system software to be a rather trying experience, but he was much better off than his sister-in-law, Ms. Jones. System software in the mainframe world is much more complicated than in the microcomputer world. Fortunately, Ms. Jones needed little or no expert knowledge of it in order to deal successfully with her data processing department. Many layers of support personnel acted as a buffer between her and this aspect of the computer system.

The reasons for this complexity are various. When a computer is running, many things are going on. At the center is the control program or supervisor. This program, which is packaged with other standard programs by the computer manufacturer to make up an "operating system", has direct control of the computer. It decides which other programs to start and stop. These other programs include general purpose programs that might be useful to anybody using the computer, as well as programs that apply to a particular business or situation. These latter are termed "application programs" to distinguish them from "system programs", and are usually developed specifically for the company that has purchased the computer system.

Application programs may themselves run programs that are part of the operating system. Every time a program reads information out of a disk file, it must run a special system program called an "access method". If it communicates with a user through a terminal it will use telecommunications software that probably comes with the system.

Multi-Programming, Multi-Processing

Probably the biggest complicating factor in system software is simply the large number of computing tasks going on at the same time. If a program running in a microcomputer is like a researcher reading a book in a library, the program mix churning in a mainframe resembles the floor of the New York Stock Exchange.

All computers require a control program, utilities, and application software. What makes the job of a mainframe operating system more complex is that there are always many programs going on at the same time. The computer doesn't perform separate operations simultaneously, but interleaves the instructions from many programs in the most efficient way. It is like a one-man band played with one hand, but so rapidly that it sounds like a symphony. The capability to run more than one program at the same time in this way is called multi-programming.
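The interleaving idea can be sketched in modern Python (nothing like this ran on the machines described here; the program names, instruction names, and time-slice size are all invented for illustration):

```python
# Toy illustration of multi-programming: one processor interleaves
# the instructions of several programs, giving each a short "time
# slice" in turn until every program has finished.
from collections import deque

def run_interleaved(programs, slice_size=2):
    """programs: dict of name -> list of instructions (strings)."""
    ready = deque(programs.items())
    trace = []                        # order in which instructions run
    while ready:
        name, instructions = ready.popleft()
        for instr in instructions[:slice_size]:
            trace.append((name, instr))
        remaining = instructions[slice_size:]
        if remaining:                 # not finished: back of the queue
            ready.append((name, remaining))
    return trace

trace = run_interleaved({
    "PAYROLL": ["load", "add", "store"],
    "REPORT":  ["read", "print"],
})
```

Run in sequence, PAYROLL's three instructions would finish before REPORT's first; interleaved, the trace alternates between the two programs, which is the whole point of multi-programming.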

Another technique for increasing the work flow is called multi-processing. "Processing" here refers to the fact that a computer consists of a central processor connected to main storage. In large mainframe systems there is often so much main storage that most of it lies idle most of the time. By attaching a second "co-processor", the computer can simultaneously run different programs in the same main storage as long as they don't interfere with one another.

Mainframes must support a large number of users simultaneously, either through multi-programming, multi-processing or both. Many of these users communicate and share data with other users in the system. Many require special peripherals. Some may communicate with other computer systems over telecommunications lines. All of these functions require expensive hardware and complicated system software.

Monitoring the use of these resources becomes a constant preoccupation of the system operators overseeing the work flow. It was natural for system software to evolve to simplify and partly automate this job.

An important area of system software links the computer with terminals at which users and programmers can work. In MVS, the most powerful operating system, terminals usually connect using either the Time Sharing Option (TSO), or the Customer Information Control System (CICS). In Ms. Jones' company, TSO is used by programmers to write and test programs, and CICS serves the users in the various departments who need online information.

The operating system must not only provide numerous services to the programs running under it, but it must also see that the programs and sub-systems "get along" with one another. It must manage the locking of specific records in shared files. It must prevent deadlocks over tape or disk drives. It must make sure that unique peripherals such as optical character readers and laser printers are available to top priority programs.

Computer peripherals in a modern data processing installation are likely also to use a greater variety of media and employ a wider range of storage strategies than in a microcomputer environment. In addition, whereas Mr. Williams' system was manufactured by a single company, the computer room in Ms. Jones' company includes peripherals from over a dozen separate manufacturers, and fills an entire floor of a large downtown building. The software required to control sophisticated peripherals such as the Mass Storage System (MSS), for example, may be several times as complicated as MS/DOS in its entirety.

Online Systems, Real-Time Systems, and Time Sharing

Mr. Williams would probably be amused to learn that he can do many things on his small computer that were considered advanced functions for large computers just a few years ago. For example, in the days when most processing occurred on batches of data without human intervention, it was difficult for someone to walk up to a computer and "ask it a question". Online systems quickly evolved, however, to give users instant access to current data, in just the same way that Mr. Williams can turn his computer on and get to any information he wants.

Real-time systems were even more prompt. A system was only considered to operate in "real-time" if it could answer a question fast enough to make an immediate difference. An autopilot in a fighter jet, for instance, is actually a real-time computer system. In the mainframe world, a typical real-time system might be the software that examines a charge card before dispensing money from a cash machine.

When more and more people needed access to online systems, system software had to be developed which would allow the sharing of valuable computer time. Time sharing became an important capability of a large computer. It had to give users of equal rank equal time, and users of first rank top priority.
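The scheduling rule just described can be sketched in modern Python (a toy, not actual time-sharing software; the ranks, user names, and queue structure are invented for illustration):

```python
# Toy time-sharing dispatcher: users of the highest rank present are
# always served first; users of equal rank take strict turns.
from collections import deque

class Dispatcher:
    def __init__(self):
        self.queues = {}              # rank -> deque of user names

    def add_user(self, name, rank):
        self.queues.setdefault(rank, deque()).append(name)

    def remove_user(self, name, rank):
        self.queues[rank].remove(name)
        if not self.queues[rank]:
            del self.queues[rank]

    def next_slice(self):
        """Return the user who receives the next slice of time."""
        top = min(self.queues)        # rank 1 is the first (top) rank
        user = self.queues[top].popleft()
        self.queues[top].append(user) # equal rank, equal turns
        return user

d = Dispatcher()
d.add_user("OPERATOR", 1)             # first rank: top priority
d.add_user("JONES", 2)
d.add_user("WILLIAMS", 2)
```

Note the honest consequence of strict priority: while a first-rank user is present, equal-rank users further down wait; only after the first-rank user logs off do they alternate, slice for slice.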

System Generation

As Manager of Personnel, Ms. Jones was on the committee to select a new security system to restrict employee access to computer areas. In the system that was selected, an employee would insert a special identification badge into a mechanical card reader. This reader, which was attached to the mainframe computer, would transmit the employee's identification number to software which would check to see if the employee was still authorized to enter that area. Since this verification operated in real-time, the program that did the checking operated at very high priority.

On the morning the new system was to go into effect, however, Ms. Jones was surprised to find that it was not working. That afternoon she learned that the badge reader, although properly attached to the hardware, was not yet known to the operating system. She also found out that to properly support the new device a new "system generation" would have to be performed that weekend.

To understand what it means for a mainframe operating system to be "generated", let's remember Mr. Williams' situation. When he unpacked his microcomputer, he found a diskette containing the operating system called MS/DOS. He put the diskette in the diskette drive and turned the computer on. The computer automatically read in the software and began running MS/DOS. This initial program load, or IPL, occurred each time the computer was turned on. Even though Mr. Williams later made a copy of the MS/DOS diskette and used the copy rather than the original, he never had to make any modifications to the diskette in order to use it.

By contrast, when the mainframe computer system was first delivered to Ms. Jones' company seven years earlier, its operating system could not be run immediately. To get the computer running, the initial program load was done using a "starter system" with only a minimum number of functions. Then the system generation, or SYSGEN, had to be performed, which would optimize the system to the company's hardware configuration. By omitting from the generated system the software for devices that were not included at her installation, valuable main storage was saved. System programmers who work for Ms. Jones' company repeat the system generation procedure whenever new peripherals are attached or removed.

That Saturday night, when no users were on the system, a new SYSGEN was performed, which included support for the new badge reader. By Monday, the security system worked perfectly.

Program Development

On microcomputers, many users find that it is easy to learn the BASIC programming language and to write simple programs. On mainframes, users almost never get involved in programming. Instead, they rely on teams of application programmers from the Data Processing department. These teams usually work on implementing one project at a time.

Later they can add small enhancements from time to time as requested by a user. For instance, someone may ask that a report print subtotals in addition to totals. One programmer might do this in a week or less, since little testing would be necessary.

When Ms. Jones wanted the zip code in the personnel file to be increased from five to nine digits, she discussed the matter with the DP department. After outlining the specifics in a memo, she was told that three application programmers had been assigned to the task, and that the change would take eight weeks.

This seemed like a long time to her. She was told that much care would go into ensuring that no changes affected any other program or system that might also be using the same files. The plan for enlarging the zip code contained the following steps:

  1. Make a list of all programs that refer to the personnel file.
  2. Make separate "test" versions of these programs.
  3. Change their descriptions of the zip code field to include the extra digits.
  4. Compile the programs. This would translate them from words into machine language.
  5. Write a special program to read in the old personnel file and write out a new one in the new format, filling the additional digits with zeros.
  6. Test the new versions of the programs against the reformatted file.
  7. Erase the old versions of the programs and the personnel file, and install the new versions.

This last step had to be done when no online systems were using the personnel file, probably on a weekend. After ten weeks, the change was made to Ms. Jones' satisfaction. She was surprised to learn that over 150 compilations had been required, and more than 50 test runs.
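The conversion program of step 5 can be sketched in modern Python (the actual work would have been a COBOL batch program; the fixed-length record layout here, a 20-character name followed by a 5-digit zip, is invented for illustration):

```python
# Toy version of step 5: read the old personnel records and write new
# ones in which the five-digit zip code field is widened to nine
# digits, the four additional digits filled with zeros.
def widen_zip(old_record):
    name = old_record[:20]            # 20-character name field
    zip5 = old_record[20:25]          # old 5-digit zip field
    return name + zip5 + "0000"       # pad the new digits with zeros

old_file = [name.ljust(20) + zip5 for name, zip5 in
            [("SMITH", "10001"), ("JONES", "90210")]]
new_file = [widen_zip(r) for r in old_file]
```

Each new record is four characters longer than its old counterpart, which is exactly why every program reading the file had to have its record description recompiled first.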

From then on, all applicants for employment could enter a nine digit zip code on their questionnaires. If hired, this more precise code would be used in reporting to government agencies.

The compilers involved in changing the programs, and several other programming tools, all constitute system software. In this case the COBOL programming language was used. Almost all application programs are written in COBOL because it is not as hard to understand as some other languages.

PL/I was used for several years at Ms. Jones' company, but it was more complicated to learn than COBOL and became unpopular. RPG, which stands for Report Program Generator, is still used by some departments. The programmer fills in simple tables to indicate what the program should do. This is useful for simple reports, but it is not easy to write programs in RPG that can make complex decisions.

Perhaps the most powerful, and definitely the most difficult, language used at Ms. Jones' company was Assembler. Basic Assembler Language, or BAL, allowed the programmer to write out the very machine language instructions that the computer would execute. These hardware instructions would refer to architectural features of the computer such as "masks", "halfwords", and "floating point registers". Assembler was used by the system programmers to make special modifications to the operating system.

        Figure 3-11. COBOL PROGRAM EXAMPLE

GET-TENANT-INFO.
    DISPLAY "Type tenant information as requested:".
    DISPLAY " ".
    DISPLAY "   Apartment number:".
    ACCEPT T-APT-NO.
    DISPLAY "   Tenant last name:".
    ACCEPT T-LAST-NAME.
    DISPLAY "   Monthly rent amount:".
    ACCEPT T-RENT.
    DISPLAY "   Other charges amount:".
    ACCEPT T-OTHER.
    ADD T-RENT TO T-OTHER GIVING TOTAL-DUE.
    DISPLAY " ".
    DISPLAY "Total amount due = $" TOTAL-DUE.

Sample COBOL program which corresponds to the BASIC
program in Figure 3-7.

Batch Processing

Another factor complicating mainframe system software is historical. Mr. Williams is used to a computer that interacts with a user whenever it is doing anything. But this instant attention to a user is a luxury that only became affordable in recent years.

As recently as ten years ago, interacting with a computer system often meant asking an applications programmer to prepare some keypunched control cards. A keypunch clerk had to key them up from the programmer's prepared forms. A job entry clerk in the operations department had to feed them into the card reader. The computer operator had to start the job. Somebody in the printer pool finally had to tear the listing off the printer. The computer itself was strictly hands off to users. It might take a week to run such a job, only to find that a keypunch error made the results useless.

Though error-prone, this hands-off mode of operation actually makes sense when the computer is operating on enormous amounts of data. Modern system software still supports batch processing with the same facilities it did twenty years ago. Batch jobs such as a three-hour file sort are started by control cards written in a special language called Job Control Language. JCL cards are fed into the system and are acted upon one at a time until the entire job is finished.

Job Entry

Modern JCL processing consumes so much computer time that entire sections of the operating system have been broken off into "job entry subsystems". The one used in Ms. Jones' installation, called JES3, handles the reading in and interpreting of JCL, scheduling of jobs, allocation of files, mounting of tapes, and printing of output.

JES3 even has its own kind of "super JCL", through which dependencies between different jobs can be established. It's possible, for example, to ask that if the Personnel Master File Update job completes successfully, the Personnel Master File List job be made eligible for initiation; otherwise, the Personnel Master File Restore job is released at a low priority, but its priority is increased periodically so as to make sure that it starts executing by midnight.
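That kind of dependency can be sketched in modern Python (this is an illustration of the idea only, not JES3's actual control language; the job names, the success flag, and the escalation schedule are invented):

```python
# Toy job-dependency logic: which job is released next depends on the
# outcome of the Update job.
def schedule_next(update_succeeded):
    if update_succeeded:
        return {"job": "PERSONNEL-LIST", "priority": "normal"}
    return {"job": "PERSONNEL-RESTORE", "priority": "low"}

def priority_at(hour, released_at=18, start=1):
    """The Restore job's priority climbs one step each hour after
    release, so that it is sure to be dispatched before midnight."""
    return start + max(0, hour - released_at)
```

The point is that the decision is made by the job entry subsystem itself, without an operator having to watch the Update job finish.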

Compatibility

A final reason for the complexity of mainframe system software is the fact that every new release of the operating system must be upward-compatible. That means that every program written to work with the old version must work as well with the new. This places great burdens on the process of enhancing the system. Imagine trying to design a building that not only had to meet current building codes, but all codes ever written even when contradictory in their aims. System programmers who have to upgrade a sophisticated mainframe operating system do just that.

System software must also be compatible with the latest hardware enhancements. New computer models usually require significant software modifications. When IBM released its Extended Architecture (XA) hardware feature, which increased the maximum size of main memory from less than 17 million characters to more than 2 billion, many thousands of instructions were added to several of its operating systems in order to see that it was properly supported.

Utilities

Large operating systems provide a set of general purpose utility programs just as MS/DOS does. There are utility programs to initialize disks, to copy selected files or entire disks, to print files in various standard formats, and to diagnose hardware faults.

Some utilities are supplied by third-party hardware and software vendors. The database management system (DBMS) used in Ms. Jones' company, for example, has a program to change the organization of records in the database. This is occasionally necessary, for example, to insert a new field between two other fields in all the records of a file. The authority to use this utility is restricted.

One of the most important utilities is the SORT program. Hundreds of jobs that run every day use this program to rearrange the order of records in files so as to derive counts, totals, subtotals, averages, maximums, minimums and other statistics on various fields within each record.
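The sort-then-subtotal pattern can be sketched in modern Python (a mainframe would use the SORT utility followed by a COBOL control-break report; the department names and amounts here are invented):

```python
# Toy sketch of sort-then-subtotal: records are first sorted on the
# department field, then consecutive records for the same department
# are grouped and their amounts summed.
from itertools import groupby

records = [
    {"dept": "SALES",   "amount": 100},
    {"dept": "PAYROLL", "amount": 250},
    {"dept": "SALES",   "amount": 50},
]

records.sort(key=lambda r: r["dept"])       # the SORT step
subtotals = {
    dept: sum(r["amount"] for r in group)
    for dept, group in groupby(records, key=lambda r: r["dept"])
}
```

Sorting first is essential: grouping only collects records that are adjacent, which is exactly why so many daily jobs begin with a pass through the SORT program.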

Performance Monitoring

Other utility programs constitute special high-level tools for the system programmers who monitor and optimize the system. Mainframes run a mix of jobs whose resource utilization varies greatly from moment to moment. One of the things that is always on the mind of data processing managers is whether or not the system is optimally configured. Are the tape drives usually idle? Are programs which need disk access often put in a wait state? Is the time-sharing system so overloaded when people come back from lunch that by 3:00 PM everything more or less grinds to a halt?

To aid the data processing manager in fine tuning the system, a number of performance monitoring tools are available. Some of these are built into the mainframe operating system and keep a running log of all important events that occur within the system. The kinds of events logged are selected when the system programmer performs a SYSGEN to generate the system. This history can later be analyzed, just like the data in a spreadsheet, to uncover trends, peaks and bottlenecks.

On one occasion, Ms. Jones noticed that her Employee Overtime Workload Report took longer to run on the computer than usual. A system programmer from the technical unit was asked to extract data from the historical performance records and analyze the problem. It was determined that another job that used the same files was accidentally scheduled to run at the same time as hers. Rescheduling these jobs solved the problem.

Other performance tools allow a system operator to watch the system load factors on a display, much as a nuclear power engineer observes a battery of gauges. The operator can then jump in and alter the priorities of various jobs.

For example, if the disk drive containing the company master files is so busy that branch offices around the country are not getting responses within five seconds, the operator can tell the system to suspend, or even cancel, other jobs using that disk drive. Ms. Jones learned about this capability once when her Quarterly Payroll Summary wasn't delivered on time. The system operator had cancelled her job in deference to a special report requested by the comptroller, and had forgotten to reschedule it.

If the performance monitoring software is more sophisticated, it may be possible for the system operator to define a contention threshold. If exceeded, the software will automatically suspend the lowest priority competing batch job.

So-called "expert systems" which attempt to automate partially the role of the system operator, such as the Yorktown Expert System (YES/MVS), are in development at IBM's Thomas J. Watson Research Center and elsewhere.

Virtual Storage

Since the early 70's and the introduction of the IBM 370 mainframe computer, there have been several milestones in system software for mainframes. First there was the introduction of virtual storage, which allows programs to function as if they have much more real storage than is actually available on the machine.

By managing a set of "pages" of real memory, each of which contains 4096 characters, the operating system periodically writes to disk those pages that haven't been referred to by any program recently. Special hardware tables record where these least recently used (LRU) pages are.

When a program subsequently refers to that page, a hardware interrupt occurs indicating "invalid address". A special system software program called the "virtual storage manager" is activated. It "pages in" the storage needed by the program. The virtual page brought in can reside in any available page of real memory, and the user's program doesn't know the difference. The computer's "address space" is now divided into real memory and virtual memory.

If the system load is high, and paging is frequent, some pages may be paged out and back again incessantly. Such behavior, called "thrashing", can be a problem when not enough real storage underlies virtual storage, and is an undesirable complication in any virtual storage system.
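The least-recently-used policy, and the thrashing that results when real storage is too small, can be sketched in modern Python (a simulation of the idea only; the frame counts and page reference strings are invented):

```python
# Toy LRU paging simulation: real storage holds only `frames` pages.
# Referencing a page not in storage causes a page fault, and the
# least recently used resident page is "paged out" to make room.
from collections import OrderedDict

def count_faults(references, frames):
    resident = OrderedDict()          # resident pages, oldest first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)       # just used: most recent
        else:
            faults += 1                      # "invalid address" interrupt
            if len(resident) >= frames:
                resident.popitem(last=False) # page out the LRU page
            resident[page] = None
    return faults
```

With three frames the reference string 1, 2, 3, 1, 4, 1, 2 causes five faults; shrink real storage to a single frame and a program alternating between two pages faults on every single reference, which is thrashing in miniature.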

Virtual Machines

Another development was the invention of the "virtual machine". Since software is a kind of data processed by the computer, it seems as if hardware is always in control. Hardware does manipulate software according to its principles of operation, but software also manipulates hardware to the extent that peripherals are designed to be so controlled. In the case of a "virtual machine", the operating system controls computer hardware whose existence is simulated by yet another operating system.

VM, which is the name of the "super operating system" invented by IBM to implement the virtual machine concept, developed originally out of the need which operating system developers had for more access to stand-alone mainframes. When testing a possibly faulty version of an operating system, the initial program load (IPL) has to be performed on a computer that isn't being used for anything else for the simple reason that it is very likely not to work right. The larger the computer, the more expensive this process is.

In VM, the computer is simulated through software. The supervisor program that thinks it is controlling all aspects of the computer system is actually monitored and isolated from the real hardware by a still higher supervisor, a "hypervisor".

Actually, it turns out to be simple in principle to program this hypervisor. The operating system being tested runs in "problem program mode". This means that it is not allowed to issue instructions that are so powerful that only supervisory programs should use them. Whenever this operating system attempts to reconfigure a peripheral or alter tables in the nucleus or shared part of memory, a hardware interrupt is triggered. Normally this interrupt causes the cancellation of the faulty program, but in VM the hypervisor first checks to see if the issuer of the instruction is an operating system which might legitimately want to perform such a function. If it is, the function is simulated within the hypervisor, and control is returned to the operating system. The operating system never knows the difference.
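The hypervisor's decision can be sketched in modern Python (this only illustrates the trap-and-simulate logic; the instruction names, the guest list, and the return strings are invented, not actual VM internals):

```python
# Toy trap-and-simulate logic: a privileged instruction issued in
# problem program mode triggers an interrupt.  The hypervisor
# simulates it for a known guest operating system, but an ordinary
# program attempting the same thing is cancelled.
PRIVILEGED = {"SET CLOCK", "START I/O"}
guests = {"TESTED-OS"}                # programs known to be guest systems

def handle_interrupt(issuer, instruction):
    if instruction not in PRIVILEGED:
        return "executed directly"          # no interrupt occurs at all
    if issuer in guests:
        return "simulated by hypervisor"    # the guest never knows
    return "program cancelled"              # ordinary faulty program
```

Everything hinges on one check: is the issuer an operating system that might legitimately want to perform the function, or just a faulty application program?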

Although at first a tool for developers of operating systems, VM has surprised everyone by becoming a popular adjunct to more traditional mainframe operating systems. For one thing, it allows a single mainframe to run several different operating systems at the same time. This is a tremendous help to small installations that want to convert from an operating system like VS/1 to a more flexible operating system like MVS, since both systems can run on a single computer at the same time. Another reason is that the Conversational Monitor System, a single-user time-sharing operating system running under VM, is an inexpensive alternative to the more powerful Time Sharing Option developed for MVS.

 



[E:\DH\DHD\HTP\CHAPTER3.HTP (496 lines) 1999-08-26 19:46 Dean Hannotte]
