MAINFRAMES: cobol

COBOL

COBOL is an acronym for COmmon Business Oriented Language.

COBOL History

1952	Grace Hopper, "the mother of COBOL", begins developing computer languages.
1959	The American Department of Defense (DOD) asked a group of specialists to develop a business language that met their demands.
1960	COBOL-60 (Common Business Oriented Language) is launched.
1961	First COBOL compilers are available.
1965	The momentum of COBOL success accelerates.
1968	The first COBOL standard, COBOL-68 is released.
1970	The COBOL-68 standard is accepted by The International Organization for Standardization (ISO).
1974	The COBOL-74 standard is released.
1985	The COBOL-85 standard is released.
1989	Intrinsic functions are added to the standard.
2002	The COBOL 2002 standard is released with object oriented capabilities.

Introduction

COBOL is a high-level programming language first developed by the CODASYL Committee (Conference on Data Systems Languages) in 1960. Since then, responsibility for developing new COBOL standards has been assumed by the American National Standards Institute (ANSI).

Three ANSI standards for COBOL have been produced: in 1968, 1974 and 1985. A new COBOL standard, introducing object-oriented programming to COBOL, is due within the next few years.

The word COBOL is an acronym that stands for COmmon Business Oriented Language. As the expanded acronym indicates, COBOL is designed for developing business applications, which are typically file-oriented. It is not designed for writing systems programs. For instance, you would not develop an operating system or a compiler using COBOL.

How widely used is COBOL?

For over four decades COBOL has been the dominant programming language in the business computing domain. In that time it has seen off the challenges of a number of other languages such as PL/I, Algol68, Pascal, Modula, Ada, C and C++. All these languages have found a niche but none has yet displaced COBOL. Two recent challengers though, Java and Visual Basic, are proving to be serious contenders.

COBOL's dominance is underlined by the reports from the Gartner group.

In 1997 they estimated that there were about 300 billion lines of computer code in use in the world. Of that they estimated that about 80% (240 billion lines) were in COBOL and 20% (60 billion lines) were written in all the other computer languages combined.

In 1999 they reported that over 50% of all new mission-critical applications were still being done in COBOL and their recent estimates indicate that through 2004-2005 15% of all new applications (5 billion lines) will be developed in COBOL while 80% of all deployed applications will include extensions to existing legacy (usually COBOL) programs.

Gartner estimates for 2002 are that there are about two million COBOL programmers world-wide compared to about one million Java programmers and one million C++ programmers.

Surprised by COBOL's success?

People are often surprised when presented with the evidence for COBOL's dominance in the market place. The hype that surrounds some computer languages would persuade you to believe that most of the production business applications in the world are written in Java, C, C++ or Visual Basic and that only a small percentage are written in COBOL. In fact, the reverse is actually the case.

One reason for this misconception lies in the difference between the vertical and the horizontal software markets.

In the vertical software market (sometimes called "bespoke" software) applications cost many millions of dollars to produce, are tailored to a specific company, encapsulate the business rules of that company, and only a limited number of copies of the software may be in use. A good example of this kind of application is the DoD MRP II system. This system is "used to manage almost 550,000 spare and repair parts and equipment items with an inventory value of $28 billion. The system runs on Amdahl mainframes at multiple locations throughout the U.S. and contains over 4,000,000 lines of COBOL code."

In the horizontal software market, applications may still cost millions of dollars to produce but thousands, and in some cases millions, of copies of the software are in use. As a result, these applications often have a very high profile, a short life span, and a relatively low per-copy replacement cost. The Microsoft Office suite (Word, Excel, Access) is an example of an application in the horizontal software market. Because of the highly competitive nature of this marketplace considerations of speed, size and efficiency often make languages like C or C++ the language of choice for creating these applications.

Applications written for the vertical market, on the other hand, often have a low profile (because they are usually written for use in one particular company), a very high per-copy replacement cost, and consequently, a very long life span. For example, the cost of replacing COBOL code has been estimated at approximately twenty five dollars ($25) per line of code. At this rate, the cost of replacing the DoD MRP II system mentioned above, with a system written in some other language, would be some one hundred million dollars ($100,000,000). The importance of ease of maintenance often makes COBOL the language of choice for these applications.

The high visibility of horizontal applications like Microsoft Word or Excel persuades people that the languages used to write these applications are the market leaders. But however many copies of Excel are sold, it is just a single application produced by a limited number of programmers. Many more programmers are involved in coding or maintaining one-off, "bespoke" applications. And these programmers generally write their programs in COBOL.

Characteristics of COBOL.

COBOL is a simple language (no pointers, no user defined functions, no user defined types) with a limited scope of function. It encourages a simple straightforward programming style. Curiously enough though, despite its limitations, COBOL has proven itself to be well suited to its targeted problem domain (business computing). Most COBOL programs operate in a domain where the program complexity lies in the business rules that have to be encoded rather than in the sophistication of the data structures or algorithms required. And in cases where sophisticated algorithms are required COBOL usually meets the need with an appropriate verb such as SORT or SEARCH.

We noted above that COBOL is a simple language with a limited scope of function. And that is the way it used to be but the introduction of OO-COBOL has changed all that. OO-COBOL retains all the advantages of previous versions but now includes -

User Defined Functions
Object Orientation
National Characters - Unicode
Multiple Currency Symbols
Cultural Adaptability (Locales)
Dynamic Memory Allocation (pointers)
Data Validation Using New VALIDATE Verb
Binary and Floating Point Data Types
User Defined Data Types

COBOL is non-proprietary (portable)

The COBOL standard does not belong to any particular vendor. The vendor independent ANSI COBOL committee legislates formal, non-vendor-specific syntax and semantic language standards. COBOL has been ported to virtually every hardware platform - from every flavour of Windows, to every flavour of Unix, to AS/400, VSE, OS/2, DOS, VMS, Unisys, DG, VM, and MVS.

COBOL is Maintainable

COBOL has a 30 year proven track record for application maintenance, enhancement and production support at the enterprise level. Early indications from the year 2000 problem are that COBOL applications were actually cheaper to fix than applications written in more recent languages.

COBOL Programming Basics:

Introduction

This section presents the fundamentals of constructing COBOL programs. It explains the notation used in COBOL syntax diagrams and enumerates the COBOL coding rules. It shows how user-defined names are constructed and examines the structure of COBOL programs.

COBOL syntax

COBOL syntax is defined using particular notation sometimes called the COBOL MetaLanguage.

In this notation, words in uppercase are reserved words. When underlined they are mandatory. When not underlined they are "noise" words, used for readability only, and are optional. Because COBOL statements are supposed to read like English sentences there are a lot of these "noise" words.

Words in mixed case represent names that must be devised by the programmer (like data item names).

When material is enclosed in curly braces { }, a choice must be made from the options within the braces. If there is only one option then that item is mandatory.

Material enclosed in square brackets [ ], indicates that the material is optional, and may be included or omitted as required.

The ellipsis symbol ... (three dots), indicates that the preceding syntax element may be repeated at the programmer's discretion.

COBOL coding rules

Traditionally, COBOL programs were written on coding forms and then punched on to punch cards. Although nowadays most programs are entered directly into a computer, some COBOL formatting conventions remain that derive from its ancient punch-card history.

On coding forms, the first six character positions are reserved for sequence numbers. The seventh character position is reserved for the continuation character, or for an asterisk that denotes a comment line.

The actual program text starts in column 8. The four positions from 8 to 11 are known as Area A, and positions from 12 to 72 are Area B.

Although many COBOL compilers ignore some of these formatting restrictions, most still retain the distinction between Area A and Area B.

When a COBOL compiler recognizes the two areas, all division names, section names, paragraph names, FD entries and 01 level numbers must start in Area A. All other sentences must start in Area B.
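To make the column rules concrete, here is a small sketch of a complete program laid out in the traditional fixed format. The program name AreaDemo and the paragraph name SayHello are invented for this illustration, and the sequence number area (columns 1 to 6) is simply left blank.

```cobol
      *> Cols 1-6: sequence numbers; col 7: indicator area.
      *> Cols 8-11: Area A; cols 12-72: Area B.
       IDENTIFICATION DIVISION.
       PROGRAM-ID. AreaDemo.
       PROCEDURE DIVISION.
       SayHello.
           DISPLAY "Division, section and paragraph names"
           DISPLAY "start in Area A; statements in Area B."
           STOP RUN.
```

The division header, the PROGRAM-ID paragraph and the paragraph name all begin in column 8, while the DISPLAY and STOP RUN statements begin in column 12. The comment lines carry an asterisk in column 7.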

In our example programs we use the compiler directive (available with the NetExpress COBOL compiler) - $ SET SOURCEFORMAT"FREE" - to free us from these formatting restrictions.

Ancient COBOL coding form

Name construction

All user-defined names, such as data names, paragraph names, section names, condition names and mnemonic names, must adhere to the following rules:

· They must contain at least one character, but not more than 30 characters. They must contain at least one alphabetic character. They must not begin or end with a hyphen.

· They must be constructed from the characters A to Z, the numbers 0 to 9, and the hyphen.

· They must not contain spaces.

· Names are not case-sensitive: TotalPay is the same as totalpay, Totalpay or TOTALPAY.
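As a hedged illustration of these rules, the sketch below declares two data items with valid names and then refers to one of them using different letter cases. The names Total-Pay and Invoice2Count, and the program itself, are invented for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NameDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      *> Valid: letters, digits and embedded hyphens only.
       01 Total-Pay        PIC 9(5) VALUE ZEROS.
       01 Invoice2Count    PIC 99   VALUE ZEROS.
       PROCEDURE DIVISION.
       ShowNames.
      *> Names are not case-sensitive, so TOTAL-PAY and
      *> total-pay refer to the item declared as Total-Pay.
           MOVE 125 TO TOTAL-PAY
           DISPLAY "Total pay: " total-pay
           STOP RUN.
```

A name such as -TotalPay (leading hyphen), Total Pay (embedded space) or 12345 (no alphabetic character) would break the rules above and should be rejected by the compiler.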

The structure of COBOL programs

COBOL programs are hierarchical in structure. Each element of the hierarchy consists of one or more subordinate elements.

The hierarchy consists of Divisions, Sections, Paragraphs, Sentences and Statements.

A Division may contain one or more Sections, a Section one or more Paragraphs, a Paragraph one or more Sentences and a Sentence one or more Statements.

We can represent the COBOL hierarchy using the COBOL metalanguage as follows;

Divisions
A division is a block of code, usually containing one or more sections, that starts where the division name is encountered and ends with the beginning of the next division or with the end of the program text.

Sections
A section is a block of code usually containing one or more paragraphs. A section begins with the section name and ends where the next section name is encountered or where the program text ends.

Section names are devised by the programmer, or defined by the language. A section name is followed by the word SECTION and a period.
See the two example names below -

SelectUnpaidBills SECTION.
FILE SECTION.

Paragraphs
A paragraph is a block of code made up of one or more sentences. A paragraph begins with the paragraph name and ends with the next paragraph or section name or the end of the program text.

A paragraph name is devised by the programmer or defined by the language, and is followed by a period.
See the two example names below -

PrintFinalTotals.
PROGRAM-ID.

Sentences and statements
A sentence consists of one or more statements and is terminated by a period.
For example:

MOVE .21 TO VatRate
MOVE 1235.76 TO ProductCost
COMPUTE VatAmount = ProductCost * VatRate.

A statement consists of a COBOL verb and an operand or operands.
For example:

SUBTRACT Tax FROM GrossPay GIVING NetPay
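To show sentences and statements in context, here is a minimal sketch that wraps the VAT example above in a complete program. The picture clauses, the PrintAmount item and the program name VatDemo are assumptions made for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. VatDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 VatRate       PIC V99      VALUE ZEROS.
       01 ProductCost   PIC 9(4)V99  VALUE ZEROS.
       01 VatAmount     PIC 9(4)V99  VALUE ZEROS.
       01 PrintAmount   PIC Z,ZZ9.99.
       PROCEDURE DIVISION.
       CalculateVat.
      *> One sentence: three statements, one closing period.
           MOVE .21 TO VatRate
           MOVE 1235.76 TO ProductCost
           COMPUTE VatAmount = ProductCost * VatRate.
      *> A second sentence containing two statements.
           MOVE VatAmount TO PrintAmount
           DISPLAY "VAT due: " PrintAmount.
           STOP RUN.
```

Note that the period, not the line break, marks the end of a sentence. The three statements of the first sentence could equally be written on one line.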

The Four Divisions

At the top of the COBOL hierarchy are the four divisions. These divide the program into distinct structural elements. Although some of the divisions may be omitted, the sequence in which they are specified is fixed, and must follow the order below.

General Layout

IDENTIFICATION DIVISION.
Contains program information

ENVIRONMENT DIVISION.
Contains environment information

DATA DIVISION.
Contains data descriptions

PROCEDURE DIVISION.
Contains the program algorithms

The IDENTIFICATION DIVISION

The IDENTIFICATION DIVISION supplies information about the program to the programmer and the compiler.

Most entries in the IDENTIFICATION DIVISION are directed at the programmer. The compiler treats them as comments.

The PROGRAM-ID clause is an exception to this rule. Every COBOL program must have a PROGRAM-ID because the name specified after this clause is used by the linker when linking a number of subprograms into one run unit, and by the CALL statement when transferring control to a subprogram.

The IDENTIFICATION DIVISION has the following structure:

IDENTIFICATION DIVISION.
PROGRAM-ID. ProgramName.
[AUTHOR. ProgrammerName.]
other entries here

The keywords - IDENTIFICATION DIVISION - represent the division header, and signal the commencement of the program text.

PROGRAM-ID is a paragraph name that must be specified immediately after the division header.

ProgramName is a name devised by the programmer, and must satisfy the rules for user-defined names.

Here's a typical program fragment:
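As a minimal sketch, such a fragment might look like the following. The program and author names are taken from the sample program later in this document, and the one-statement PROCEDURE DIVISION is added here only so that the fragment compiles on its own.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SequenceProgram.
       AUTHOR. Michael Coughlan.
       PROCEDURE DIVISION.
       ShowIdentity.
      *> Added only so the fragment is a complete program.
           DISPLAY "SequenceProgram started"
           STOP RUN.
```

Only the PROGRAM-ID entry matters to the compiler and linker. The AUTHOR entry is documentation for the programmer.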

The ENVIRONMENT DIVISION

The ENVIRONMENT DIVISION is used to describe the environment in which the program will run.

The purpose of the ENVIRONMENT DIVISION is to isolate in one place all aspects of the program that are dependent upon a specific computer, device or encoding sequence.

The idea behind this is to make it easy to change the program when it has to run on a different computer or one with different peripheral devices.

In the ENVIRONMENT DIVISION, aliases are assigned to external devices, files or command sequences. Other environment details, such as the collating sequence, the currency symbol and the decimal point symbol may also be defined here.
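As one hedged example of an environment detail, the sketch below uses the SPECIAL-NAMES paragraph to declare that the comma is the decimal point, as is usual in many European countries. The program name EnvDemo and the data items are invented for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EnvDemo.
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SPECIAL-NAMES.
      *> Swap the roles of the comma and the period in
      *> numeric pictures and literals.
           DECIMAL-POINT IS COMMA.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 Price        PIC 9(4)V99  VALUE 1234,50.
       01 PrintPrice   PIC Z.ZZ9,99.
       PROCEDURE DIVISION.
       ShowPrice.
           MOVE Price TO PrintPrice
           DISPLAY "Price: " PrintPrice
           STOP RUN.
```

Because the setting lives in the ENVIRONMENT DIVISION, the PROCEDURE DIVISION code does not need to know which convention is in force.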

The DATA DIVISION

As the name suggests, the DATA DIVISION provides descriptions of the data-items processed by the program.

The DATA DIVISION has two main sections: the FILE SECTION and the WORKING-STORAGE SECTION. Additional sections, such as the LINKAGE SECTION (used in subprograms) and the REPORT SECTION (used in Report Writer based programs) may also be required.

The FILE SECTION is used to describe most of the data that is sent to, or comes from, the computer's peripherals.

The WORKING-STORAGE SECTION is used to describe the general variables used in the program.

The following program fragment shows a DATA DIVISION containing a WORKING-STORAGE SECTION:

IDENTIFICATION DIVISION.
PROGRAM-ID. SequenceProgram.
AUTHOR. XXXXX.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 Num1 PIC 9 VALUE ZEROS.
01 Num2 PIC 9 VALUE ZEROS.
01 Result PIC 99 VALUE ZEROS.

The PROCEDURE DIVISION

The PROCEDURE DIVISION contains the code used to manipulate the data described in the DATA DIVISION. It is here that the programmer describes his algorithm.

The PROCEDURE DIVISION is hierarchical in structure and consists of sections, paragraphs, sentences and statements.

Only the section is optional. There must be at least one paragraph, sentence and statement in the PROCEDURE DIVISION.

Paragraph and section names in the PROCEDURE DIVISION are chosen by the programmer and must conform to the rules for user-defined names.

SAMPLE COBOL PROGRAM

IDENTIFICATION DIVISION.
PROGRAM-ID. SequenceProgram.
AUTHOR. Michael Coughlan.

DATA DIVISION.
WORKING-STORAGE SECTION.
01 Num1 PIC 9 VALUE ZEROS.
01 Num2 PIC 9 VALUE ZEROS.
01 Result PIC 99 VALUE ZEROS.

PROCEDURE DIVISION.
CalculateResult.
    ACCEPT Num1.
    ACCEPT Num2.
    MULTIPLY Num1 BY Num2 GIVING Result.
    DISPLAY "Result is = ", Result.
    STOP RUN.

CONDITIONAL STATEMENTS:

Conditional Processing
This is where a lot of uneducated programmers come unstuck! Even though COBOL allows the following:

IF <condition> [THEN] <statement-1> ELSE <statement-2> [END-IF].

there are some basic guidelines which can be applied in order to make the code more readable and easier to maintain. These are:

· Each portion (condition, ELSE, statement-1, statement-2, END-IF) should be on a separate line. This allows for future additions or deletions without having to modify more lines than is necessary.

· The word ELSE should be aligned in exactly the same column as the IF with which it is associated. This makes the association more obvious in the listing, especially with multiple or nested IFs.

· COBOL'85 allows each condition to be terminated with an END-IF. Its use should be encouraged as it makes it absolutely clear where each condition is supposed to end, thus avoiding the possibility of confusion and mistakes. Like the ELSE, the END-IF should be aligned in exactly the same column as the IF with which it is associated.

· Statement-1 and statement-2 should be indented, usually by four character positions. This allows the IF, ELSE and END-IF to be more distinctive in the listing.

This now gives us the following construction:

IF <condition>
    <statement-1>
ELSE
    <statement-2>
END-IF.

Here are some extra guidelines for nested IFs:

· For each level of nested IF, indent all associated lines by four characters. This gives the following:

IF <condition-1>
    IF <condition-2>
        <statement-1>
    ELSE
        <statement-2>
    END-IF
ELSE
    <statement-3>
END-IF.

· Don't ever use more than three levels of nested IF - they are extremely difficult to debug and maintain.

· Remember that each ELSE is paired with the nearest preceding IF that has not already been paired or terminated, not necessarily the one under which it is aligned. Take the following example:

IF <condition-1>
    IF <condition-2>
        <statement-2>
ELSE
    <statement-1>.

According to the indentation, <statement-1> is supposed to be executed if <condition-1> is false, but COBOL follows its own rules and executes <statement-1> if <condition-1> is true and <condition-2> is false. This type of error is more avoidable if the END-IF is used, as in the following example:

IF <condition-1>
    IF <condition-2>
        <statement-2>
    END-IF
ELSE
    <statement-1>
END-IF.

or...

IF <condition-1>
    IF <condition-2>
        <statement-2>
    ELSE
        <statement-1>
    END-IF
END-IF.
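
The pairing rule is easy to check for yourself. The sketch below is a runnable version of the misleading layout, with the ELSE aligned under the outer IF but no END-IF. The data items Cond1 and Cond2 and the program name are invented for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NestedIfDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 Cond1   PIC X VALUE "Y".
       01 Cond2   PIC X VALUE "N".
       PROCEDURE DIVISION.
       TestPairing.
      *> Without END-IF the ELSE pairs with the inner IF,
      *> whatever the indentation suggests.
           IF Cond1 = "Y"
               IF Cond2 = "Y"
                   DISPLAY "both true"
           ELSE
               DISPLAY "ELSE branch taken".
           STOP RUN.
```

Here the outer condition is true and the inner one is false, so the program displays "ELSE branch taken" even though the layout suggests that the ELSE belongs to the outer IF.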

· In the case where an ELSE is immediately followed by an IF without any intervening statements (i.e. where only one out of a series of conditions will be TRUE), it is not necessary to indent at each new IF; otherwise you will quickly fall off the page. Consider the following example:

IF X-VALUE = 1
    <statement-1>
ELSE
    IF X-VALUE = 2
        <statement-2>
    ELSE
        IF X-VALUE = 3
            <statement-3>
        ELSE
            IF X-VALUE = 4
                <statement-4>
            ELSE
                IF X-VALUE = 5
                    <statement-5>
                etc.

Without the extra indentation, the same code becomes:

IF X-VALUE = 1
    <statement-1>
ELSE
IF X-VALUE = 2
    <statement-2>
ELSE
IF X-VALUE = 3
    <statement-3>
ELSE
IF X-VALUE = 4
    <statement-4>
ELSE
IF X-VALUE = 5
    <statement-5>
etc.

· With the arrival of COBOL'85 this should be written as follows:

EVALUATE X-VALUE
    WHEN 1 <statement-1>
    WHEN 2 <statement-2>
    WHEN 3 <statement-3>
    WHEN 4 <statement-4>
    WHEN 5 <statement-5>
    WHEN OTHER .....
END-EVALUATE.
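
As a small runnable sketch of the EVALUATE form, the program below tests a one-digit item. The initial value of 3, the DISPLAY statements and the program name are assumptions made for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EvalDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 X-VALUE     PIC 9 VALUE 3.
       PROCEDURE DIVISION.
       ClassifyValue.
      *> Only the first matching WHEN branch is executed.
           EVALUATE X-VALUE
               WHEN 1     DISPLAY "one"
               WHEN 2     DISPLAY "two"
               WHEN 3     DISPLAY "three"
               WHEN OTHER DISPLAY "something else"
           END-EVALUATE
           STOP RUN.
```

With X-VALUE set to 3 the program displays "three". The WHEN OTHER branch plays the part of the final ELSE in the chain of IFs.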

Here are even more guidelines for complex conditions:

· Enclose each individual condition in parentheses.

· If several conditions combine to form a group condition (i.e. all conditions have to be true in order to make the group condition true), then enclose the whole group in parentheses as well.

· By having each condition on a separate line, and by careful alignment of ANDs and ORs, it is possible to make absolutely clear whether conditions are linked or are alternatives.

These guidelines should produce something like this:

IF ((A = 1 OR 2 OR 3)
    AND
    (B NOT = 4))
OR ((C = "A" OR "Z")
    OR
    (D < E))
    <statement>
END-IF.

This example, however, is rapidly approaching the stage at which it becomes too unwieldy to be maintainable. Don't be afraid to split a complex condition into its component parts, even if it involves the use of the GO TO statement. Don't try to prove how clever you can be - keep it simple and straightforward.

Data Types & Variables

COBOL Data Types

Introduction

There are three categories of data item used in COBOL programs:

· Variables
· Literals
· Figurative Constants

A data-name or identifier is the name used to identify the area of memory reserved for a variable. A variable is a named location in memory into which a program can put data, and from which it can retrieve data.

Variables

Every variable used in a COBOL program must be described in the DATA DIVISION.

In addition to the data-name, a variable declaration also defines the type of data to be stored in the variable. This is known as the variable's data type.

Variable Data types

Some languages, like Modula-2, Pascal or Ada, are described as being strongly typed. In these languages there are a large number of different data types and the distinction between them is rigorously enforced by the compiler. For instance, the compiler will reject a statement that attempts to assign a character value to an integer data item.

In COBOL, there are really only three data types -

· Numeric
· Alphanumeric (text/string)
· Alphabetic

The distinction between these data types is a little blurred and only weakly enforced by the compiler. For instance, it is perfectly possible to assign a non-numeric value to a data item that has been declared to be numeric.

The problem with this lax approach to data typing is that, since COBOL programs crash (halt unexpectedly) if they attempt to do computations on items that contain non-numeric data, it is up to the programmer to make sure this never happens.

COBOL programmers must make sure that non-numeric data is never assigned to numeric items intended for use in calculations. Programmers who use strongly typed languages don't need this level of discipline because the compiler ensures that a variable of a particular type can only be assigned appropriate values.

Literals

A literal is a data-item that consists only of the data-item value itself. It cannot be referred to by a name. By definition, literals are constant data-items.

There are two types of literal -

· String/Alphanumeric Literals
· Numeric Literals

String Literals

String/Alphanumeric literals are enclosed in quotes and consist of alphanumeric characters.

For example: "Michael Ryan", "-123", "123.45"

Numeric Literals

Numeric literals may consist of numerals, the decimal point, and the plus or minus sign. Numeric literals are not enclosed in quotes.

For example: 123, 123.45, -256, +2987.

Figurative Constants

Unlike most other programming languages, COBOL does not provide a mechanism for creating user-defined constants but it does provide a set of special constants called Figurative Constants.

A Figurative Constant may be used wherever it is legal to use a literal but, unlike literals, when a Figurative Constant is assigned to a data-item it fills the whole item, overwriting everything in it.

The Figurative Constants are:

· SPACE or SPACES: acts like one or more spaces
· ZERO or ZEROS or ZEROES: acts like one or more zeros
· QUOTE or QUOTES: used instead of a quotation mark
· HIGH-VALUE or HIGH-VALUES: uses the maximum value possible
· LOW-VALUE or LOW-VALUES: uses the minimum value possible
· ALL literal: allows an ordinary literal to act as a Figurative Constant

Figurative Constant Notes

· When the ALL Figurative Constant is used, it must be followed by a one character literal. The designated literal then acts like the standard Figurative Constants.

· ZERO, ZEROS and ZEROES are synonyms, not separate Figurative Constants. The same applies to SPACE and SPACES, QUOTE and QUOTES, HIGH-VALUE and HIGH-VALUES, LOW-VALUE and LOW-VALUES.
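
To illustrate how a Figurative Constant fills the whole of the receiving item, here is a minimal sketch. The data items CustName, ItemCount and RuleLine, and the program name, are invented for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FigConstDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 CustName    PIC X(10) VALUE "Michael".
       01 ItemCount   PIC 9(4)  VALUE 1234.
       01 RuleLine    PIC X(10).
       PROCEDURE DIVISION.
       FillItems.
      *> Each figurative constant fills the whole item.
           MOVE SPACES  TO CustName
           MOVE ZEROS   TO ItemCount
           MOVE ALL "-" TO RuleLine
           DISPLAY "[" CustName "]"
           DISPLAY "[" ItemCount "]"
           DISPLAY "[" RuleLine "]"
           STOP RUN.
```

After the three MOVE statements, CustName holds ten spaces, ItemCount holds four zeros and RuleLine holds ten hyphens, whatever the items contained before.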

File Handling:

Sequential files

COBOL is generally used in situations where the volume of data to be processed is large. These systems are sometimes referred to as “data intensive” systems. Generally, large volumes arise not because the data is inherently voluminous but because the same items of information have been recorded about a great many instances of the same object. Record-based files are used to record this information.

Files, Records, Fields

In record-based files:

· We use the term file to describe a collection of one or more occurrences (instances) of a record type (template).

· We use the term record to describe a collection of fields which record information about an object.

· We use the term field to describe an item of information recorded about an object (e.g. StudentName, DateOfBirth).

Record instance vs Record type

It is important to distinguish between a record occurrence (i.e. the values of a record) and the record type or template (i.e. the structure of the record).

Each record occurrence in a file will have a different value but every record in the file will have the same structure.

For instance, in the student details file, illustrated below, the occurrences of the student records are actual values in the file. The record type/template describes the structure of each record occurrence.

The record buffer

Before a computer can do any processing on a piece of data, the data must be loaded into main memory (RAM). The CPU can only address data that is in RAM.

A record-based file may consist of hundreds of thousands, millions or even tens of millions of records, and may require gigabytes of storage. Files of this size cannot be processed by loading the whole file into memory in one go. Instead, files are processed by reading the records into memory, one at a time.

To store the record read into memory and to allow access to the individual fields of the record, a programmer must declare the record structure (see the diagram above) in his program. The computer uses the programmer's description of the record (the record template) to set aside sufficient memory to store one instance of the record. The memory allocated for storing a record is usually called a "record buffer".

A record buffer is capable of storing the data recorded for only one instance of the record. To process a file a program must read the records one at a time into the record buffer. The record buffer is the only connection between the program and the records in the file.

If a program processes more than one file, a record buffer must be defined for each file.

To process all the records in an INPUT file, we must ensure that each record instance is copied (read) from the file, into the record buffer, when required.

To create an OUTPUT file containing data records, we must ensure that each record is placed in the record buffer and then transferred (written) to the file.

To transfer a record from an input file to an output file we must read the record into the input record buffer, transfer it to the output record buffer and then write the data to the output file from the output record buffer. This type of data transfer between ‘buffers’ is quite common in COBOL programs.

Declaring Records and Files:

Creating a record

To create a record buffer large enough to store one instance of a record, containing the information described above, we must decide on the type and size of each of the fields.

· The student identity number is 7 digits in size so we need to declare the data-item to hold it as PIC 9(7).
· To store the student name, we will assume that we require only 10 characters. So we can declare a data-item to hold it as PIC X(10).
· The date of birth is 8 digits long so we declare it as PIC 9(8).
· The course code is 4 characters long so we declare it as PIC X(4).
· Finally, the gender is only one character so we declare it as PIC X.

The fields described above are individual data items but we must collect them together into a record structure as follows:

01 StudentRec.
   02 StudentId PIC 9(7).
   02 StudentName PIC X(10).
   02 DateOfBirth PIC 9(8).
   02 CourseCode PIC X(4).
   02 Gender PIC X.

The record description above is correct as far as it goes. It reserves the correct amount of storage for the record buffer. But it does not allow us to access all the individual parts of the record that we might require.

For instance, the name is actually made up of the student's surname and initials while the date consists of 4 digits for the year, 2 digits for the month and 2 digits for the day.

To allow us to access these fields individually we need to declare the record as follows:

01 StudentRec.
   02 StudentId PIC 9(7).
   02 StudentName.
      03 Surname PIC X(8).
      03 Initials PIC XX.
   02 DateOfBirth.
      03 YOBirth PIC 9(4).
      03 MOBirth PIC 99.
      03 DOBirth PIC 99.
   02 CourseCode PIC X(4).
   02 Gender PIC X.

In this description, StudentName is a group item consisting of Surname and Initials, and DateOfBirth consists of YOBirth, MOBirth and DOBirth.
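
To see what the group items buy us, consider the minimal sketch below. It places the record description in the WORKING-STORAGE SECTION (rather than in a file description) purely so that it can run without a data file; the 30-character sample value and the program name are invented for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GroupDemo.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01 StudentRec.
          02 StudentId       PIC 9(7).
          02 StudentName.
             03 Surname      PIC X(8).
             03 Initials     PIC XX.
          02 DateOfBirth.
             03 YOBirth      PIC 9(4).
             03 MOBirth      PIC 99.
             03 DOBirth      PIC 99.
          02 CourseCode      PIC X(4).
          02 Gender          PIC X.
       PROCEDURE DIVISION.
       ShowFields.
      *> Sample 30-character record; the values are invented.
           MOVE "9312345COUGHLANMS19800226LM51M" TO StudentRec
           DISPLAY "Surname : " Surname
           DISPLAY "Year    : " YOBirth
           DISPLAY "Course  : " CourseCode
           STOP RUN.
```

A single MOVE to the group item StudentRec fills all 30 character positions, after which each subordinate item gives access to its own slice of the record.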

Declaring a record buffer in your program

The record type/template/buffer of every file used in a program must be described in the FILE SECTION by means of an FD (file description) entry. The FD entry consists of the letters FD and an internal name that the programmer assigns to the file.

So the full file description for the students file might be:

DATA DIVISION.
FILE SECTION.
FD StudentFile.
01 StudentRec.
   02 StudentId PIC 9(7).
   02 StudentName.
      03 Surname PIC X(8).
      03 Initials PIC XX.
   02 DateOfBirth.
      03 YOBirth PIC 9(4).
      03 MOBirth PIC 99.
      03 DOBirth PIC 99.
   02 CourseCode PIC X(4).
   02 Gender PIC X.

Note that we have assigned the name StudentFile as the internal file name. The actual name of the file on disk is Students.Dat.

The SELECT and ASSIGN clause

Although the name of the students file on disk is Students.Dat we are going to refer to it in our program as StudentFile. How can we connect the name we are going to use internally with the actual name of the file on disk?

The internal file name used in a file's FD entry is connected to an external file (on disk, tape or CD-ROM) by means of the SELECT and ASSIGN clause. The SELECT and ASSIGN clause is an entry in the FILE-CONTROL paragraph in the INPUT-OUTPUT SECTION in the ENVIRONMENT DIVISION.

ENVIRONMENT DIVISION.
INPUT-OUTPUT SECTION.
FILE-CONTROL.
    SELECT StudentFile
        ASSIGN TO "STUDENTS.DAT".

DATA DIVISION.
FILE SECTION.
FD StudentFile.
01 StudentRec.
   02 StudentId PIC 9(7).
   02 StudentName.
      03 Surname PIC X(8).
      03 Initials PIC XX.
   02 DateOfBirth.
      03 YOBirth PIC 9(4).
      03 MOBirth PIC 99.
      03 DOBirth PIC 99.
   02 CourseCode PIC X(4).
   02 Gender PIC X.
SELECT and ASSIGN syntax for Sequential
fil
The
Microfocus COBOL compiler recognizes two kinds of Sequential File organization 
LINE SEQUENTIAL

and 

RECORD SEQUENTIAL.
LINE
SEQUENTIAL files, are files in which each record is followed by the carriage
return and line feed characters. These are the kind of files produced by a text
editor such as Notepad. 
RECORD
SEQUENTIAL files, are files where the file consists of a stream of bytes. Only
the fact that we know the size of each record allows us to retrieve them. Files
that are not record based, can be processed by defining them as RECORD
SEQUENTIAL.
The ExternalFileReference can be a simple file name, or a full, or a partial, file specification. If a simple file name is used, the drive and directory where the program is running are assumed, but we may choose to include the full path to the file. For instance, we could associate the StudentFile with an actual file using statements like:
SELECT StudentFile
    ASSIGN TO "D:\Cobol\ExampleProgs\Students.Dat".

SELECT StudentFile
    ASSIGN TO "A:\Students.Dat".
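The SELECT entries above can be combined with an ORGANIZATION clause to say which kind of sequential file is meant. The following is a minimal sketch, assuming a compiler that supports the LINE SEQUENTIAL extension (Micro Focus and GnuCOBOL do). The program name SelectDemo and the sample field values are made up for illustration. Be aware that it creates STUDENTS.DAT in the current directory and replaces any file of that name.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SelectDemo.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           *> LINE SEQUENTIAL: each record is one line of text.
           SELECT StudentFile ASSIGN TO "STUDENTS.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  StudentFile.
       01  StudentRec.
           02  StudentId       PIC 9(7).
           02  StudentName.
               03  Surname     PIC X(8).
               03  Initials    PIC XX.
           02  DateOfBirth.
               03  YOBirth     PIC 9(4).
               03  MOBirth     PIC 99.
               03  DOBirth     PIC 99.
           02  CourseCode      PIC X(4).
           02  Gender          PIC X.

       PROCEDURE DIVISION.
       Begin.
           *> OPEN OUTPUT replaces any existing STUDENTS.DAT.
           OPEN OUTPUT StudentFile
           *> Sample values, invented for this sketch.
           MOVE 1234567  TO StudentId
           MOVE "Smith"  TO Surname
           MOVE "JB"     TO Initials
           MOVE 1980     TO YOBirth
           MOVE 1        TO MOBirth
           MOVE 2        TO DOBirth
           MOVE "LM51"   TO CourseCode
           MOVE "F"      TO Gender
           WRITE StudentRec
           CLOSE StudentFile
           STOP RUN.
```

Because the file is LINE SEQUENTIAL, the result can be inspected with any text editor. With the Micro Focus compiler, writing RECORD SEQUENTIAL in place of LINE SEQUENTIAL would select the other organization described above.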


File Handling Verbs:

Introduction
Sequential files are uncomplicated. To write programs that process Sequential Files you only need to know four new verbs: OPEN, CLOSE, READ and WRITE.
You must ensure that (before terminating) your program closes all the files it has opened. Failure to do so may result in data not being written to the file or users being prevented from accessing the file.
The OPEN verb
Before your program can access the data in an input file or place data in an output file, you must make the file available to the program by OPENing it.
When you open a file you have to indicate how you intend to use it (e.g. INPUT, OUTPUT, EXTEND) so that the system can manage the file correctly. Opening a file does not transfer any data to the record buffer; it simply provides access.
OPEN notes
When a file is opened for INPUT or EXTEND, the file must exist or the OPEN will fail.
When a file is opened for INPUT, the Next Record Pointer is positioned at the beginning of the file.
When the file is opened for EXTEND, the Next Record Pointer is positioned after the last record in the file. This allows records to be appended to the file.
When a file is opened for OUTPUT, it is created if it does not exist, and is overwritten if it already exists.
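The three OPEN modes can be seen side by side in a small program. This is only a sketch: the file name OPENDEMO.DAT, the program name OpenModes and the record contents are assumptions chosen for illustration, and LINE SEQUENTIAL is a compiler extension rather than standard COBOL.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. OpenModes.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT LogFile ASSIGN TO "OPENDEMO.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  LogFile.
       01  LogRec              PIC X(20).

       PROCEDURE DIVISION.
       Begin.
           *> OUTPUT creates the file, or overwrites an existing one.
           OPEN OUTPUT LogFile
           MOVE "first record" TO LogRec
           WRITE LogRec
           CLOSE LogFile

           *> EXTEND positions after the last record, so this appends.
           OPEN EXTEND LogFile
           MOVE "second record" TO LogRec
           WRITE LogRec
           CLOSE LogFile

           *> INPUT positions at the start; OPEN itself reads nothing.
           OPEN INPUT LogFile
           READ LogFile
               AT END DISPLAY "file is empty"
               NOT AT END DISPLAY "first: " LogRec
           END-READ
           CLOSE LogFile
           STOP RUN.
```

After the run the file should hold both records, and the READ should return the first of them, since opening for INPUT places the Next Record Pointer at the beginning of the file.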
The CLOSE verb
CLOSE InternalFileName...
You must ensure that, before
terminating, your program closes all the files it has opened. Failure to do so
may result in some data not being written to the file or users being prevented
from accessing the file.
The READ verb
Once the system has opened a file and made it available to the program, it is the programmer's responsibility to process it correctly. To process all the records in the file we have to transfer them, one record at a time, from the file to the file's record buffer. The READ is provided for this purpose.
The READ copies a record occurrence/instance from the file and places it in the record buffer.
READ notes
When the READ attempts to read a record from the file and encounters the end of file marker, the AT END is triggered and the StatementBlock following the AT END is executed.
Using the INTO Identifier clause causes the data to be read into the record buffer and then copied from there to the Identifier, in one operation. When this option is used, there will be two copies of the data: one in the record buffer and one in the Identifier. Using this clause is the equivalent of executing a READ and then moving the contents of the record buffer to the Identifier.
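The effect of the INTO clause can be sketched as follows. The program assumes that a LINE SEQUENTIAL file called STUDENTS.DAT, holding 30-character student records, already exists in the current directory. The names ReadInto, WS-Student, WS-EofFlag and NoMoreRecords are hypothetical and used only for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ReadInto.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT StudentFile ASSIGN TO "STUDENTS.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  StudentFile.
       01  StudentRec          PIC X(30).

       WORKING-STORAGE SECTION.
       01  WS-Student.
           02  WS-StudentId    PIC 9(7).
           02  FILLER          PIC X(23).
       01  WS-EofFlag          PIC X VALUE "N".
           88  NoMoreRecords   VALUE "Y".

       PROCEDURE DIVISION.
       Begin.
           OPEN INPUT StudentFile
           *> One statement: read into the buffer, then copy it on.
           READ StudentFile INTO WS-Student
               AT END SET NoMoreRecords TO TRUE
           END-READ
           IF NOT NoMoreRecords
               *> Both copies of the data are now available.
               DISPLAY "Buffer copy : " StudentRec
               DISPLAY "Working copy: " WS-Student
           END-IF
           CLOSE StudentFile
           STOP RUN.
```

If the file contains at least one record, both DISPLAY statements should show the same data, which is the "two copies" behaviour described above.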
How the READ works
When a record is read it is copied from the backing storage file into the record buffer in RAM. When an attempt to READ detects the end of file, the AT END is triggered and its statement block sets the condition name EndOfFile to true. Since the condition name is set up as shown below, setting it to true fills the whole record with HIGH-VALUES.

FD StudentFile.
01 StudentRec.
   88 EndOfFile VALUE HIGH-VALUES.
   02 StudentId PIC 9(7).
   etc
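Putting the record description above to work, a typical read loop might look like the sketch below. It assumes that a LINE SEQUENTIAL file STUDENTS.DAT with 30-character records exists in the current directory. The program name ReadLoop is made up, and the fields after StudentId are collapsed into simplified items to keep the example short.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ReadLoop.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT StudentFile ASSIGN TO "STUDENTS.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  StudentFile.
       01  StudentRec.
           88  EndOfFile       VALUE HIGH-VALUES.
           02  StudentId       PIC 9(7).
           02  StudentName     PIC X(10).
           02  FILLER          PIC X(13).

       PROCEDURE DIVISION.
       Begin.
           OPEN INPUT StudentFile
           *> Read ahead: fetch the first record before the loop.
           READ StudentFile
               AT END SET EndOfFile TO TRUE
           END-READ
           PERFORM UNTIL EndOfFile
               DISPLAY StudentId SPACE StudentName
               *> Fetch the next record, or flag the end of file.
               READ StudentFile
                   AT END SET EndOfFile TO TRUE
               END-READ
           END-PERFORM
           CLOSE StudentFile
           STOP RUN.
```

The SET statement in the AT END block is what fills StudentRec with HIGH-VALUES, and the PERFORM then stops because the EndOfFile condition has become true.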
The WRITE verb
WRITE RecordName [FROM Identifier]
The WRITE verb is used to copy data from the record buffer (RAM) to the file on backing storage (Disk, tape or CD-ROM).
To WRITE data to a file we must move the data to the record buffer (declared in the FD entry) and then WRITE the contents of record buffer to the file.
When the WRITE..FROM is used the data contained in the Identifier is copied into the record buffer and is then written to the file. The WRITE..FROM is the equivalent of a MOVE Identifier TO RecordBuffer statement followed by a WRITE RecordBuffer statement.
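The equivalence between WRITE..FROM and a MOVE followed by a WRITE can be sketched like this. The output file NEWSTUDS.DAT, the program name WriteFrom, the WS- names and the sample values are all assumptions made for illustration. The program writes the same data twice, once in each style.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. WriteFrom.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT StudentFile ASSIGN TO "NEWSTUDS.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  StudentFile.
       01  StudentRec          PIC X(30).

       WORKING-STORAGE SECTION.
       *> Sample data, invented for this sketch (30 characters).
       01  WS-NewStudent.
           02  WS-StudentId    PIC 9(7)  VALUE 1234567.
           02  WS-Surname      PIC X(8)  VALUE "Smith".
           02  WS-Initials     PIC XX    VALUE "JB".
           02  WS-DateOfBirth  PIC 9(8)  VALUE 19800102.
           02  WS-CourseCode   PIC X(4)  VALUE "LM51".
           02  WS-Gender       PIC X     VALUE "F".

       PROCEDURE DIVISION.
       Begin.
           OPEN OUTPUT StudentFile

           *> Style 1: copy to the buffer and write in one statement.
           WRITE StudentRec FROM WS-NewStudent

           *> Style 2: the equivalent MOVE followed by a plain WRITE.
           MOVE WS-NewStudent TO StudentRec
           WRITE StudentRec

           CLOSE StudentFile
           STOP RUN.
```

After the run, NEWSTUDS.DAT should contain two identical records.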
Read a file, Write a record
If you were paying close attention to the syntax diagrams above you probably noticed that while we READ a file, we must WRITE a record.
The reason we read a file but write a record is that a file can contain a number of different types of record. For instance, if we want to update the students file we might have a file of transaction records that contains Insertion records and Deletion records. While the Insertion records would contain all the student record fields, the Deletion records only need the StudentId.
When we read a record from the transaction file we don't know which of the types will be supplied; so we must READ Filename. It is the programmer's responsibility to discover what type of record has been supplied.
When we write a record to a file we have to specify which of the record types we want to write; so we must WRITE RecordName.
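One way the transaction file described above might be declared and processed is sketched below. The one-character TypeCode field, the file name TRANS.DAT, the names TransTypes, DelStudentId, EofFlag and EndOfTrans, and the sample values are all hypothetical. The text only says that the programmer must discover the record type, not how, so a leading type code is just one common convention.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. TransTypes.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT TransFile ASSIGN TO "TRANS.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  TransFile.
       *> Two record types share the file's single record buffer.
       01  InsertionRec.
           02  TypeCode        PIC X.
               88  Insertion   VALUE "I".
               88  Deletion    VALUE "D".
           02  StudentId       PIC 9(7).
           02  StudentName     PIC X(10).
           02  DateOfBirth     PIC 9(8).
           02  CourseCode      PIC X(4).
           02  Gender          PIC X.
       01  DeletionRec.
           02  FILLER          PIC X.
           02  DelStudentId    PIC 9(7).

       WORKING-STORAGE SECTION.
       01  EofFlag             PIC X VALUE "N".
           88  EndOfTrans      VALUE "Y".

       PROCEDURE DIVISION.
       Begin.
           *> WRITE names the record type being written.
           OPEN OUTPUT TransFile
           MOVE "I" TO TypeCode
           MOVE 1234567 TO StudentId
           MOVE "Smith   JB" TO StudentName
           MOVE 19800102 TO DateOfBirth
           MOVE "LM51" TO CourseCode
           MOVE "F" TO Gender
           WRITE InsertionRec
           MOVE "D" TO TypeCode
           MOVE 7654321 TO DelStudentId
           WRITE DeletionRec
           CLOSE TransFile

           *> READ names the file; the type is discovered afterwards.
           OPEN INPUT TransFile
           PERFORM UNTIL EndOfTrans
               READ TransFile
                   AT END SET EndOfTrans TO TRUE
                   NOT AT END PERFORM ShowTransaction
               END-READ
           END-PERFORM
           CLOSE TransFile
           STOP RUN.

       ShowTransaction.
           EVALUATE TRUE
               WHEN Insertion DISPLAY "Insert " StudentId
               WHEN Deletion  DISPLAY "Delete " DelStudentId
               WHEN OTHER     DISPLAY "Unknown type " TypeCode
           END-EVALUATE.
```

Both 01 entries describe the same area of memory, so TypeCode can be tested whichever type of record arrived.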
Tables or Arrays:
Tables and Occurs
A powerful feature of COBOL is the use of tables, via the "OCCURS" and "OCCURS DEPENDING ON" clauses. This section describes COBOL Tables and the OCCURS and OCCURS DEPENDING ON clauses, both of which cause fields or groups to repeat some number of times.
Tables and the OCCURS clause
Suppose you wanted to store
your monthly sales figures for the year. You could define 12 fields, one for
each month, like this: 
   05  MONTHLY-SALES-1    PIC S9(5)V99.
   05  MONTHLY-SALES-2    PIC S9(5)V99.
   05  MONTHLY-SALES-3    PIC S9(5)V99.
   ...
   05  MONTHLY-SALES-11   PIC S9(5)V99.
   05  MONTHLY-SALES-12   PIC S9(5)V99.
But there's an easier way in COBOL.  You can
specify the field once and declare that it repeats 12 times. 
You do this with the OCCURS clause, like this: 
   05  MONTHLY-SALES  OCCURS 12 TIMES  PIC S9(5)V99.
(By now you should also know
this can be written on two lines like this): 
   05  MONTHLY-SALES  OCCURS 12 TIMES  

                                   PIC S9(5)V99.
This specifies 12 fields, all
of which have the same PIC, and is called a table (also called an array). 
The individual fields are referenced in COBOL by using subscripts, such
as "MONTHLY-SALES(1)".  This table occupies 84 bytes in the
record (12 * (5+2)). (The sign is embedded, not separate, and the decimal is
implied.) 
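Subscripts are most useful when they are variables. The sketch below fills the table with made-up figures and then totals it. The names SalesTotal, SALES-RECORD, MONTH-IDX, YEAR-TOTAL and YEAR-TOTAL-OUT, and the sample amounts, are assumptions introduced for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SalesTotal.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  SALES-RECORD.
           05  MONTHLY-SALES  OCCURS 12 TIMES  PIC S9(5)V99.
       01  MONTH-IDX           PIC 99.
       01  YEAR-TOTAL          PIC S9(7)V99 VALUE ZERO.
       01  YEAR-TOTAL-OUT      PIC -(7)9.99.

       PROCEDURE DIVISION.
       Begin.
           *> Fill the table with sample figures (month * 100.50).
           PERFORM VARYING MONTH-IDX FROM 1 BY 1 UNTIL MONTH-IDX > 12
               COMPUTE MONTHLY-SALES(MONTH-IDX) = MONTH-IDX * 100.50
           END-PERFORM

           *> Walk the table with the subscript and total it.
           PERFORM VARYING MONTH-IDX FROM 1 BY 1 UNTIL MONTH-IDX > 12
               ADD MONTHLY-SALES(MONTH-IDX) TO YEAR-TOTAL
           END-PERFORM

           MOVE YEAR-TOTAL TO YEAR-TOTAL-OUT
           DISPLAY "Year total : " YEAR-TOTAL-OUT
           *> 12 occurrences of 7 digits each.
           DISPLAY "Table bytes: " FUNCTION LENGTH(SALES-RECORD)
           STOP RUN.
```

With these sample figures the total works out to 7839.00, and the length should be reported as 84, matching the size calculated above.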
The OCCURS can also be at the group level, and this
is the most useful application of OCCURS.  For example, all 25 line items
on an invoice (75 fields) could be held in this group: 
   05  LINE-ITEMS OCCURS 25 TIMES.

       10  QUANTITY            PIC 9999.

       10  DESCRIPTION         PIC X(30).

       10  UNIT-PRICE          PIC S9(5)V99.
Notice the OCCURS is listed at the group level, so the entire group occurs 25 times. The order of the data in the file is as if you had specified multiple groups, like this:
   05  LINE-ITEMS-1.
       10  QUANTITY            PIC 9999.
       10  DESCRIPTION         PIC X(30).
       10  UNIT-PRICE          PIC S9(5)V99.

   05  LINE-ITEMS-2.
       10  QUANTITY            PIC 9999.
       10  DESCRIPTION         PIC X(30).
       10  UNIT-PRICE          PIC S9(5)V99.

      ...

   05  LINE-ITEMS-25.
       10  QUANTITY            PIC 9999.
       10  DESCRIPTION         PIC X(30).
       10  UNIT-PRICE          PIC S9(5)V99.
There can be nested occurs -- an occurs within an occurs.  In the next example, suppose we stock ten products and we want to keep a record of the monthly sales of each product for the past 12 months. We could do just that with this table:
   01  INVENTORY-RECORD.

       05  INVENTORY-ITEM OCCURS 10 TIMES.

           10  MONTHLY-SALES OCCURS 12 TIMES  PIC 999.
In this case, "INVENTORY-ITEM" is a group composed only of "MONTHLY-SALES", which occurs 12 times for each occurrence of an inventory item.  This gives an array (table) of 10 * 12 fields.  The only information in this record is the 120 monthly sales figures -- 12 months for each of 10 items.
We could also have a description for each item. The
description would go under the 05 level INVENTORY-ITEM group, at the 10 level,
the same as the monthly sales.  Further, we could track, say, the sale
price of each item for each month.  A record which will do these things
is: 
   01  INVENTORY-RECORD.

       05  INVENTORY-ITEM OCCURS 10 TIMES.

           10  ITEM-DESCRIPTION               PIC X(30).

           10  MONTHLY-SALES OCCURS 12 TIMES.

               15  QUANTITY-SOLD              PIC 999.

               15  UNIT-PRICE                 PIC 9(5)V99.
Notice we have made MONTHLY-SALES
a group, which now contains two fields, and the whole group repeats 12 times
for each instance of INVENTORY-ITEM.  This short layout has 250 fields:
two fields (QUANTITY-SOLD and UNIT-PRICE) that repeat 12 times for each
inventory item, times 10 items, plus the ITEM-DESCRIPTION field for each of the
10 items.  Fields and groups can be nested several levels deep, and it's
possible to have thousands of fields in a layout only a couple pages long. 
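Fields inside a nested table are referenced with one subscript per OCCURS level, outermost first. The following sketch uses the record above. The names NestedTable, ITEM-REVENUE and REVENUE-OUT, and the sample item and figures, are invented for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NestedTable.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  INVENTORY-RECORD.
           05  INVENTORY-ITEM OCCURS 10 TIMES.
               10  ITEM-DESCRIPTION           PIC X(30).
               10  MONTHLY-SALES OCCURS 12 TIMES.
                   15  QUANTITY-SOLD          PIC 999.
                   15  UNIT-PRICE             PIC 9(5)V99.
       01  ITEM-REVENUE        PIC 9(9)V99.
       01  REVENUE-OUT         PIC Z(6)9.99.

       PROCEDURE DIVISION.
       Begin.
           *> Spaces in the descriptions, zeros in the numeric fields.
           INITIALIZE INVENTORY-RECORD

           *> One subscript: the item. Two: the item, then the month.
           MOVE "WIDGET" TO ITEM-DESCRIPTION(3)
           MOVE 25       TO QUANTITY-SOLD(3, 7)
           MOVE 19.99    TO UNIT-PRICE(3, 7)

           COMPUTE ITEM-REVENUE =
               QUANTITY-SOLD(3, 7) * UNIT-PRICE(3, 7)
           MOVE ITEM-REVENUE TO REVENUE-OUT
           DISPLAY ITEM-DESCRIPTION(3) " month 7 revenue: "
                   REVENUE-OUT
           STOP RUN.
```

ITEM-DESCRIPTION sits under only one OCCURS and so takes a single subscript, while QUANTITY-SOLD and UNIT-PRICE sit under two and need both. With these sample values the revenue is 25 * 19.99 = 499.75.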
Occurs Depending On
One really great feature of
COBOL tables, and a really nasty one to convert to other languages, is the
"OCCURS DEPENDING ON".  This is an OCCURS, like above, but the
number of times it occurs in a particular record can vary (between some
limits). The number of times it actually occurs in any particular record will
be given by a value in another field of that record. This creates records that
vary in size from record to record. 
The OCCURS-DEPENDING-ON can include many subordinate fields and groups, all of which occur multiple times.  Further, most compilers allow one or more (fixed) OCCURS to be nested within an OCCURS-DEPENDING-ON, and some compilers allow multiple OCCURS-DEPENDING-ON to be nested, or to occur in succession.  This can get pretty involved, so we will only give one simple example, that of a patient's medical treatment-history record.
   01  PATIENT-TREATMENTS.

       05  PATIENT-NAME                PIC X(30).

       05  PATIENT-SS-NUMBER           PIC 9(9).

       05  NUMBER-OF-TREATMENTS        PIC 99 COMP-3.

       05  TREATMENT-HISTORY OCCURS 0 TO 50 TIMES

              DEPENDING ON NUMBER-OF-TREATMENTS

              INDEXED BY TREATMENT-POINTER.

           10  TREATMENT-DATE.

               15  TREATMENT-DAY        PIC 99.

               15  TREATMENT-MONTH      PIC 99.

               15  TREATMENT-YEAR       PIC 9(4).

           10  TREATING-PHYSICIAN       PIC X(30).

           10  TREATMENT-CODE           PIC 99.
Here are the significant points of this record:
- The name of the record is "PATIENT-TREATMENTS".
- The first three fields "PATIENT-NAME", "PATIENT-SS-NUMBER", and "NUMBER-OF-TREATMENTS" occur in the fixed portion of every record.  This fixed portion is the same for every record.
- The TREATMENT-HISTORY group is the variable portion of the record. It can occur from 0 to 50 times.
- "NUMBER-OF-TREATMENTS" is a number from 0 to 50 that tells us how many times the group TREATMENT-HISTORY occurs in this record.
- The value in NUMBER-OF-TREATMENTS is stored in COMP-3 (packed decimal) format. This is very common. Also very common is COMP, or binary, format.  Neither of these is stored as readable text characters.
- TREATMENT-HISTORY is a group that is comprised of all the lower level fields beneath it (down to the next 05 level, or the end of the record).
- All the fields and groups within TREATMENT-HISTORY occur between 0 and 50 times.
- Because 0 is a valid number of occurrences, it is possible the variable portion of the record is not present.
- The "INDEXED BY TREATMENT-POINTER" clause may or may not be present.  If present it tells the compiler the name of the variable (TREATMENT-POINTER) to use as the index into the array.  If you don't understand this, you can safely ignore the "indexed by..." clause, unless you are programming in COBOL.
- TREATMENT-DATE is a group that is comprised of the day, month, and year fields beneath it.
- These records vary in size from 41 to 2041 bytes, and would be stored in some type of variable length file.
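The points above can be tried out with a short program. This is a sketch only: the program name OdoDemo and the patient and physician values are invented, and the record is held in WORKING-STORAGE rather than read from a variable length file. The size arithmetic in the comments follows the layout: 30 + 9 + 2 bytes for the fixed portion (PIC 99 COMP-3 occupies 2 bytes) and 2 + 2 + 4 + 30 + 2 = 40 bytes for each treatment.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. OdoDemo.

       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  PATIENT-TREATMENTS.
           05  PATIENT-NAME                PIC X(30).
           05  PATIENT-SS-NUMBER           PIC 9(9).
           05  NUMBER-OF-TREATMENTS        PIC 99 COMP-3.
           05  TREATMENT-HISTORY OCCURS 0 TO 50 TIMES
                  DEPENDING ON NUMBER-OF-TREATMENTS
                  INDEXED BY TREATMENT-POINTER.
               10  TREATMENT-DATE.
                   15  TREATMENT-DAY        PIC 99.
                   15  TREATMENT-MONTH      PIC 99.
                   15  TREATMENT-YEAR       PIC 9(4).
               10  TREATING-PHYSICIAN       PIC X(30).
               10  TREATMENT-CODE           PIC 99.

       PROCEDURE DIVISION.
       Begin.
           *> Sample values, invented for this sketch.
           MOVE "SAMPLE PATIENT" TO PATIENT-NAME
           MOVE 123456789 TO PATIENT-SS-NUMBER
           *> Set the count first: it decides how many entries exist.
           MOVE 2 TO NUMBER-OF-TREATMENTS
           MOVE "05011999" TO TREATMENT-DATE(1)
           MOVE "DR EXAMPLE ONE" TO TREATING-PHYSICIAN(1)
           MOVE 17 TO TREATMENT-CODE(1)
           MOVE "21032001" TO TREATMENT-DATE(2)
           MOVE "DR EXAMPLE TWO" TO TREATING-PHYSICIAN(2)
           MOVE 42 TO TREATMENT-CODE(2)

           *> Step the index through the occurrences in use.
           PERFORM VARYING TREATMENT-POINTER FROM 1 BY 1
                   UNTIL TREATMENT-POINTER > NUMBER-OF-TREATMENTS
               DISPLAY TREATMENT-YEAR(TREATMENT-POINTER) " "
                       TREATING-PHYSICIAN(TREATMENT-POINTER)
           END-PERFORM

           *> Fixed part (41 bytes) plus 40 bytes per treatment.
           DISPLAY "Bytes in use: "
                   FUNCTION LENGTH(PATIENT-TREATMENTS)
           STOP RUN.
```

With two treatments the record should occupy 41 + 2 * 40 = 121 bytes, and a compiler that evaluates the DEPENDING ON field when computing the length should report that figure.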


Monday, 23 May 2011
