Thursday, December 22, 2011

COBOL: Internal and external sort

What is External SORT ?
   External sort refers to the direct and explicit usage of the SORT utility available at your site. That is, you explicitly code EXEC PGM=SORT in your JCL, and the SORT utility at your site (usually DFSORT or SYNCSORT) gets invoked.

What is Internal SORT ?
   Internal sort refers to indirect usage of the SORT utility available at your site, from inside a program written in a language such as COBOL, PL/I, or even Assembler. COBOL does this through its SORT verb; PL/I and Assembler programs reach the sort product through their own call interfaces. COBOL as such does not have a built-in sort product or algorithm. The COBOL SORT verb results in a call to an external sort product (like DFSORT), which does the processing for COBOL.

   The COBOL SORT verb should look familiar. Its general form is:

   SORT filename ON ASCENDING KEY dataname
        USING filename1    (or INPUT PROCEDURE IS paragraph1)
        GIVING filename2   (or OUTPUT PROCEDURE IS paragraph2)

   (Here filename is the name given in the SD entry, filename1 is the input file, and filename2 is the output file. USING and INPUT PROCEDURE are alternatives, so code only one of them; likewise, code either GIVING or OUTPUT PROCEDURE, not both.)
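
To make the SORT verb concrete, here is a minimal sketch of a complete program that sorts one file into another with USING and GIVING. The program name, the file names, the DD names (INFILE, OUTFILE, SORTWK) and the 80-byte record layout with a 4-byte key are all assumptions made up for illustration; adjust them to your own files.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SORTDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    INFILE, OUTFILE and SORTWK are assumed DD names.
           SELECT IN-FILE   ASSIGN TO INFILE.
           SELECT OUT-FILE  ASSIGN TO OUTFILE.
           SELECT SORT-FILE ASSIGN TO SORTWK.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC              PIC X(80).
       FD  OUT-FILE.
       01  OUT-REC             PIC X(80).
      *    The SD entry describes the sort file named on SORT.
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-DEPT-NO    PIC X(4).
           05  FILLER          PIC X(76).
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    With USING and GIVING the program does not OPEN, READ,
      *    WRITE or CLOSE the files itself; SORT handles them.
           SORT SORT-FILE
               ON ASCENDING KEY SORT-DEPT-NO
               USING IN-FILE
               GIVING OUT-FILE
      *    SORT-RETURN is the special register holding the
      *    completion code of the sort.
           IF SORT-RETURN NOT = 0
               DISPLAY 'SORT FAILED, SORT-RETURN = ' SORT-RETURN
           END-IF
           GOBACK.
```

Because neither an INPUT PROCEDURE nor an OUTPUT PROCEDURE is coded, this is the form of internal sort that leaves all the file handling to the sort product.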

The way in which COBOL interfaces with DFSORT depends on the use of COBOL features such as FASTSRT, NOFASTSRT, USING, GIVING and INPUT and OUTPUT PROCEDUREs, and DFSORT features such as COBOL exits and DFSORT control statements. 

FASTSRT and NOFASTSRT are compiler options you can specify for the COBOL program. As a rule, specify FASTSRT. The compiler determines automatically whether FASTSRT can actually be applied to the input side, the output side, or both. For example, FASTSRT cannot be used for input processing when an INPUT PROCEDURE is specified.

When FASTSRT is in effect for a COBOL sort, DFSORT (rather than COBOL) reads the input data set and writes the output data set. This results in reductions in elapsed time, CPU time and EXCPs. 

If the COBOL FASTSRT option is in effect for both input and output, then DFSORT called from COBOL can run as fast as DFSORT invoked from JCL. FASTSRT is more efficient because it lets DFSORT read and write the records directly, rather than having each record passed to and from DFSORT one at a time through COBOL-generated E15/E35 routines.
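
The following sketch illustrates the partial case described above: FASTSRT is requested, but the sort has an INPUT PROCEDURE, so only the GIVING (output) side is a candidate for DFSORT doing the I/O. The CBL statement is one way to request the option in Enterprise COBOL source (it can also be passed as a compiler parameter). The program name, file names, DD names and record layout are hypothetical, and FASTSRT has further eligibility rules for the files involved that are not shown here.

```cobol
       CBL FASTSRT
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FSRTDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    INFILE, OUTFILE and SORTWK are assumed DD names.
           SELECT IN-FILE   ASSIGN TO INFILE.
           SELECT OUT-FILE  ASSIGN TO OUTFILE.
           SELECT SORT-FILE ASSIGN TO SORTWK.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC.
           05  IN-DEPT-NO      PIC X(4).
           05  FILLER          PIC X(76).
       FD  OUT-FILE.
       01  OUT-REC             PIC X(80).
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-DEPT-NO    PIC X(4).
           05  FILLER          PIC X(76).
       WORKING-STORAGE SECTION.
       01  WS-EOF-FLAG         PIC X VALUE 'N'.
           88  END-OF-INPUT    VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      *    Input side: INPUT PROCEDURE, so COBOL passes records
      *    to the sort one at a time (no FASTSRT for input).
      *    Output side: GIVING, so the sort product can write
      *    OUT-FILE directly if the file qualifies.
           SORT SORT-FILE
               ON ASCENDING KEY SORT-DEPT-NO
               INPUT PROCEDURE IS FILTER-INPUT
               GIVING OUT-FILE
           GOBACK.
       FILTER-INPUT.
      *    Release only records that carry a department number.
           OPEN INPUT IN-FILE
           PERFORM UNTIL END-OF-INPUT
               READ IN-FILE
                   AT END
                       SET END-OF-INPUT TO TRUE
                   NOT AT END
                       IF IN-DEPT-NO NOT = SPACES
                           RELEASE SORT-REC FROM IN-REC
                       END-IF
               END-READ
           END-PERFORM
           CLOSE IN-FILE.
```

If the filtering could instead be expressed to the sort product itself, replacing the INPUT PROCEDURE with USING would make the input side a FASTSRT candidate as well.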

Disadvantages of Internal SORTS:
1. COBOL code is a little easier to maintain and test without internal sorts.
2. The job is a little less "restartable" with an internal sort. For example, if a really complex program updates a few files and also does a SORT, and the SORT fails, the updates might have to be rolled back before the job can be restarted.
3. If a change in sort sequence is required, the COBOL program has to be recompiled.
4. Resources are held while the sort phase occurs (tape mounts, databases, VSAM). For example a program might open about 50 files for processing, and perform a sort on one of them. In this case, while the SORT is going on, all the 50 files are 'locked' and other jobs won't be able to use them.
5. Unless a compile-time option hands off the I/O to the sort product (like FASTSRT in COBOL), the I/O is done with much slower data access methods.
6. It is much more difficult to exploit the utility features of sort products (like INREC, SUM, OUTREC, etc.) from within a program.
7. The fact that an internal sort is being done is not obvious when looking at the JCL alone.

One good reason you might want to use internal sorts:
1. You only have to "pass" the file once.
   Sometimes the requirements dictate that internal sorts are the common-sense choice. For example, suppose the input records have the department number on them, but the requirement is to present a report sequenced by alphabetical department name, which is stored on a DB2 table. The simplest solution is to read the records and access DB2 to add the department name (in an input procedure), then sort inside the COBOL program, writing the report in the output procedure. (Of course you could split this into three "passes" - first a program that extracts the records required, second an external SORT to sort the extracted data in the required order, and finally a report program that reads the sorted data and generates the report - but having a single program that does all this sounds better, right?)
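
The single-program approach described above might be sketched roughly as follows. This is only an outline under stated assumptions: the DB2 access is replaced by a hypothetical LOOKUP-DEPT-NAME paragraph with made-up department numbers and names (in a real program an embedded SQL SELECT would go there), and the file names, DD names and record layouts are invented for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. DEPTRPT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    INFILE, RPTFILE and SORTWK are assumed DD names.
           SELECT IN-FILE   ASSIGN TO INFILE.
           SELECT RPT-FILE  ASSIGN TO RPTFILE.
           SELECT SORT-FILE ASSIGN TO SORTWK.
       DATA DIVISION.
       FILE SECTION.
       FD  IN-FILE.
       01  IN-REC.
           05  IN-DEPT-NO      PIC X(4).
           05  IN-DETAIL       PIC X(76).
       FD  RPT-FILE.
       01  RPT-LINE            PIC X(100).
      *    The sort record carries the department name up front
      *    so that it can be used as the sort key.
       SD  SORT-FILE.
       01  SORT-REC.
           05  SORT-DEPT-NAME  PIC X(20).
           05  SORT-DEPT-NO    PIC X(4).
           05  SORT-DETAIL     PIC X(76).
       WORKING-STORAGE SECTION.
       01  WS-INPUT-FLAG       PIC X VALUE 'N'.
           88  END-OF-INPUT    VALUE 'Y'.
       01  WS-SORT-FLAG        PIC X VALUE 'N'.
           88  END-OF-SORT     VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           SORT SORT-FILE
               ON ASCENDING KEY SORT-DEPT-NAME
               INPUT PROCEDURE IS ADD-DEPT-NAMES
               OUTPUT PROCEDURE IS WRITE-REPORT
           GOBACK.
       ADD-DEPT-NAMES.
      *    Read each record, add the department name, and hand
      *    the enriched record to the sort with RELEASE.
           OPEN INPUT IN-FILE
           PERFORM UNTIL END-OF-INPUT
               READ IN-FILE
                   AT END
                       SET END-OF-INPUT TO TRUE
                   NOT AT END
                       PERFORM LOOKUP-DEPT-NAME
                       MOVE IN-DEPT-NO TO SORT-DEPT-NO
                       MOVE IN-DETAIL  TO SORT-DETAIL
                       RELEASE SORT-REC
               END-READ
           END-PERFORM
           CLOSE IN-FILE.
       LOOKUP-DEPT-NAME.
      *    Hypothetical stand-in for the DB2 lookup; the values
      *    below are made up.
           EVALUATE IN-DEPT-NO
               WHEN 'D001' MOVE 'SALES'   TO SORT-DEPT-NAME
               WHEN 'D002' MOVE 'FINANCE' TO SORT-DEPT-NAME
               WHEN OTHER  MOVE 'UNKNOWN' TO SORT-DEPT-NAME
           END-EVALUATE.
       WRITE-REPORT.
      *    Get the records back in department-name order with
      *    RETURN and write one report line per record.
           OPEN OUTPUT RPT-FILE
           PERFORM UNTIL END-OF-SORT
               RETURN SORT-FILE
                   AT END
                       SET END-OF-SORT TO TRUE
                   NOT AT END
                       WRITE RPT-LINE FROM SORT-REC
               END-RETURN
           END-PERFORM
           CLOSE RPT-FILE.
```

Note that with both an INPUT PROCEDURE and an OUTPUT PROCEDURE, every record flows through COBOL on the way in and on the way out, so this is the trade-off for passing the file only once.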
