Monday, September 26, 2011

JCL : Merging files

Question
"I have 2 files which have 1:1 records....so I don't need any match between them....

What I want to do is....extract a few columns from file A and a few from File B, and put them into file3. That's it.


Option1
a. Assume the record length is 80 for both files. Add seq no to both of them - say first 5 bytes is seq no.
b. SORT on SEQ NO, with these two files concatenated. Option EQUALS preferred. Now we have the records as follows 

       00001AAAAAAAAAAAAAAAAAAAAA1
       00001BBBBBBBBBBBBBBBBBBBBB1
       00002AAAAAAAAAAAAAAAAAAAAA2
       00002BBBBBBBBBBBBBBBBBBBBB2, etc
    (Remember LRECL will be 85 now - SEQ no. + old 80 bytes)

c. Now SORT (COPY will do) this output into another dataset, by giving LRECL=170 instead of 85, in the input file's definition (a simple attempt to trick MVS). Now you should get the file as follows:

       00001AAAAAAAAAAAAAAAAAAAAA100001BBBBBBBBBBBBBBBBBBBBB1
       00002AAAAAAAAAAAAAAAAAAAAA200002BBBBBBBBBBBBBBBBBBBBB2, etc

d. Now you should be able to extact only the columns required, and of course remove the sequence number

Note: One more thing - for this option to work, BLKSIZE of the input file should be an EVEN multiple of the actual LRECL. Otherwise it may abend. So while creating the dataset, you can give a blocksize that is an even multiple of LRECL.

Option2:
I have two files with a key in bytes 1-3 and data in bytes 5-9.

File A has the following records: 

000 $$$$$
001 AAAAA
002 CCCCC
003 EEEEE
004 GGGGG

and File B has the following records: 

001 BBBBB
003 DDDDD
004 FFFFF
005 HHHHH

I want to join the data fields for pairs of records with the same key to get the following output: 

001 AAAAA BBBBB
003 EEEEE DDDDD
004 GGGGG FFFFF

Here's an ICETOOL job that can do it. The trick is to reformat the fields of the IN1 and IN2 files so you can do a BI SUM of the IN2 "join" fields with binary zeros in the corresponding IN1 fields. Since the reformatted IN1 file is first in the concatenation, all of its fields will be kept except those that you SUM (the "join" fields).

Note: For this example, the key and data fields are in the same locations in both files. But this same technique can be used if the key and/or data fields are in different locations in the files by reformatting the two files appropriately. 

//DFSORT  EXEC PGM=ICETOOL
//TOOLMSG  DD SYSOUT=*
//DFSMSG   DD SYSOUT=*
//IN1  DD DSN=... file 1
//IN2  DD DSN=... file 2
//TMP1 DD DSN=&&TEMP1,DISP=(,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA
//TMP2  DD DSN=&&TEMP2,DISP=(,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA
//CONCAT DD DSN=*.TMP1,VOL=REF=*.TMP1,DISP=(OLD,PASS)
//       DD DSN=*.TMP2,VOL=REF=*.TMP2,DISP=(OLD,PASS)
//TMP3  DD DSN=&&TEMP3,DISP=(,PASS),SPACE=(TRK,(5,5)),UNIT=SYSDA
//OUT   DD SYSOUT=*
//TOOLIN   DD *
* FOR THIS EXAMPLE, FIELDS (P,L) ARE AS FOLLOWS:
*   IN1:  SORT KEY - 1,3
*         FIELD1   - 5,5
*   IN2:  SORT KEY - 1,3
*         FIELD2   - 5,5
*
* WARNING: BI SUM FIELDS CAN ONLY BE 2, 4 OR 8 BYTES.
* IN THIS EXAMPLE, THE 5-BYTE FIELD2 IS PADDED TO 8 BYTES
* SO IT CAN BE SUMMED.
*
* REFORMAT THE IN1 DATA SET SO IT CAN BE JOINED
  COPY FROM(IN1) TO(TMP1) USING(CPY1)

* REFORMAT THE IN2 DATA SET SO IT CAN BE JOINED
  COPY FROM(IN2) TO(TMP2) USING(CPY2)

* SELECT ONLY RECORDS WITH MATCHING IN1/IN2 KEYS
  SELECT FROM(CONCAT) TO(TMP3) -
    ON(1,3,CH) ALLDUPS

* JOIN THE RECORDS
  SORT FROM(TMP3) USING(SRT1)
/*
//CPY1CNTL DD *
*  USE OUTREC TO CREATE: KEY FIELD1 ZEROS
 OUTREC FIELDS=(1:1,3,4:5,5,9:8Z)
//CPY2CNTL DD *
*  USE OUTREC TO CREATE: KEY FILLER FIELD2
 OUTREC FIELDS=(1:1,3,9:5,5,3Z)
/*
//SRT1CNTL DD *
*  SORT ON THE KEY AND SUM WITH BI TO JOIN FIELD2 IN EACH IN2
*  DUPLICATE RECORD WITH ZEROS IN EACH IN1 DUPLICATE RECORD.
*  ALL OF THE NON-SUMMED FIELDS FROM EACH IN1 DUPLICATE RECORD
*  WILL BE KEPT.
*  USE OUTFIL OUTREC TO REARRANGE THE FIELDS FOR FINAL OUTPUT.
 SORT FIELDS=(1,3,CH,A)
 OPTION EQUALS
 SUM FIELDS=(9,8),FORMAT=BI
 OUTFIL FNAMES=OUT,OUTREC=(1,3,X,4,5,X,9,5)
/*

No comments:

Post a Comment

Note: Only a member of this blog may post a comment.