Sunday, December 23, 2012



What is assembly language?
You must have heard that microprocessors understand only ones and zeros, i.e. the machine needs to be fed data as a series of ones and zeros, or ON/OFF type signals.

Let us imagine a very simple microprocessor which only understands two instructions
1.      Add two numbers and display the result on output.
2.      Subtract two numbers and display the result on output.

Assume that the command, also called the operation code (op-code), for the first instruction (addition) is the byte ‘1000 0000’ and for the second instruction (subtraction) is the byte ‘1001 0000’.

So if we give the command to our processor as

‘1000 0000’ ‘0000 0001’ ‘0000 0001’. This is a complete machine instruction: an op-code followed by its operands.
Now the machine interprets the series of bits as follows:
1st Byte 1000 0000    ->    Add the following two data bytes
2nd Byte 0000 0001    ->    1 decimal
3rd Byte 0000 0001    ->    1 decimal

So the machine takes the last two bytes, adds them, and stores the result somewhere. This is raw machine language. It is very difficult to remember each command in the above form, so something called a mnemonic was invented: a word very similar to an English word now represents each op-code.

Our statement above can be written like

ADD 1,1, which is much simpler to understand. ADD is the mnemonic for the op-code ‘1000 0000’.


OP-CODE and MNEMONIC


‘1000 0000’    ‘0000 0001’ ‘0000 0001’

<- OP-CODE ->  <------- OPERANDS ------->

Corresponding assembler mnemonic is

ADD               1               ,     1
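The mnemonic-to-op-code translation can be sketched as a tiny table-driven converter. This is a toy illustration of the idea, not a real assembler; only the op-code values 0x80 (ADD) and 0x90 (SUB) come from the text above, the function name is made up.

```python
# Toy "assembler" for the two-instruction machine described above.
# ADD = '1000 0000' (0x80) and SUB = '1001 0000' (0x90) as defined
# in the text; everything else here is illustrative.
OPCODES = {"ADD": 0b1000_0000, "SUB": 0b1001_0000}

def assemble(mnemonic, a, b):
    """Translate e.g. ADD 1,1 into its three machine-code bytes."""
    return bytes([OPCODES[mnemonic], a, b])

print(assemble("ADD", 1, 1).hex())  # 800101
```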

 So languages are classified as follows:



Machine languages:
Raw op-codes fed directly to the processor.

e.g. op-code = 80,01,01 (hex)

means ADD ‘1’ to ‘1’

Low level languages:
Assembler mnemonics have a one-to-one correspondence with op-codes.
                                Eg:    ADD ‘1’,‘1’  is  80,01,01

High Level Languages:     
Here one statement of the language generates many low-level op-codes or statements. Examples: C, Java, etc.

NUMBER FORMATS

There are several number formats that we need to be familiar with while using assembly language:

 


1)         Binary, base is 2 (valid digits are 0,1)

            eg  1000 0010 stands for

1*2^7 + 0*2^6 +  0*2^5 +  0*2^4 +  0*2^3 +  0*2^2 +  1*2^1 +  0*2^0  
which is equal to 128 + 2 = 130 decimal.

2)             Octal , base is 8 ( valid digits are 0,1,2,3,4,5,6,7)

Eg: 014 stands for 1*8^1 + 4*8^0
Which is equal to 8 + 4 = 12 decimal.
Octal numbers are always written with a 0 as the leading digit.

3)         Decimal, base is 10 ( valid digits 0,1, …. 9)
this is our standard number system which we use daily.

4)         Hexadecimal, base is 16 ( valid digits are 0,1,2,3,4,5,6,7,8,9,A,B,C,D,E,F)

            Here    A => 10
                        B => 11
                        C => 12
                        D => 13
                        E => 14
                        F => 15

eg: 0x1A stands for    1*16^1 + A*16^0 = 1*16^1 + 10*16^0 = 16 + 10 = 26 decimal.


EG:
DECIMAL     BINARY        OCTAL     HEXADECIMAL
135         1000 0111     0207      0x87
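Python's built-in base conversions can be used to check the row above; a quick sanity check for 135:

```python
# Verify the 135-decimal row of the table using Python's base conversions.
n = 135
print(bin(n))  # 0b10000111  (1000 0111)
print(oct(n))  # 0o207       (octal 0207)
print(hex(n))  # 0x87

# Converting back from each representation returns the same value.
print(int("10000111", 2), int("207", 8), int("87", 16))  # 135 135 135
```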

Conversion of the various formats:

We will be basically interested in the following conversion

Decimal <-> Binary <-> Hex

NUMBER FORMAT CONVERSIONS:


Decimal to Binary Conversion

Decimal = 50
Conversion: keep dividing by 2 and record each remainder; the remainders, read from last to first, give the binary digits.

Division          Remainder (bit)
---------------------------------
2 | 50            0    Least Significant Bit
2 | 25            1
2 | 12            0
2 | 6             0
2 | 3             1
    1             1    Most Significant Bit

Corresponding binary is 0011 0010
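The repeated-division procedure can be sketched in a few lines; the helper name is illustrative.

```python
def to_binary(n):
    """Decimal -> binary by repeated division: each remainder is one
    bit, produced least-significant first, so we prepend as we go."""
    bits = ""
    while n > 0:
        bits = str(n % 2) + bits  # prepend the remainder
        n //= 2
    return bits or "0"

print(to_binary(50))  # 110010, i.e. 0011 0010
```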
  

Binary to Hex Conversion

Binary = 1000 1011
Conversion:
Lower four bits = 1011 = decimal 11 = B
Higher four bits = 1000 = decimal 8

Corresponding HEX is 0x8B

Hex to Decimal conversion
Hex = 0x82
Conversion:
            0x82 = 8*16^1 + 2*16^0 = 128 + 2 = 130

Decimal is 130

Decimal to Hex         
Decimal = 2058

Conversion:
Division          Remainder
---------------------------
16 | 2058         10 = A    (least significant digit)
16 | 128          0
     8            8         (most significant digit)

Hex = 0x80A
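The same repeated-division idea works for hex with a divisor of 16; again the helper name is illustrative.

```python
def to_hex(n):
    """Decimal -> hex by repeated division by 16; the remainders are
    the hex digits, least significant first."""
    digits = "0123456789ABCDEF"
    out = ""
    while n > 0:
        out = digits[n % 16] + out  # prepend the remainder digit
        n //= 16
    return "0x" + (out or "0")

print(to_hex(2058))  # 0x80A
```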

Forms of data storage in the Mainframe: Character data is stored in EBCDIC (Extended Binary Coded Decimal Interchange Code) format, similar in purpose to ASCII; numeric data can be stored as pure binary, zoned decimal, or packed decimal.

We will take examples to illustrate the data formats:

Character data requires one byte per character:

Character       Stored As

‘A’             0xC1

‘a’             0x81
‘1’             0xF1

String data:
STRING = “snytle” is stored as 0xA2 0x95 0xA8 0xA3 0x93 0x85 in six bytes.
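Python ships a built-in codec, cp037, that implements a common US EBCDIC code page, so the byte values above can be checked directly; note that other EBCDIC variants may assign some characters differently.

```python
# Check the EBCDIC byte values using Python's built-in cp037 codec
# (a common US EBCDIC code page; other EBCDIC variants may differ).
for ch in ["A", "a", "1"]:
    print(ch, hex(ch.encode("cp037")[0]))

print("snytle".encode("cp037").hex())  # a295a8a39385
```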

Numeric Data in Binary format:

123 in binary is stored in one byte as 0111 1011

Numeric Data stored in Zoned decimal format:

123 stored in zoned decimal requires three bytes
it is stored as

0xF1 0xF2 0xC3

Similarly –123 is stored as 0xF1 0xF2 0xD3

Note that the high-order four bits (the zone nibble) of the last byte indicate the sign of the data: 0xC for positive, 0xD for negative.

Numeric Data stored in Packed Decimal format:

The same data 123 stored in packed decimal format will result in more efficient storage.

123 is stored as 0x12 0x3C thus we require one byte less here.
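The zoned and packed encodings above can be sketched as follows. The helper names are illustrative; the zone nibble 0xF and the sign nibbles 0xC (positive) and 0xD (negative) are the values used in the examples.

```python
def zoned(n):
    """Zoned decimal: one byte per digit with zone nibble 0xF; the
    zone of the last byte carries the sign (0xC = +, 0xD = -)."""
    digits = [int(d) for d in str(abs(n))]
    sign = 0xC if n >= 0 else 0xD
    return bytes([0xF0 | d for d in digits[:-1]] + [(sign << 4) | digits[-1]])

def packed(n):
    """Packed decimal: two digits per byte, sign nibble last."""
    digits = str(abs(n))
    if len(digits) % 2 == 0:      # pad so digits + sign fill whole bytes
        digits = "0" + digits
    sign = 0xC if n >= 0 else 0xD
    nibbles = [int(d) for d in digits] + [sign]
    return bytes((nibbles[i] << 4) | nibbles[i + 1]
                 for i in range(0, len(nibbles), 2))

print(zoned(123).hex(), zoned(-123).hex())  # f1f2c3 f1f2d3
print(packed(123).hex())                    # 123c
```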


Concept of Memory Addressing:
We should know the basics of memory storage to understand addressing. Memory is referred to by the size of data it can store: BYTE, HALF WORD, WORD, DOUBLE WORD.

Each byte in memory can be individually pointed to by its address.

The address of each Half Word is the address of its Most Significant Byte (leftmost byte).

The address of a Word is the address of its Most Significant Byte (leftmost byte).

The address of a Double Word is the address of its Most Significant Byte (leftmost byte).

In most OS/370 and OS/390 systems the maximum memory that can be addressed is 2^24 bytes, which is 16 MB (byte addresses 0 through 2^24 - 1).

BITS AND BYTES


BYTE:                                   8 bits   (1 byte)

 0 0 0 0 0 0 0 0


HALF WORD:                       16 bits (2 bytes)

0 0 0 0 0 0 0 0   <- Most Significant Byte
0 0 0 0 0 0 0 0


WORD:                                   32 bits (4 bytes)

0 0 0 0 0 0 0 0   <- Most Significant Byte
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0


DOUBLE WORD:     64 bits (8 bytes)

0 0 0 0 0 0 0 0   <- Most Significant Byte
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0



 The Memory map for the mainframe system looks like this:
Note that each HALF WORD has an address which is exactly divisible by 2.
Also note that each WORD has an address which is exactly divisible by 4.
The total memory shown is 16 MB = 2^24 bytes; each byte has a 24-bit address of its own.
The first address is 0x000000 and the last byte address is 0xFFFFFF.

               
[Image: memory map]

Registers used in the Mainframe:
There are 16 general purpose 32-bit storage locations in the mainframe; these locations are called registers. Each has a four-bit binary address: R0 = 0000, R1 = 0001, R2 = 0010, ..., R15 = 1111.
The registers are referred to by their binary address: 0001 represents R1, and so on.



REGISTERS


·  There are a total of 16 registers in the mainframe.

·  Each register size is 32 bits, i.e.  4 bytes.

·  Registers are referred to as R0, R1, R2 . . . . . R15.

·  The registers R0, R1, R12, R13, R14 and R15 are used for a special purpose.  All the other registers can be used by the programmer for general programming.

·  The addresses of the registers are 0, 1, 2, ..., 15.

·  There is one more memory location which has a special use, called the “program status word” (PSW). It stores the memory address of the next instruction of the executing program and is used to pinpoint where execution currently is. The PSW also stores other data related to the current state of the CPU, such as the condition code. The size of the PSW is 16 bytes; the lower 8 bytes store the instruction address.

Base Displacement form of Addressing:
The memory map that you have seen above is accessed in the program in a unique way called the base-displacement form.

The total number of bits required to address each memory byte is 24, because there are 2^24 bytes. But the OS/370 and OS/390 systems allocate only 16 bits in an instruction to act as a memory pointer into the whole 2^24-byte memory.

The 16 bits are divided into two parts: the base of 4 bits and the displacement of 12 bits.
So the memory is mapped into segments of 2^12 (4096) bytes each, and each segment's absolute address is kept in the register named by the 4 base bits. Thus a total memory of 4096 * 16 = 65536 bytes can be mapped (addressed) at any one time if we use all the registers we have. This is still less than the total of 2^24 = 16,777,216 bytes, but we seldom need that much memory at once, and the base registers can always be reloaded to reach the rest.

The advantage of the base displacement form of addressing is that programs can be made relocatable.



Base Displacement Addressing Scheme:




Base: 4 bits | Displacement: 12 bits

The base names a register R0 - R15. The 12-bit displacement can address a maximum area of 4096 bytes (2^12).


Memory Address = contents of the register named by the four base bits + displacement.

Example:

0 0 1 0       0 0 0 0 0 0 0 0 1 1 0 0
Base          Displacement

Memory Address = contents of R2   +   12
                             = 2005 + 12       decimal (assuming R2 contains 2005)
                             = 2017             decimal.
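The address calculation can be reproduced directly; the register contents are the assumption made in the example.

```python
def effective_address(registers, base, displacement):
    """The 4-bit base field names a register; its contents plus the
    12-bit displacement give the effective (24-bit) address."""
    return registers[base] + displacement

regs = [0] * 16
regs[2] = 2005  # assume R2 contains 2005, as in the example
print(effective_address(regs, 0b0010, 0b000000001100))  # 2017
```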

Assembly statement format:
The assembler language is column-sensitive, so we have to be careful about the placement of commands; the assembler interprets a symbol according to the column number in which it is found.

ASSEMBLER STATEMENT FORMAT 1.


[Image: assembler statement format 1]

ASSEMBLER STATEMENT FORMAT 2.

[Image: assembler statement format 2]



INSTRUCTION TYPES:

The MVS Assembler language statements are divided into the following types depending upon the type of operands, memory addressing, number of bytes etc.

[Image: RR type instruction format]
·  LENGTH OF THE INSTRUCTION IS 16 BITS, THERE ARE TWO OPERANDS
·  BOTH OPERANDS ARE REGISTERS.

Example:
 
BALR R14,R15    (format: OPCODE R1,R2)

            The corresponding instruction in hex is

            0x05 0xEF    (op-code 0x05, R1 = 0xE, R2 = 0xF)
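Packing the two 4-bit register numbers into the second byte of an RR instruction can be sketched as follows; the helper name is illustrative.

```python
def encode_rr(opcode, r1, r2):
    """RR format: an op-code byte followed by one byte holding the
    two 4-bit register numbers (R1 in the high nibble)."""
    return bytes([opcode, (r1 << 4) | r2])

print(encode_rr(0x05, 14, 15).hex())  # 05ef -> BALR R14,R15
```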




[Image: RS type instruction format]


[Image: RX type instruction format]












[Image: SI type instruction format]
[Image: SS1 type instruction format]









[Image: SS2 type instruction format]
Implicit addressing: The examples above all use explicit addressing, i.e. we specify the actual address in base-displacement form ( D1(B1) ). But we can use a symbol to point to this address instead of specifying the register and displacement; the symbol can be thought of as a label for that memory location.
So we can actually use meaningful variable names to represent memory locations. The following example illustrates implicit addressing:


Let us take the example of the LOAD instruction that we had seen before.

 

L         R1,D1(X2,B2)


Replacing D1(X2,B2) by a variable MYMEMORY we will have,


L         R1, MYMEMORY


MYMEMORY is a variable name given to the address = contents of B2 + displacement D1

(the index X2 is taken as zero), so the effective statement is the same, i.e. register R1 is loaded with the WORD (4 bytes) located at the address MYMEMORY.






PREPARING AN ASSEMBLY PROGRAM FOR RUN:

The source code written in mnemonic form has to be assembled using an assembler program; this is similar to the compiler used for higher-level languages.


The source program is assembled by the assembler into object code in two passes: the first pass assigns values to symbols and addresses to literals; the second creates the object module and listing file.
The listing file displays the object code produced in the object file and all source module statements, a cross-reference list of all literals and symbols used in the program, and a diagnostic section showing errors that occurred during assembly.
The final load module, or executable code, is produced by the link editor, which resolves the addresses of all called subroutines and modules.





Source program
      |
   Assemble
      |
      +------------------+
      |                  |
Object Module       Listing file
      |
   LinkEdit
      |
  Load Module























Tuning DB2 SQLs

Predicate

Predicates are found in the WHERE, ON and HAVING clauses of an SQL statement. ON predicates are applied first, then WHERE predicates; HAVING predicates are applied after data access. Predicates in the HAVING clause are not used when accessing data. The predicate type has a great impact on the choice of DB2 access path. The general predicate types are subquery, equal, range, IN-list and NOT.

The following two examples show how the predicate type can influence DB2's choice of an access path. In each one, assume that a unique index I1(C1) exists on table T1(C1,C2) and that all values of C1 are positive integers.

The query,
SELECT C1, C2 FROM T1 WHERE C1 >= 0;
has a range predicate. However, the predicate does not eliminate any rows of T1. Therefore, it could be determined during bind that a table space scan is more efficient than an index scan.

The query,
SELECT * FROM T1 WHERE C1 = 0;
has an equal predicate. DB2 chooses index access in this case, because only one index probe is needed to return the result.

Predicates are also categorized into indexable and non-indexable predicates. They are classified into stage 1 and stage 2 predicates depending on when they are used during query evaluation. For an outer join, predicates in the ON clause are treated as stage 2 predicates, and most of the other predicates are applied after the join as stage 2 predicates. Predicates on a table expression can be evaluated before the join as stage 1 predicates.

If the employee table has an index on the column LASTNAME, the following predicate can be a matching predicate:
SELECT * FROM EMP WHERE LASTNAME = 'SMITH';
The following predicate cannot be a matching predicate, because it is not indexable.
SELECT * FROM EMP WHERE SEX <> 'F';
To make your queries as efficient as possible, use indexable predicates in your queries and create suitable indexes on your tables. Indexable predicates allow the possible use of a matching index scan, which is often a very efficient access path.

For example, in the following statement
SELECT * FROM (SELECT * FROM EMP WHERE EDLEVEL > 100) AS X FULL JOIN DEPT ON X.WORKDEPT = DEPT.DEPTNO;
The predicate "EDLEVEL > 100" is evaluated before the full join and is a stage 1 predicate.

Order of predicate evaluation

Predicates are evaluated in the following sequence.
  • Indexable matching predicates are applied first
  • Indexable non-matching predicates are next
  • All indexable predicates are stage 1 predicates
  • After data page access, the other stage 1 predicates are applied
  • Finally, stage 2 predicates are applied to the returned data rows
Consider a table T(C1,C2,C3,C4,C5,C6) and an index on it as X(C1,C2,C3)
WHERE C1=5 AND C2=7 AND C4=5
C1=5 and C2=7 are indexable, matching predicates, hence stage 1; C4=5 is not indexable but is still a stage 1 predicate.
WHERE C1>5 AND C2=7
Both predicates are stage 1, but only the first matches the index. A matching index scan could be used with C1 as a matching column.
WHERE C1=5 OR C2=7
Both predicates are stage 1 but not Boolean terms. The compound is not indexable except by a union of RID lists from two indexes and cannot be considered for matching index access.
WHERE C1=5 OR C2<>7
The first predicate is indexable and stage 1, and the second predicate is stage 1 but not indexable. The compound predicate is stage 1.
WHERE C1>5 OR C2=7
Both predicates are stage 1 but not Boolean terms. The compound is not indexable except by a union of RID lists from two indexes and cannot be considered for matching index access.

Predicate evaluation within each stage:

  • All equal predicates are applied first
  • Range predicates are next
  • All other predicates come last
  • After applying the above rules, predicates are evaluated in the order in which they appear in the query.

Filtering

For stage 1 predicates, DB2 applies the most restrictive predicate first to reduce the processing at the next stage. It is therefore better to code the most restrictive (highest-filtering) predicate first. DB2 evaluates the filter factor based on the catalog information in the SYSCOLUMNS and SYSCOLDIST tables.

The following catalog columns are used for filtering:
COLCARDF, HIGH2KEY and LOW2KEY of SYSIBM.SYSCOLUMNS, and COLVALUE and FREQUENCYF of SYSIBM.SYSCOLDIST.

Filter factor for uniform distribution:
Filter factor = 1/COLCARDF for predicates like COL = literal, COL IS NOT NULL
              = (no. of entries in IN list)/COLCARDF for COL IN (list)
              = (literal - LOW2KEY)/(HIGH2KEY - LOW2KEY) for COL < or <= literal
              = (HIGH2KEY - literal)/(HIGH2KEY - LOW2KEY) for COL > or >= literal
              = (high literal - low literal)/(HIGH2KEY - LOW2KEY) for COL LIKE or BETWEEN

For non-uniform distribution, COLVALUE and FREQUENCYF are used

If the distribution of column values is not available in the catalog table SYSIBM.SYSCOLDIST, DB2 assumes a uniform distribution when filtering. Make sure the catalog tables are updated, either manually or by running RUNSTATS. If there is no information available from the catalog tables, DB2 assumes a default filter factor. It also uses default filter factors for predicates in static SQL that use host variables.

For uniform distribution:
SELECT * FROM T WHERE C1=10 AND C2>=7 AND C3 BETWEEN 100 AND 999;
Assume the cardinality of C1 is 100, of C2 is 200 and of C3 is 1000; the minimum column value for C2 is 0 and the maximum is 30; for C3 they are 0 and 5000; and assume a uniform distribution for all three columns. (The table name T is illustrative.)

Filter factor  For C1=10 is 1/100 = 0.01
                  For C2>=7 is (30-7)/30 = 23/30
                  For C3 predicate = (999-100)/(5000-0) = 899/5000
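The three filter factors follow directly from the uniform-distribution formulas; reproducing the arithmetic:

```python
# Filter factors for the example, using the uniform-distribution formulas.
ff_c1 = 1 / 100                    # C1 = 10:     1/COLCARDF
ff_c2 = (30 - 7) / (30 - 0)        # C2 >= 7:     (HIGH2KEY - literal)/(HIGH2KEY - LOW2KEY)
ff_c3 = (999 - 100) / (5000 - 0)   # C3 BETWEEN:  (high lit - low lit)/(HIGH2KEY - LOW2KEY)
print(ff_c1, ff_c2, ff_c3)
```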

For non-uniform distribution:

Suppose that the predicate is C1 IN('3','5') and that SYSCOLDIST contains these values for column C1:

COLVALUE  FREQUENCYF
     '3'                .0153
     '5'                .0859

The filter factor is .0153 + .0859 = .1012
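The IN-list filter factor for a non-uniform distribution is just the sum of the SYSCOLDIST frequencies of the listed values:

```python
# FREQUENCYF values for column C1, as given above.
frequency = {"3": 0.0153, "5": 0.0859}
ff = sum(frequency[v] for v in ["3", "5"])  # filter factor for C1 IN ('3','5')
print(round(ff, 4))  # 0.1012
```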

SYSCOLDIST statistics need to be collected together for correlated columns. If C1 and C2 are correlated, concatenated columns of an index and there is a predicate C1=2 AND C2=5, the frequency needs to be collected for the values 2 and 5 together; combining the single-column statistics would not give the correct filter factor in that case.


Column correlation

Two columns of a table are said to be correlated if they depend on each other. DB2 might not determine the optimum access path, table order, or join method when a query uses highly correlated columns. Column correlation makes queries look cheaper than they actually are. Run RUNSTATS to update the catalog tables with the correct correlation statistics to help DB2 find the actual filter factor.

Let us assume the following example:
SELECT ... FROM CREWINFO WHERE CITY='FRESNO' AND STATE='CA' AND DEPTNO='X344' AND SEX='F';
with index 1 on (CITY, STATE) and index 2 on (DEPTNO, SEX).
Say the cardinality of STATE is 3, of CITY is 4, of DEPTNO is 4 and of SEX is 2.

Filter factor for Predicate 1 = 1/4 * 1/3 = 0.083
                 for Predicate 2 = 1/4 * 1/2 = 0.125

So predicate pair 1 provides the better filter factor and index 1 is selected.
But the columns STATE and CITY are highly correlated: the value of STATE depends on the value of CITY, so STATE does not provide as much extra filtering as it seems to. The actual filter factor would be 1/4 = 0.25, so index 2 would provide the better access path.

DB2 can take care of column correlation provided the catalog table SYSIBM.SYSCOLDIST holds statistics for the correlated columns together. The programmer has to determine whether the columns are correlated and update SYSCOLDIST accordingly.


Using host variable effectively

If static SQL has host variables, DB2 might not select the optimum access path, because it uses default filter factors for predicates that use host variables. There are two ways to change the access path for a query that contains host variables:

  • Use the REOPT(VARS) option to change the access path at run time.
  • Rewrite the SQL in a different way.

The bind option REOPT(VARS) makes DB2 determine the access path both at bind time and again at run time. At run time DB2 can work out the filter factor accurately because the values of the variables are known, so query performance is optimized.

Be careful while using this: every time you run the SQL in the plan (or package), the access path is re-determined, which can add performance overhead.

Rewriting the query: Static SQL with host variables suffers from default filter factors.
SELECT * FROM EMP WHERE SEX=:HV1
Here DB2 uses a filter factor of 1/2. But the actual filter factor may be different if the distribution of M and F is not uniform.

Rewrite the query as follows.

IF HV1 = 'M' THEN
SELECT * FROM EMP WHERE SEX='M';
ELSE
SELECT * FROM EMP WHERE SEX='F';

Here DB2 will calculate the actual filter factor, taking the distribution into account.

Writing subquery

Subqueries can be correlated or non-correlated. For a correlated subquery, the subquery is evaluated for each row returned by the outer query. For a non-correlated subquery, the inner query is evaluated first and then the outer query. DB2 can sometimes transform the subquery into a join; at other times the application programmer has to do it for better performance. Many subqueries can be rewritten as joins.

Db2 makes that transformation only if:

  • The subquery appears in a WHERE clause
  • The subquery does not contain GROUP BY, HAVING or column functions
  • The subquery has only one table in the FROM clause
  • The subquery select list has only one column, guaranteed by a unique index to have unique values.
  • The comparison operator of the predicate containing the subquery is IN, = ANY or = SOME
  • For a non-correlated subquery, the left side of the predicate is a single column with the same data type and length as the subquery's column.

Subquery:
SELECT * FROM EMP WHERE DEPTNO IN(SELECT DEPTNO FROM DEPT WHERE LOCATION IN ('SAN JOSE','SAN FRANCISCO') AND DIVISION = 'MARKETING');

Equivalent join:
SELECT EMP.* FROM EMP,DEPT WHERE EMP.DEPTNO = DEPT.DEPTNO AND DEPT.LOCATION IN ('SAN JOSE','SAN FRANCISCO') AND DEPT.DIVISION = 'MARKETING';
If you use columns from both tables, it is better to use a join.

Guidelines for writing efficient subquery:

  • If there are efficient indexes available on the tables in the subquery, then a correlated subquery is likely to be the most efficient kind of subquery.
  • If there are no efficient indexes available on the tables in the subquery, then a non-correlated subquery would likely perform better.
  • If there are multiple subqueries in any parent query, make sure that the subqueries are ordered in the most efficient manner.

Ordering of subquery is very important:
SELECT * FROM MAIN_TABLE WHERE TYPE IN(subquery 1 as P1) AND PARTS IN(subquery 2 as P2);
DB2 evaluates the subqueries in the order in which they appear in the query.

Let us say there are 1000 rows, P1 rejects 10% of the rows, P2 rejects 80% of the rows, and P1 and P2 take equal time to execute.

So P1 is evaluated 1000 times and P2 900 times: 1900 predicate checks in total. But if P1 and P2 are reversed, P2 is evaluated 1000 times and P1 only 200 times: 1200 predicate checks in total. On the other hand, if P1 were 100 times faster than P2, it would be better to put P1 first and then P2.
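The predicate-check counting can be reproduced with a small sketch; the 10%/80% rejection rates are the ones from the example, and the helper name is illustrative.

```python
def predicate_checks(rows, reject_fractions):
    """Count predicate evaluations when subquery predicates are applied
    in order; each predicate sees only rows the previous one passed."""
    total = 0
    for reject in reject_fractions:
        total += rows                      # every surviving row is checked
        rows = round(rows * (1 - reject))  # rows passed on to the next predicate
    return total

print(predicate_checks(1000, [0.1, 0.8]))  # P1 then P2: 1900
print(predicate_checks(1000, [0.8, 0.1]))  # P2 then P1: 1200
```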

OPTIMIZE FOR n ROWS


DB2 chooses the access path that minimizes the response time for retrieving the first few rows. Using “OPTIMIZE FOR” does not stop the user from accessing the whole result set.


OPTIMIZE FOR is not useful if DB2 has to gather the whole result set before returning the first n rows, as in the following situations:

  • The query uses SELECT DISTINCT or a distinct set function, such as COUNT(DISTINCT C1)
  • GROUP BY or ORDER BY is used, and there is no index that can give the necessary ordering
  • There is a column function and no GROUP BY clause
  • The query uses UNION
Let us take the following example to illustrate how OPTIMIZE FOR helps:
SELECT LASTNAME, FIRSTNAME, EMPNO, SALARY FROM EMPLOYEE ORDER BY SALARY DESC;
If we assume there is a descending index available on SALARY, DB2 may still go for a table space scan if the index is poorly clustered, and then sort the rows on SALARY. If you use OPTIMIZE FOR 20 ROWS, DB2 would most likely use the SALARY index directly, because you have indicated that you will probably retrieve only the salaries of the 20 most highly paid employees. This choice avoids a costly sort operation. If the index on SALARY is ascending, OPTIMIZE FOR does not help.

Influencing access path

DB2 evaluates the access path based on information available in the catalog tables. Wrong or unavailable catalog information may result in selection of the wrong access path: the wrong index may be chosen, or an index may be chosen where a table space scan would be more effective.

If you have more than one index available on a table and DB2 wrongly chooses index IX1 over IX2, it is possible to make DB2 select IX2 over IX1.
SELECT ... FROM CREWINFO WHERE CITY = 'FRESNO' AND STATE = 'CA' AND DEPTNO = 'A345' AND SEX = 'F';
with IX1 (CITY, STATE) and IX2 (DEPTNO, SEX)

DB2 will choose IX1 if it appears to provide the better filter factor, which may not be correct because of wrong catalog information. An index predicate of IX1 can be made stage 2 to discourage DB2 from using IX1:
SELECT ... FROM CREWINFO WHERE CITY BETWEEN 'FRESNO' AND 'FRESNO' (predicate 1) AND STATE = 'CA' AND DEPTNO = 'A345' AND SEX = 'F';

Now predicate 1 is stage 2, so DB2 cannot use it as a matching column of IX1.

Code an extra predicate, or change a predicate, to make DB2 select a different access path.

Following are various ways to turn a STAGE 1 predicate into a STAGE 2 predicate:

STAGE 1                  STAGE 2
T1.C1 = 5                (T1.C1 = 5 OR 0 = 1)
T1.C1 = T2.C2            (T1.C1 = T2.C2 OR 0 = 1)
T1.C1 = 'CHENNAI'        T1.C1 BETWEEN 'CHENNAI' AND 'CHENNAI'
T1.C1 = 'CHENNAI'        'CHENNAI' BETWEEN T1.C1 AND T1.C1

Other factors

Adding an extra predicate may influence the selection of the join method. If you add an extra predicate, a nested loop join may be selected, because DB2 assumes that the filter factor will be high. The proper type of predicate to add is WHERE T1.C1 = T1.C1. Hybrid join is a costlier method, and an outer join does not use hybrid join; so if DB2 is using a hybrid join, convert the inner join to an outer join and add extra predicates to remove the unneeded rows.

For example, suppose you want to obtain the results of the following inner join operation on the PARTS and 
PRODUCTS tables:
  SELECT PART, SUPPLIER, PARTS.PROD#, PRODUCT  FROM PARTS, PRODUCTS  WHERE PARTS.PROD# = PRODUCTS.PROD#;

DB2 creates a more efficient access path if the PARTS table is the outer table in the join operation. To make the PARTS table the outer table, use a left outer join operation. Include a WHERE clause in the query to remove the extraneous rows:
  SELECT PART, SUPPLIER, PARTS.PROD#, PRODUCT  FROM PARTS LEFT OUTER JOIN PRODUCTS  ON PARTS.PROD# = PRODUCTS.PROD#  WHERE PRODUCTS.PROD# IS NOT NULL;

Updating catalog tables

The access path is based on catalog column values. Catalog tables can be updated manually or by running RUNSTATS with appropriate options. For tables that change frequently, the access method may suffer because the statistics no longer reflect the data. Running RUNSTATS is a costly process, so the catalog statistics may have to be updated manually instead. It is necessary to rebind static SQL after a catalog statistics update.

Following are the catalog tables whose columns influence access path selection.

TABLES                              COLUMNS                         
SYSTABLES           CARDF, EDPROC, NPAGES, PCTPAGES, PCTROWCOMP
SYSTABLESPACE       NACTIVE
SYSINDEXES          CLUSTERED, CLUSTERING, CLUSTERRATIO, FIRSTKEYCARD, FULLKEYCARD, NLEAF, NLEVELS
SYSCOLUMNS          COLCARDF, HIGH2KEY, LOW2KEY
SYSCOLDIST          COLVALUE, FREQUENCYF, TYPE, CARDF, COLGROUPCOLNO, NUMCOLUMNS

DB2 EXPLAIN AND TUNING

EXPLAIN is a monitoring tool that produces information about a plan, package, or SQL statement when it is bound.  The output appears in a user-supplied table called PLAN_TABLE.

It helps you to do the following 
  • Design databases, indexes, and application programs
  • Determine when to rebind an application
  • Determine the access path chosen for a query

The user has to create PLAN_TABLE, or use the default one if available in the installation. The plan table can be populated in two ways:

  1. Run the SQL EXPLAIN statement, either statically from an application program or dynamically from SPUFI or QMF, to get access path information into PLAN_TABLE. For example:

     EXPLAIN PLAN SET QUERYNO = 13
         FOR SELECT X.ACTNO, X.PROJNO, X.EMPNO, Y.JOB, Y.EDLEVEL
             FROM DSN8510.EMPPROJACT X, DSN8510.EMP Y
                WHERE X.EMPNO = Y.EMPNO
                   AND X.EMPTIME > 0.5
                   AND (Y.JOB = 'DESIGNER' OR Y.EDLEVEL >= 12)
                ORDER BY X.ACTNO, X.PROJNO;

  2. Populate a plan table when you bind or rebind a plan or package by specifying the option EXPLAIN(YES). EXPLAIN obtains information about the access paths for all explainable SQL statements in a package or the DBRMs of a plan. The information appears in the table package_owner.PLAN_TABLE or plan_owner.PLAN_TABLE. For dynamically prepared SQL, the qualifier of PLAN_TABLE is the current SQLID.

DB2 EXPLAIN OUTPUT

EXPLAIN output is stored in PLAN_TABLE. Each plan is identified by the APPLNAME column. Each package is identified by PROGNAME, COLLID and VERSION. Within each package you might have multiple SQL statements, each identified by QUERYNO. Each query is evaluated in multiple stages, and each stage is identified by QBLOCKNO and PLANNO.

Explain output for a plan :

The rows for a particular plan are identified by the value of APPLNAME.
SELECT * FROM JOE.PLAN_TABLE WHERE APPLNAME = 'APPL1' ORDER BY TIMESTAMP, QUERYNO, QBLOCKNO, PLANNO, MIXOPSEQ;

Explain output for a package : 

The rows for a particular package are identified by the values of PROGNAME, COLLID, and VERSION.
SELECT * FROM JOE.PLAN_TABLE WHERE PROGNAME = 'PACK1' AND COLLID = 'COLL1' AND VERSION = 'PROD1' ORDER BY QUERYNO, QBLOCKNO, PLANNO, MIXOPSEQ;

Explain output for a query : 

If you use a dynamic EXPLAIN statement under SPUFI or QMF, the query number is specified in the EXPLAIN statement.
SELECT * FROM JOE.PLAN_TABLE WHERE QUERYNO = queryno ORDER BY QUERYNO, QBLOCKNO, PLANNO, MIXOPSEQ;

MIXOPSEQ maintains the order of evaluation in a multiple index scan.
QBLOCKNO specifies which query block is being executed when there is more than one (subquery or UNION). PLANNO specifies the order within each QBLOCKNO.

Tablespace Scan (ACCESSTYPE = R)

A table space scan is chosen when a matching index scan is not possible, because an index is not available or there are no predicates to match the index columns; when a high percentage of the rows in the table is returned, in which case an index is not really useful because most rows need to be read anyway; or when the indexes that have matching predicates have low cluster ratios and are therefore efficient only for small amounts of data. Sequential prefetch is used (PREFETCH=S).

Assume that table T has no index on C1. The following is an example that uses a tablespace scan:
   SELECT * FROM T WHERE C1 = VALUE; 
Or there is an index on C1, but the predicate does not provide a good filter factor.

Tablespace Scan on non-segmented tablespace :

DB2 reads and examines every page in the table space, regardless of which table the page belongs to. It might also read pages that have been left as free space and space not yet reclaimed after deleting data.

Tablespace Scan on segmented tablespace :

If the table space is segmented, DB2 first determines which segments need to be read. It then reads only the segments in the table space that contain rows of T. If the prefetch quantity, which is determined by the size of your buffer pool, is greater than the SEGSIZE, and if the segments for T are not contiguous, DB2 might read unnecessary pages. Use a SEGSIZE value that is as large as possible, consistent with the size of the data. A large SEGSIZE value is best to maintain clustering of data rows. 

Tablespace Scan on partitioned tablespace. 

Partitioned table spaces are nonsegmented. A table space scan on a partitioned table space is more efficient than on a nonpartitioned table space.

Using Index

Indexes should be defined based on how the application fetches data. A properly defined index can avoid a sort, but you need to trade the cost of defining the index against the performance gain.

Cost of defining index :

  • Indexes require storage space.
  • Each index requires an index space and a data set, and there are operating system restrictions on the number of open data sets.
  • If you have concurrent users of the same table, locking problems are likely with multiple type 1 indexes. Type 2 indexes can give higher concurrency and better performance by locking the underlying data page or record instead of the index page.
  • Indexes must be changed to reflect every insert or delete operation on the base table, and an update to an indexed column also changes the index. The time required by these operations increases accordingly, especially for type 1 indexes with many duplicate values, either for particular keys or across the whole key range; a type 2 index is the better choice.
  • Indexes can be built automatically when loading data, but this takes time. They must also be recovered if the underlying table space is recovered, which is time consuming.

Matching Index Scan

A matching index scan provides filtering. This is possible if predicates are specified on the leading columns (or all columns) of the index key. If the degree of filtering is high, a matching index scan is efficient. MATCHCOLS gives the number of matching columns. If more than one index exists, DB2 uses the index with the most restrictive filtering for the matching index scan.

Matching Index Scan Example:  
Assume there is an index on T(C1,C2,C3,C4):
       SELECT * FROM T WHERE C1=1 AND C2>1 AND C3=1;
There are two matching columns in this example. The first one comes from the predicate C1=1, and the second one comes from C2>1. The range predicate on C2 prevents C3 from becoming a matching column.

Note :  If a matching predicate is a range predicate, then there can be no more matching columns
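The matching-column rule can be sketched in Python (an illustrative helper, not anything DB2 provides):

```python
def matchcols(index_columns, predicates):
    """Count matching columns for a matching index scan.

    predicates maps a column name to 'eq' (equality) or 'range'.
    Matching stops at the first index column with no predicate,
    and a range predicate is the last column that can match.
    """
    count = 0
    for col in index_columns:
        op = predicates.get(col)
        if op is None:
            break          # leading column unused: no further matching
        count += 1
        if op == 'range':
            break          # range predicate ends the matching columns
    return count

# Index on T(C1,C2,C3,C4) with WHERE C1=1 AND C2>1 AND C3=1
print(matchcols(['C1', 'C2', 'C3', 'C4'],
                {'C1': 'eq', 'C2': 'range', 'C3': 'eq'}))  # 2
```

With no predicate on the leading column C1, the same helper returns 0, which corresponds to the index screening case described below.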

Consider the following example
SELECT * FROM T WHERE C2 = 5 AND C3 = 10 
There are no matching predicates for this query, because the leading index column C1 is not used in any predicate. DB2 scans the whole index, applying C2 and C3 as screening predicates; this is called index screening.

Non-Matching IX scan 

This is also called index screening. DB2 selects index screening when predicates are specified on index key columns but are not part of the matching columns. Index screening predicates improve index access by reducing the number of rows that qualify while searching the index. MATCHCOLS = 0 and ACCESSTYPE = I.

For example, with an index on T(C1,C2,C3,C4):
     SELECT * FROM T WHERE C1 = 1 AND C3 > 0 AND C4 = 2 AND C5 = 8;
C3>0 and C4=2 are index screening predicates. They can be applied on the index, but they are not matching predicates. C5=8 is not an index screening predicate, and it must be evaluated when data is retrieved. The value of MATCHCOLS in the plan table is 1.

Because a non-matching index usually provides no filtering, there are only a few cases when it is an efficient access path. The following situations are examples:

  • When there are index screening predicates
  • When the clause OPTIMIZE FOR n ROWS is used. That clause can sometimes favor a non-matching index, especially if the index gives the ordering of the ORDER BY clause.
  • When there is more than one table in a non-segmented table space. In that case, a table space scan reads irrelevant rows; by accessing the rows through the non-matching index, there are fewer rows to read.

IN LIST Index Scan 

An IN-list index scan is a special case of the matching index scan, in which a single indexable IN predicate is used as a matching equal predicate. PLAN_TABLE shows MATCHCOLS > 0 and ACCESSTYPE = N.

You can regard the IN-list index scan as a series of matching index scans with the values in the IN predicate being used for each matching index scan. 

The following example has an index on (C1,C2,C3,C4) and might use an IN-list index scan:
  SELECT * FROM T WHERE C1=1 AND C2 IN (1,2,3) AND C3>0 AND C4<100;
The plan table shows MATCHCOLS = 3 and ACCESSTYPE = N. The IN-list scan is performed as the following three matching index scans:
  
 (C1=1,C2=1,C3>0), (C1=1,C2=2,C3>0), (C1=1,C2=3,C3>0)
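The expansion of the IN predicate into a series of matching scans can be sketched as follows (hypothetical helper, for illustration only):

```python
def in_list_scans(prefix_preds, in_col, in_values, suffix_preds):
    """Expand a single IN predicate into the equivalent series of
    matching index scans, one per value in the IN list."""
    return [prefix_preds + [(in_col, '=', v)] + suffix_preds
            for v in in_values]

# WHERE C1=1 AND C2 IN (1,2,3) AND C3>0 on index (C1,C2,C3,C4)
scans = in_list_scans([('C1', '=', 1)], 'C2', [1, 2, 3], [('C3', '>', 0)])
for s in scans:
    print(s)   # three matching index scans, as in the text above
```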

Multiple Index Scan 

Multiple index access uses more than one index to access a table. It is a good access path when no single index provides efficient access, or when a combination of index accesses does. List sequential prefetch is used as RIDs are collected from each index scan. ACCESSTYPE = M, MI, MU or MX and PREFETCH = L. The same index may also be scanned more than once.

DB2 chooses multiple index access for the following query:
    SELECT * FROM EMP WHERE (AGE = 34) OR (AGE = 40 AND JOB = 'MANAGER');
For this query:

EMP is a table with columns EMPNO, EMPNAME, DEPT, JOB, AGE, and SAL.
EMPX1 is an index on EMP with key column AGE.
EMPX2 is an index on EMP with key column JOB.

Stages of evaluation by DB2 : 

  1. Index EMPX1, with matching predicate AGE= 34, provides a set of candidates for the result of the query. ACCESSTYPE = MX
  2. Index EMPX1, with matching predicate AGE = 40, also provides a set of candidates for the result of the query. ACCESSTYPE = MX
  3. Index EMPX2, with matching predicate JOB='MANAGER', also provides a set of candidates for the result of the query. ACCESSTYPE = MX
  4. INTERSECTION on RIDs returned from 2 and 3. ACCESSTYPE = MI
  5. UNION on RIDs returned from 1 and 4. ACCESSTYPE = MU
Note: RID is Row Identifier in DB2 pages
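The RID operations in steps 4 and 5 can be sketched with Python sets (the RID values are made up for illustration):

```python
# RIDs qualifying from each index scan (hypothetical values)
rids_age34 = {101, 205, 330}            # EMPX1: AGE = 34        (MX)
rids_age40 = {140, 222, 310, 415}       # EMPX1: AGE = 40        (MX)
rids_manager = {140, 310, 500}          # EMPX2: JOB = 'MANAGER' (MX)

step4 = rids_age40 & rids_manager       # MI: AGE = 40 AND JOB = 'MANAGER'
step5 = rids_age34 | step4              # MU: final RID list for the OR

print(sorted(step5))                    # pages then fetched via list prefetch
```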

One Fetch Access 

One-fetch index access requires retrieving only one row. It is the best possible access path if available.

One-fetch index access is possible when:

  • There is only one table in the query. 
  • The column function is either MIN or MAX, with an ascending index column for MIN or a descending index column for MAX.
  • There are either no predicates, or all predicates are matching predicates for the index, and there is no GROUP BY. 

One more requirement: the column function must be applied to one of the following:

  • The first index column, if there are no predicates
  • The last matching column of the index, if the last matching predicate is a range type
  • The next index column (after the last matching column), if all matching predicates are equality type
The following queries use one-fetch index scan with an index existing on 
T(C1,C2 DESC,C3):

  SELECT MIN(C1) FROM T;
  SELECT MIN(C1) FROM T WHERE C1>5;
  SELECT MIN(C1) FROM T WHERE C1>5 AND C1<10;
  SELECT MAX(C2) FROM T WHERE C1=5;
  SELECT MAX(C2) FROM T WHERE C1=5 AND C2>5;
  SELECT MAX(C2) FROM T WHERE C1=5 AND C2>5 AND C2<10;
  SELECT MAX(C2) FROM T WHERE C1=5 AND C2 BETWEEN 5 AND 10;
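The idea behind one-fetch access can be sketched by modeling the C1 index as a sorted list (illustrative only, not how DB2 stores indexes):

```python
import bisect

# Ascending C1 key values from an index on T(C1, C2 DESC, C3)
c1_index = [1, 3, 5, 7, 9, 12]

# MIN(C1) WHERE C1 > 5: position on the first key after 5 and read it.
pos = bisect.bisect_right(c1_index, 5)
print(c1_index[pos])        # a single fetch answers the query

# MIN(C1) with no predicate is simply the first index entry.
print(c1_index[0])
```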

Index Only access 

With index-only access, the access path does not require any data pages because the access information is available in the index. Because the index is almost always smaller than the table itself, an index-only access path usually processes the data efficiently. ACCESSTYPE = I AND INDEXONLY = Y.

The following SQL statements provide index-only access with table T(C1,C2,C3,C4,C5) and an index X(C1,C2):
  SELECT C1, C2 FROM T WHERE C1 > 0;
  SELECT C1, C2 FROM T;
  SELECT COUNT(*) FROM T WHERE C1 = 1;

JOIN 

A join operation retrieves rows from more than one table and combines them. The operation specifies at least two tables, but they need not be distinct. At the application level, joins are inner join, left outer join, right outer join and full outer join. Internally, DB2 uses three join methods - nested loop join, merge scan join and hybrid join. Hybrid join is not used for outer joins.

Nested Loop Join (METHOD =1)

Two tables are picked initially: one is used as the outer table and the other as the inner table. For each row in the outer table, the inner table is scanned for matching rows, and a composite table is built from the matches. This composite table is used as the outer table at the next stage, and the process continues until all tables have been joined. The composite table is sorted when the order of the join columns on the two tables is not the same.

Example : 
SELECT * FROM T1, T2, T3 WHERE T1.C1 = T2.C1 AND T2.C2 = T3.C2 AND T3.C3 = 5
Say T1 has 20 rows, T2 has 100 rows and T3 has 10 million rows. Also assume there are 2 million rows in T3 with T3.C3 = 5.

If T3 is chosen as the first table of the join and T2 as the next, then T2 will be scanned 2 million times, and T1, as the last table of the join, will also be scanned 2 million times. 

Instead, DB2 chooses the smallest table, T1, as the first table and T2 as the next, so T2 is scanned 20 times and the composite table is formed. T3 is then the new table of the join and is scanned 20 times. 

DB2 need not scan the whole inner table if a highly clustered index is available on the inner table's join columns. 

Nested loop join is efficient when:

  • The outer table is small, or predicates with a small filter factor reduce the number of qualifying rows in the outer table.
  • The number of data pages accessed in the inner table is small. 
  • A highly clustered index is available on the join columns of the inner table. 
  • Filtering for both tables (outer and inner) is high.
For a case like the example below, with a unique index on T1.C2, DB2 detects that T1 has only one row that satisfies the search condition. DB2 makes T1 the first table in a nested loop join.
   SELECT * FROM T1, T2 WHERE T1.C1 = T2.C1 AND T1.C2 = 5; 
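As a rough illustration, a nested loop join over in-memory rows might look like this (a sketch of the technique, not DB2's implementation):

```python
def nested_loop_join(outer, inner, okey, ikey):
    """For each qualifying outer row, scan the inner table for
    matches and append the combined row to the composite table."""
    composite = []
    for orow in outer:
        for irow in inner:          # inner table scanned per outer row
            if orow[okey] == irow[ikey]:
                composite.append({**orow, **irow})
    return composite

t1 = [{'C1': 1, 'A': 'x'}, {'C1': 2, 'A': 'y'}]
t2 = [{'C1': 1, 'B': 10}, {'C1': 1, 'B': 20}, {'C1': 3, 'B': 30}]
print(nested_loop_join(t1, t2, 'C1', 'C1'))
```

The cost argument in the example above follows directly: the inner loop runs once per qualifying outer row, which is why a small, well-filtered outer table matters.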

Merge Scan Join (Method = 2)

DB2 scans both tables in the order of the join column. If there is no efficient index to provide that order, DB2 might sort either or both tables. DB2 reads a row from the outer table and keeps reading the inner table as long as there is a match. When there is no match, DB2 reads another row from the outer table. If the outer table has a new value, DB2 searches ahead in the inner table. 

Consider following example
SELECT A,B,X,Y FROM OUTER, INNER WHERE A>=10 AND B = X
Say tables OUTER and INNER are very big, and A>=10 does not provide much filtering, so a nested loop join is not efficient. Also assume there is no index on column B of OUTER or on column X of INNER. 

Algorithm :

Sort OUTER on B and INNER on X  in same order
Do until end of OUTER table is reached
      Read a qualifying row from the OUTER table 
      IF the row has same value in the join column THEN
      read from previous position on INNER table while a match is there
      ELSE
      read from the next position on INNER table while a match is there  
      END-IF  
      previous position = starting position of INNER table read
      next position = starting position of INNER table read + no of row reads from  INNER table
END-DO

Merge scan is used when : 

  • The qualifying rows of the inner and outer tables are large, and the join predicates do not provide much filtering.
  • The tables are large and have no indexes with matching columns.
In the previous example, sorting of the OUTER and INNER tables is not required if an index is available on join columns B and X respectively, in the same order. If an index is available on the join column of only one table, the other table needs to be sorted on its join column in the same order. The OUTER table is scanned only once, and the whole INNER table is not scanned for every row of the OUTER table, so this can perform better than a nested loop join, where the inner table is scanned for every row of the outer table.  
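The pseudocode above can be expressed as runnable Python (a simplified sketch: instead of tracking previous/next positions, it remembers the start of each matching run so duplicate outer keys re-read the same inner rows):

```python
def merge_scan_join(outer, inner, okey, ikey):
    """Merge scan join: both inputs are first sorted on the join
    key, then each is read forward once, with inner runs re-read
    for duplicate outer keys."""
    outer = sorted(outer, key=lambda r: r[okey])   # the sort step
    inner = sorted(inner, key=lambda r: r[ikey])
    result, i = [], 0
    for orow in outer:
        # skip inner rows below the current outer key
        while i < len(inner) and inner[i][ikey] < orow[okey]:
            i += 1
        # consume the run of matching inner rows; i stays at the run
        # start, so a duplicate outer key re-reads the same run
        j = i
        while j < len(inner) and inner[j][ikey] == orow[okey]:
            result.append({**orow, **inner[j]})
            j += 1
    return result

outer = [{'B': 2, 'A': 'p'}, {'B': 5, 'A': 'r'}, {'B': 2, 'A': 'q'}]
inner = [{'X': 5, 'Y': 20}, {'X': 2, 'Y': 10}]
print(merge_scan_join(outer, inner, 'B', 'X'))
```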

Hybrid Join(Method=4)

Hybrid join is only used for inner joins and requires an index on the join column of the inner table. DB2 joins the outer table with RIDs from the index on the inner table: the index is scanned for each row in the outer table, the RIDs are sorted into page order, the data is retrieved from the inner table using list prefetch, and the inner-table data is concatenated with the outer table to form the result. 

Consider the following example with OUTER(A,B) and INNER(X,Y):
SELECT A,B,X,Y FROM OUTER, INNER WHERE A>=10 AND B=X
Assume there is a non-clustering index on column X of INNER, and that the qualifying rows from OUTER have duplicate values in column B.

If the index is clustered, SEQUENTIAL PREFETCH is efficient, so hybrid join is not used; hybrid join uses the LIST PREFETCH method for fetching pages of the INNER table. 

Algorithm:
DO until end of OUTER table is reached
     Read a qualifying row from the OUTER table
     Scan the index on INNER table to find the RIDs (Row Identifiers)
     Get a list of all RIDs and sort them in order of page number
     Use LIST SEQUENTIAL PREFETCH to get those pages from INNER table
     Merge the data from INNER table with OUTER table to form the composite table
END-DO

HYBRID JOIN is an efficient technique when the qualifying rows from the OUTER table have duplicate values in the join column, because the RIDs for the INNER table are sorted and duplicates removed. Data pages of INNER are therefore accessed only once for each value of the OUTER table's join column.

Hybrid join is used most often when a non-clustered index is available on the join column of the inner table and there are duplicate qualifying rows in the outer table. Hybrid join handles duplicates in the outer table well, as the inner table is scanned only once for each set of duplicate values. The prefetch method is LIST SEQUENTIAL.
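The RID-collection step can be sketched as follows (hypothetical data layout; a RID is modeled as a (page, slot) pair):

```python
def hybrid_rid_fetch(outer, inner_index, okey):
    """Collect RIDs from the inner-table index for every qualifying
    outer row, sort them into page order, remove duplicates, and
    report which pages are fetched (modeling list prefetch)."""
    rids = []
    for orow in outer:
        rids.extend(inner_index.get(orow[okey], []))
    fetch_order = sorted(set(rids))               # RID sort, de-duplicated
    pages_read = sorted({page for page, _ in fetch_order})
    return fetch_order, pages_read

outer = [{'B': 2}, {'B': 2}, {'B': 5}]            # duplicate B values
inner_index = {2: [(10, 1), (30, 2)], 5: [(10, 3)]}
order, pages = hybrid_rid_fetch(outer, inner_index, 'B')
print(order)    # duplicates from the repeated B=2 rows are gone
print(pages)    # each page read once despite duplicates
```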

Sequential Prefetch (Prefetch=S)

Sequential prefetch reads a sequential set of pages. The maximum number of pages read by a request issued from an application program is determined by the size of the buffer pool used.

Sequential prefetch is generally used for a table space scan. For an index scan that accesses 8 or more consecutive data pages, DB2 requests sequential prefetch at bind time. The index must have a cluster ratio of 80% or above. 

For 4KB buffer pools

    Buffer Pool Size    Pages Read by Prefetch
    <=223 buffers       8 pages for each asynchronous I/O
    224-999 buffers     16 pages for each asynchronous I/O
    1000+ buffers       32 pages for each asynchronous I/O
For 32KB buffer pools
    Buffer Pool Size    Pages Read by Prefetch
    <=16 buffers        0 pages (prefetch disabled)
    17-99 buffers       2 pages for each asynchronous I/O
    100+ buffers        4 pages for each asynchronous I/O
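The two tables can be expressed as a small lookup function (values taken directly from the tables above):

```python
def prefetch_pages(buffers, page_size_kb=4):
    """Pages read per asynchronous prefetch I/O, per the tables above."""
    if page_size_kb == 4:
        if buffers <= 223:
            return 8
        if buffers <= 999:
            return 16
        return 32
    if page_size_kb == 32:
        if buffers <= 16:
            return 0        # prefetch disabled
        if buffers <= 99:
            return 2
        return 4
    raise ValueError('unsupported page size')

print(prefetch_pages(500))       # 16
print(prefetch_pages(50, 32))    # 2
```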

List Sequential (Prefetch=L)

List sequential prefetch reads a set of data pages determined by a list of RIDs taken from an index. It is usually used with a single index that has a cluster ratio lower than 80%, and sometimes with a high-cluster-ratio index when the amount of data to be accessed is too small to make sequential prefetch efficient but large enough to require more than one regular read. It is always used to access data by multiple index access or hybrid join.

DB2 does not consider list sequential prefetch if the estimated number of RIDs to be processed would take more than 50% of the RID pool when the query is executed.

During execution, DB2 ends list sequential prefetching if more than 25% of the rows in the table (with a minimum of 4075) must be accessed.

When list sequential prefetch ends, the query continues processing by a method that depends on the current access path.

  • For access through a single index or through the union of RID lists from two indexes, processing continues by a table space scan.
  • For index access before forming an intersection of RID lists, processing continues with the next step of multiple index access. If there is no remaining step, and no RID list has been accumulated, processing  continues by a table space scan. While forming an intersection of RID lists, if any list has 32 or fewer RIDs, intersection stops, and the list of 32 or fewer RIDs is used to access the data.
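The bind-time and run-time thresholds described above can be sketched as simple checks (illustrative only):

```python
def list_prefetch_considered(estimated_rids, rid_pool_capacity):
    """Bind-time check: DB2 does not consider list prefetch if the
    estimated RID list would take more than 50% of the RID pool."""
    return estimated_rids <= rid_pool_capacity * 0.5

def list_prefetch_continues(rows_accessed, table_rows):
    """Run-time check: list prefetch ends if more than 25% of the
    table's rows (with a minimum of 4075) must be accessed."""
    threshold = max(table_rows * 0.25, 4075)
    return rows_accessed <= threshold

print(list_prefetch_considered(100, 1000))     # True
print(list_prefetch_continues(5000, 10000))    # False: over the 4075 floor
```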

Sequential Detection

If DB2 does not choose prefetch at bind time, it can sometimes do so at execution time; this method is called sequential detection. If a table is accessed repeatedly using the same statement (for example, SQL in a do-while loop), the data or index leaf pages of the table can be accessed sequentially. DB2 can use this technique when it did not choose sequential prefetch at bind time because of an inaccurate estimate of the number of pages to be accessed.

Sorting of data

A sort can happen on a new table or on the composite table. A sort is required by an ORDER BY or GROUP BY clause (SORTC_GROUPBY/SORTC_ORDERBY = Y), and to remove duplicates when DISTINCT or UNION is used (SORTC_UNIQ = Y). During nested loop and hybrid joins the composite table is sorted, and for a merge scan join both tables might be sorted to make the join efficient (SORTN_JOIN/SORTC_JOIN = Y).

After you run EXPLAIN, DB2 sorts are indicated in PLAN_TABLE. The sorts can be either sorts of the composite table or the new table. If a single row of PLAN_TABLE has a 'Y' in more than one of the sort composite columns, then one sort accomplishes two things. (DB2 will not perform two sorts when two 'Y's are in the same row.) For instance, if both SORTC_ORDERBY and SORTC_UNIQ are 'Y' in one row of PLAN_TABLE, then a single sort puts the rows in order and removes any duplicate rows as well. The only reason DB2 sorts the new table is for join processing, which is indicated by SORTN_JOIN.
A sort is also needed for subquery processing: the result of the subquery is sorted and put into a work file for later reference by the parent query. DB2 sorts RIDs into ascending page number order in order to perform list prefetch; this sort is very fast and is done entirely in memory. If a sort is required during cursor processing, it is done during OPEN CURSOR; once the cursor is closed and reopened, the sort is performed again.

When a noncorrelated IN or NOT IN subquery is present in the query, the results of the subquery are sorted and put into a work file for later reference by the parent query. The results are sorted because this allows the parent query to be more efficient when processing the IN or NOT IN predicate; duplicates are not needed in the work file and are removed. Noncorrelated subqueries used with =ANY, =ALL, NOT=ANY or NOT=ALL need the same type of sort as IN or NOT IN subqueries. When a sort for a noncorrelated subquery is performed, you see both SORTC_ORDERBY and SORTC_UNIQ in PLAN_TABLE, because DB2 removes the duplicates and performs the sort. SORTN_GROUPBY, SORTN_ORDERBY, and SORTN_UNIQ are not currently used by DB2.

View Merge

If a query uses a view, the view name is ultimately resolved to a table name; this process is called view merge. The statement that references the view is combined with the subselect that defined the view. This combination creates a logically equivalent statement, which is executed against the database.

View defining statement:
    CREATE VIEW VIEW1 (VC1,VC21,VC32)
    AS SELECT C1,C2,C3 FROM T1
    WHERE C1 > C3;
View referencing statement:
    SELECT VC1,VC21
    FROM VIEW1
    WHERE VC1 IN (A,B,C);
The subselect of the view defining statement can be merged with the view referencing statement to yield the following logically equivalent statement.

    Merged statement:

    SELECT C1,C2 FROM T1
    WHERE C1 > C3 AND C1 IN (A,B,C);

View Materialization

Views cannot be merged if the view definition involves column functions. In that case, view materialization is required.
Materialization is done in two stages:

  1. The view's defining subselect is executed against the database and the results are placed in a temporary copy of a result table.
  2. The view's referencing statement is then executed against the temporary copy of the result table to obtain the intended result
 Views cannot always be merged. In the following statements:

     View defining statement:
     CREATE VIEW VIEW1 (VC1,VC2) AS
     SELECT SUM(C1),C2 FROM T1
     GROUP BY C2;

     View referencing statement:
     SELECT MAX(VC1)
     FROM VIEW1;

column VC1 occurs as the argument of a column function in the view referencing statement. The values of VC1, as defined by the view defining subselect, are the result of applying the column function SUM(C1) to groups after grouping the base table T1 by column C2. No equivalent single SQL SELECT statement can be executed against the base table T1 to achieve the intended result. There is no way to specify that column functions should be applied successively.

Merge performs better than materialization. For materialization, DB2 uses a table space scan to access the materialized temporary result. DB2 materializes a view or table expression only if it cannot merge.

To reduce the size of the temporary result, DB2 attempts to evaluate certain predicates from the WHERE clause of the view referencing statement at the first step of the process rather than at the second step. 

Query Parallelism

When DB2 plans to access data from a table or index in a partitioned table space, it can initiate multiple parallel operations to reduce the response time for data or processor-intensive queries.

Two types of parallelism - 

  1. Query I/O parallelism - Manages concurrent I/O request for a single query. 
  2. Query CP parallelism  - Enables multitasking within a single query. Query is broken into parts and processed.
To expand the processing capacity available for processor-intensive queries even further, DB2 can split a large query across different DB2 members in a data sharing group. This is known as Sysplex query parallelism.

DB2 can use parallel operations for processing:

  • Static and dynamic queries.
  • Local and remote data access.
  • Queries using single table scans and multi-table joins.
  • Access through an index, by table space scan or by list prefetch.
  • Sort operations.
Parallel operations usually involve at least one table in a partitioned table space. Scans of large partitioned table spaces have the greatest performance improvements where both I/O and central processor (CP) operations can be carried out in parallel.

Partitioned vs. Nonpartitioned Table Spaces: Although partitioned table spaces show the most performance improvement, nonpartitioned table spaces might also benefit in processor-intensive queries:

  • For a merge scan join, the join phase can be processed in parallel because the sort work files can be partitioned before performing the join. The partitioning of the work files is possible only if the hardware sort facility is available at run time.
  • In the nested loop join, DB2 is more likely to choose parallelism if the outer table is partitioned.
Parallel processing is enabled using

  • DEGREE(ANY) on BIND and REBIND for static
  • SET CURRENT DEGREE = ANY  for dynamic
  • The virtual buffer pool parallel sequential threshold (VPPSEQT) value must be large enough to provide adequate buffer pool space for parallel processing
  • Degree = 1 disables parallel processing
When a planned degree exceeds the number of online CPs, it can mean that the query is not completely processor-bound, and is instead approaching the number of partitions because it is I/O-bound. In general, the more I/O-bound a query is, the closer the degree of  parallelism is to the number of partitions.
In general, the more processor-bound a  query is, the closer the degree of parallelism is to the number of online CPs,  and it can even exceed the number of CPs by one. For example, assume  that you have a processor-intensive query on a 10-partition table, and that this query  is running on a 6-way CPC. It is possible for the degree of parallelism to be up to 7 in this case.
To help DB2 determine the optimal degree of parallelism, use the utility RUNSTATS to keep your statistics current.  PLAN_TABLE shows the planned degree of parallelism in the columns ACCESS_DEGREE and JOIN_DEGREE.
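The rule of thumb in the example can be sketched as a small heuristic (an illustration of the text's reasoning, not DB2's actual costing):

```python
def planned_degree(partitions, online_cps, io_bound):
    """Rough heuristic from the text: an I/O-bound query's degree of
    parallelism approaches the number of partitions; a processor-bound
    query's degree approaches the number of online CPs, and can even
    exceed it by one."""
    if io_bound:
        return partitions
    return min(partitions, online_cps + 1)

# Processor-intensive query on a 10-partition table, 6-way CPC:
print(planned_degree(10, 6, io_bound=False))   # 7
print(planned_degree(10, 6, io_bound=True))    # 10
```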

DB2 features

The following features also could be useful

  • CASE expressions - replacing UNION and UNION ALL with CASE can give better performance
  • GLOBAL TEMPORARY tables - to avoid repeated joins where applicable
  • Online REORG - it can run in parallel with applications, as it does not stop application programs from accessing data while the reorg is in process
The DB2 CASE expression can be very useful if the application uses a lot of UNION and UNION ALL statements. 

Consider the following example

SELECT C1,C2,C3 FROM T1 WHERE C1 = 5 AND C2 = 10
UNION 
SELECT C1,C2,C3 FROM T1 WHERE C1=5 AND C3 = 7

Here table T1 is scanned twice, once for each SELECT.

Equivalent SQL using CASE: 

SELECT DISTINCT C1,C2,C3 FROM T1 WHERE C1 =5 AND (CASE WHEN C2=10 THEN 1 
WHEN C3=7 THEN   1
ELSE 0  END) = 1;

Here table T1 is scanned once. But all the predicates in the CASE expression are stage 2 predicates, so be careful when using CASE.

If you perform the same join many times, you can do it once, store the result in a GLOBAL TEMPORARY table, and later access that table to get the join result. Be careful when using GLOBAL TEMPORARY tables, as they cannot be recovered.

Risks

There is no golden rule for DB2 SQL tuning. Wrong analysis of performance data and access method information may lead to more performance overhead. While tuning SQL in a test environment, keep in mind that the amount of data and the DB2 subsystem setup are not the same as in production. A person with good knowledge of DB2 should be involved in the tuning activity.