Let’s
begin with what happened before we could create a package. You first
needed a Database Request Module (DBRM), which is the input to the
bind process and is created as one of the DB2 precompiler or SQL
coprocessor outputs. However, to discuss the DBRM, we need to go back
even further—to the first step in the program preparation process:
the DB2 precompiler.
If
your program is written in C, C++, COBOL, or PL/I, regardless of
whether it uses static or dynamic SQL, that program must be
“prepared” before you can execute it. The first step is
determining how to handle the SQL. Many languages may not be ready to
handle an SQL statement. In recent years, the option has been made
available to use the SQL coprocessor in the compile step rather than
running the program code through the DB2 pre-compiler. Both processes
have the same effect: modified source and a DBRM.
Since
DB2’s beginning, a method has been needed to “remove” SQL
before the compiler could actually process source statements. The
compiler had no idea how to process an SQL statement. The DB2
precompiler would replace the SQL statements with a CALL statement to
the appropriate language interface in a modified source file. The
actual SQL statement, in an internal format, would be added to a
separate object, the DBRM. The modified source would serve as input
to the compiler, and the DBRM as input to the bind.
Now,
if you have the correct compiler release levels, there’s an SQL
compiler option that lets the compile invoke the SQL coprocessor.
This completely eliminates the precompiler step and lets the compiler
“handle” an SQL statement and produce a DBRM. Both methods are
acceptable and both create DBRMs. From here on, though, we’ll refer
only to the precompiler step. The actual compiler step is executed,
and an object module is produced as output of the compiler and passed
to the linkage editor, which creates the executable representation of
the application program, the load module. Now, let’s get back to
that DBRM.
A
DBRM is one of the outputs of the precompiler (or coprocessor) and
contains all the program’s SQL in an internal format. The DBRM will eventually be input to the
bind process. Bind creates the executable form of the SQL, similar to
what the linkage editor does for the program. DBRMs can be bound into
plans or packages. Our focus is on packages because no one would
intentionally bind a DBRM directly into a plan anymore. There just
isn’t any reason for that today.
Packages
A
package is an executable representation of a single DBRM. Before the
introduction of packages, the DBRMs representing the programs linked
into a single load module were all bound into a single plan. Changes
made to a single DBRM meant the entire plan had to be bound. If the
DBRM was used in more than one plan, all plans containing that DBRM
had to be bound. Depending on the number of DBRMs included in a plan,
this could be a lengthy, disruptive process.
With
packages, though, the process is simpler and offers some built-in
fallback. Each package represents a single bound DBRM. A package is a
direct, one-to-one relationship between an application program, the
DBRM created by that application and used as input to the package
BIND, and the eventual package itself. Even the package name is the
same as the DBRM used to create it. A package name can’t even be
specified at BIND time because there’s no package-name keyword on
the BIND package statement. The package always assumes the name of
the DBRM specified by the member or copy keyword. When a program is
changed, creating a new DBRM, only that single package needs to be
bound. At the next program execution, if versions aren’t being
used, the plan will pick up the new package. No massive plan bind is
needed.
You
can’t specify a package name at bind time, but must specify a
collection-id. A collection-id is a way of associating similar
packages. However, a collection won’t exist on its own. It has
meaning only when it’s used to identify a package. For example, a
collection could contain all the payroll packages. To reference a
specific package, PAYROLL.packagename would have to be coded, where
PAYROLL is the collection-id. One advantage of using collections is
referencing multiple packages from the same collection. A package
list could simply be coded as PAYROLL.* rather than PAYROLL.PACKAGE1,
PAYROLL.PACKAGE2, etc. In fact, if all packages used by this plan
belong to the PAYROLL collection, PAYROLL.* would be the
best-performing choice.
Another
advantage of packages is versioning, the ability to keep multiple
copies of the same package. You can maintain multiple versions by
using package versions or by placing the same package name in
multiple collections. A package version-id, if used, is created using
the optional precompiler keyword, VERSION. A version can have a
maximum length of 64 bytes and can be a user-defined string or
generated automatically using the consistency token. If not
specified, a blank string is used.
The
combination of package name and version-id makes the package unique.
By changing the version-id each time a program is precompiled, DB2
can have multiple copies of the same package available, each at a
different “version.” By changing which load module is executed
alters which package DB2 will reference. This same effect can also be
accomplished by binding different copies of the same package into
different collections. This method relies on manually moving packages
from one collection to another, however, or changing the SET CURRENT
PACKAGESET value. If you want to take advantage of versions, ensure
you have a process to clean up older versions of packages no longer
in use. Allowing the uncontrolled, unmanaged creation of multiple
packages with different versions only increases the size of your
catalog tables and will eventually become a management problem, if
not a performance issue.
One
difference between plans and packages is their use of the EXECUTE
privilege. This can sometimes be confusing for the package user
because DB2 has three different definitions, or implementations, for
the same privilege keyword. When the EXECUTE privilege is granted on
a package, the privilege allows only the authid to include that
package in the package list of the plan being bound. If that same
EXECUTE privilege is granted on a plan, the authid granted the
privilege can execute that plan. In DB2 for z/OS, only plans can be
executed. A package can’t normally be explicitly accessed except
through a plan. Even packages bound on DB2 for z/OS from a
workstation, where there’s no concept of a plan, still run under a
special DB2 system plan called DISTSERV. Finally, DB2 allows the
EXECUTE privilege to be granted over an entire collection. Granting
EXECUTE in this fashion allows the authid to include any package in
that collection into a plan’s package list and allows the inclusion
of any packages added to this collection in the future.
The
Search
The
key to understanding how packages will affect your applications is to
understand how DB2 decides which package in the package list it
should be using. Whatever the connection type, a program always calls
the DB2 language interface. This CALL statement is added to the
source code by the DB2 precompiler replacing the corresponding SQL
statement. A modified source file, with the SQL commented out and its
new call statement added, is compiled and linked. The original SQL is
extracted and placed into a DBRM and then used as input to the
package bind discussed earlier.
The
CALL statement’s parameter list contains the DBRM name, consistency
token, SQL section number, and SQL statement number. The DBRM name is
the name of the program containing the SQL statement. This also is
the program name. The DBRM name is used to decide which package in
the package list should be loaded when the program runs.
The
consistency token is a unique identifier used to ensure that a load
module uses the correct plan, package, or version of a package.
Although the consistency token is sometimes referred to as a
timestamp, it’s not. DB2 creates the consistency token from an MVS
ISO timestamp, usually STCK format. However, that’s where the
similarity ends. From there, DB2 manipulates the value into something
called CONTOKEN.
Specifying
the LEVEL keyword at precompile time can override the use of the
timestamp when building the consistency token. When LEVEL is used,
the value supplied by the LEVEL keyword is used as the consistency
token instead of a timestamp. Users have been cautioned about using
the LEVEL keyword. With packages, there’s yet another reason to
avoid this option. When LEVEL is specified, and VERSION is
automatically generated, bind will use the LEVEL as the VERSION. This
defeats the purpose of using VERSION, which implies the use of
multiple copies of the same package (while LEVEL implies reuse).
The
SQL statement number in the parameter list corresponds to the state
ment
number of the modified call statement in the precompiler output
listing PCLIST.LISTING. This number has nothing to do with the
statement numbers in the original source listing. For now, we’re
concerned only with the DBRM name and consistency token and how
they’re used when passed to DB2. These are the only two pieces of
information from the program usually used to find the correct
DBRM/package in the plan.
Every
plan contains a plan header, sometimes called a header page or
section 0. This section contains a directory of all DBRMs bound
directly into a plan followed by a list of all packages this plan can
use. Using the DBRM name passed to DB2 from the call statement’s
parameter list, the DBRM directory in section 0 is searched first for
the existence of DBRMs bound directly into the plan. However, the
search examples used in this article assume no DBRMs were bound
directly into the plan. The search will therefore be limited to the
package list in the plan header.
The
list of packages in the plan header is searched serially. The order
of the packages is the same as originally specified in the BIND PLAN
command’s PKLIST keyword. The format of a package entry in the
package list (PKLIST) is the concatenation of location,
collection-id, and package-name, although location isn’t required.
When the DBRM name passed from the call statement matches a
package-name in the package list, a search value is built. It
consists of the padded package list entry location, collection-id,
and package-name with the consistency token from the call statement
appended. This search argument is used against the DSNSPT01 index on
the Skeleton Package Table (SKPT) to test for the package’s
existence.
The
first key column of the DSNSPT01 index is SPTPID. The three values
used to form this column follow Distributed Relational Database
Architecture (DRDA) rules. This means the location is 18 characters
rather than DB2’s 16 (128 in V8) and package-name is also 18
characters rather than DB2’s eight (128 in V8). Collection-id is 18
characters, just as it is in DB2 (again 128 in V8). Besides the
SPTPID, the index key also consists of the RDS number and the section
number. If a “row not found” condition exists, the package list
search continues with the next package that qualifies in the plan’s
package list. An error is returned to the application process (-805)
if no match is found after searching all the packages in the list.
However, when the I/O is successful, the package header, or section 0
of the package, is moved into the EDM pool and used to verify that
the SQL sections are contained in this package. When the appropriate
section is needed, that section is moved into the EDM pool if it
isn’t already present.
You
can now build other scenarios. For example, what if you have a
collection-id.*? Then any DBRM name from the call list will always
qualify. The asterisk (*) works as a wild card for all packages in
that collection. There’s no additional search necessary for that
collection. If this package list contained only the one entry,
collection-id.*, the shortest possible search is performed. If there
were multiple collection-id.* entries in the package list, there’d
still be only one search necessary for each of the collection
entries. Contrast this with a long list of collection-id.package-name
entries where the entire list must be searched until a package is
found or a not found condition occurs. But even a long explicit list
of packages doesn’t necessarily translate to poor performance. DB2
can sufficiently search a fairly long list of packages before it has
a performance impact.
For
DBRMs in a plan with or without packages, a binary search is used.
For packages, a hashing algorithm is used to check if the package has
already been allocated to the plan. If not, then the package list is
searched in the order it was specified on BIND.
The
CURRENT PACKAGESET special register affects the package search in two
ways. First, it’s required if an * is used in place of a
collection-id. A collection-id must be present for the search and an
* doesn’t qualify. Whenever an * is used in place of the
collection-id in the package list, the SET CURRENT PACKAGESET SQL
statement must be coded in the program and executed before using any
SQL in a package referenced by that collection. If not, the program
will receive a negative SQLCODE (-812) at run-time. There’s one
major drawback when using an * as the collection-id in the package
list. DB2 verifies that the plan owner has the execute privilege on
the package at run-time, not at plan bind. Performing authorization
table lookups at run-time can be costly.
Second,
when CURRENT PACKAGESET special register is set to a non-blank
string, it limits the package search to the collection-id specified
in CURRENT PACKAGESET. This can be useful when multiple collections
are specified in the package list and the application wants to limit
how much searching is performed. Changing the value of the CURRENT
PACKAGESET special register is also a handy technique used to change
which package is being referenced at run-time.
The
version-id specified at precompiler time plays no direct part in this
search. Only the DBRM name and consistency token are used. The
version-id allows only multiple rows to be inserted into the package
catalog tables for this package. Remember there’s always a
one-to-one correlation between a version and consistency token.
Because a version can only be created by the pre-compiler, every
version will have a unique consistency token.
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.