WLM Notes
1. Goal Definition & Performance Index (PI)
Execution velocity goals
- Are very sensitive to configuration changes
Unachievable/unrealistic velocity goals, e.g. a goal of 90
– Check the velocities of SYSTEM and SYSSTC to determine the highest achievable velocities
– Smaller n-way partitions will necessitate lower velocity goals
A velocity of 99 (conceptually 100) is not physically attainable. A service class with such a goal will always underachieve (PI > 1), so WLM will keep trying to give it more help unless or until the PI gets over 4. At that point WLM gives up trying to help, and performance can get considerably worse.
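To make the relationship between a velocity goal and the PI concrete, here is a minimal sketch. It assumes the usual definitions (execution velocity = using samples / (using + delay samples) x 100, and PI for a velocity goal = goal / achieved velocity); the function names and sample counts are hypothetical, not taken from any WLM or RMF interface.

```python
def execution_velocity(using_samples: int, delay_samples: int) -> float:
    """Execution velocity in percent: share of samples in which work was using resources."""
    total = using_samples + delay_samples
    if total == 0:
        raise ValueError("no using or delay samples recorded")
    return 100.0 * using_samples / total


def velocity_pi(goal: float, achieved: float) -> float:
    """Performance index for a velocity goal: goal divided by achieved velocity."""
    if achieved <= 0:
        raise ValueError("achieved velocity must be positive")
    return goal / achieved


# Hypothetical interval: 60 using samples, 40 delay samples.
achieved = execution_velocity(using_samples=60, delay_samples=40)
print(achieved)                   # 60.0
print(velocity_pi(90, achieved))  # 1.5, the goal of 90 is missed
print(velocity_pi(50, achieved))  # below 1, the goal of 50 is overachieved
```

With a goal of 99, the PI stays above 1 for any achieved velocity below 99, which is why such a goal keeps WLM permanently busy.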
Define the Goals
Base your definitions on measurements
- Execution velocity goals in particular can’t be defined without knowing the data
- Use periods of high utilization or contention as a base
If you want WLM to always keep an eye on your service class
- A slightly “high temperature” isn’t a bad thing
- Only if that doesn’t really help, use CPU Critical or Storage Critical
- These can be avoided in many cases
Always define goals based on measurement intervals of high system utilization or high system contention periods. This gives you the best indication of whether the goals can be achieved.
You should base goals on what can actually be achieved, not what you want to happen. The goals you set are how WLM knows whether your system is running OK, or whether it needs to make changes. If you set goals higher than necessary, WLM moves resources from lower importance work to higher importance work which might not actually need the resources.
Velocity goals should differ by at least 10 to be effective and meaningful. Any service classes with goals closer than 10 should be evaluated for combination into one service class.
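The 10-point rule can be checked mechanically against a list of goals. The sketch below is only an illustration: the service class names and goal values are made up, and it compares every pair of classes regardless of importance, which is a simplifying assumption.

```python
from itertools import combinations


def close_velocity_goals(goals: dict[str, int], min_gap: int = 10) -> list[tuple[str, str]]:
    """Return pairs of service classes whose velocity goals differ by less than min_gap."""
    pairs = combinations(sorted(goals.items()), 2)
    return [(a, b) for (a, goal_a), (b, goal_b) in pairs if abs(goal_a - goal_b) < min_gap]


# Hypothetical service classes and velocity goals.
goals = {"BATCHHI": 40, "BATCHMD": 35, "ONLINE": 60}
print(close_velocity_goals(goals))  # [('BATCHHI', 'BATCHMD')]
```

Each pair returned is a candidate for review, not an automatic merge.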
2. Protecting Work & CPU Management
Protecting Work: CPU
To prevent lower importance work from getting a higher dispatch priority than higher importance work
= Define a tight goal!
- 0.9 < Performance Index < 1.2
- Ensures that a high importance service class does not become a donor
- But also requires a “constant” flow of work
= Or use CPU Critical as the last resort
CPU Critical only protects that work from lower importance work; it gives no protection from work at the same or higher importance. It is better to have the right goal.
3. CPU Critical Usage Guidelines
When to use CPU Critical?
Use it seldom
- For 1 or 2 service classes
- At best only for importance 1
- Suitable work doesn’t consume much CPU, doesn’t show high CPU consumption spikes, and needs fast access to CPU
- Critical server address spaces
Rule: use a rule of thumb similar to the one for work which you would place in SYSSTC
- Less than 20% of 1 logical processor
- Less than 5 to 10% of total system consumption (depending on size of system)
- No high CPU consumption spikes
When running CICS/IMS with response time goals, and CPU Critical is necessary, designate both regions and transactions as CPU Critical
4. Discretionary Work & Capping
Discretionary work
To give discretionary work a better chance to run, certain service class periods may become donors to discretionary work:
◦ Velocity goal of 30 or less, or response time goal > 1 minute
◦ PI <= 0.81
◦ Not part of a resource group
In some cases, velocity goals of 31 (instead of 30) have been used to avoid becoming a donor
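The donor conditions can be written down as a small predicate. This is a sketch under the assumption that all three conditions must hold at the same time; the `ServiceClassPeriod` type and its field names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceClassPeriod:
    name: str
    pi: float
    in_resource_group: bool
    velocity_goal: Optional[int] = None           # set for velocity goals
    response_time_goal_s: Optional[float] = None  # set for response time goals


def may_donate_to_discretionary(p: ServiceClassPeriod) -> bool:
    """True if the period meets all three donor conditions listed in the notes."""
    loose_goal = (
        (p.velocity_goal is not None and p.velocity_goal <= 30)
        or (p.response_time_goal_s is not None and p.response_time_goal_s > 60)
    )
    return bool(loose_goal and p.pi <= 0.81 and not p.in_resource_group)


# Hypothetical periods: a goal of 31 instead of 30 takes the period out of scope.
print(may_donate_to_discretionary(ServiceClassPeriod("BATLOW", 0.7, False, velocity_goal=30)))  # True
print(may_donate_to_discretionary(ServiceClassPeriod("BATLOW", 0.7, False, velocity_goal=31)))  # False
```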
What to do if you don't like discretionary goal management (capping of work)
- Define a resource group without minimum and maximum
- Put the service class with a goal eligible for capping into this resource group
- Capping will not take place!
5. Service Class Design & Classification
Classify Work
Make sure that something really executes in your service classes
- Don’t define anything with less than 2% of CPU resource consumption
- BUT: classify distinct work to distinct service classes, for example:
  - Don’t classify short-lived enclaves and STCs together
  - Never classify CICS/IMS transactions together with anything else
- If you use multiple periods
  - Make sure that a substantial amount of work really ends in those periods
  - More than 3 periods are very often counterproductive
Not more than 25 (max 35) [actively running] service class periods with non-discretionary goals
Service classes should only be defined for work when sufficient demand exists for them in the system. You can measure the demand either by the service consumption or by the number of ending transactions during high utilization periods of this work. It probably does not make sense to define service classes with a demand of less than 1% of measured service units at any period or less than 1 transaction ending per minute. It is usually better to combine this work with other similar work.
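A quick way to apply these thresholds to measured data is sketched below. The function name and parameters are hypothetical; the inputs would come from your own measurements during high utilization periods, and either measure may be supplied on its own.

```python
from typing import Optional


def low_demand(
    service_unit_share_pct: Optional[float] = None,
    ended_per_minute: Optional[float] = None,
) -> bool:
    """True if any supplied demand measure falls below the thresholds from the notes."""
    checks = []
    if service_unit_share_pct is not None:
        checks.append(service_unit_share_pct < 1.0)  # less than 1% of measured service units
    if ended_per_minute is not None:
        checks.append(ended_per_minute < 1.0)        # less than 1 ending transaction per minute
    if not checks:
        raise ValueError("supply at least one demand measure")
    return any(checks)


# Hypothetical measurements.
print(low_demand(service_unit_share_pct=0.4))  # True: candidate for merging with similar work
print(low_demand(ended_per_minute=3.0))        # False
```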
And as always, keep the number of active service class periods within a range of 25 to 35. Note that service classes with discretionary goals do not count, because they are not managed to goals.
6. Resource Groups
How to use Resource Groups
Use of minimum
- For medium and lower importance work which needs to make some progress
- A resource group minimum supersedes all goals and all importance
- That means the resource group minimum comes first
To handle runaway jobs or transactions, consider creating a “sleeper” service class. Associate the service class with a Resource Group and specify a maximum service unit capacity of 1. “Quiescing a batch job” on page 208 of the “System Programmer's Guide to: Workload Manager” manual shows how to use this Resource Group setting for batch, and you can apply the same approach in other cases.
7. SYSSTC / SYSOTHER & Special Classes
Unclassified work will default to one of two places
– Started Tasks default to SYSSTC
– All other work defaults to SYSOTHER
Recommendations for SYSSTC
– DB2 IRLM and IMS IRLM: the lock manager needs a high dispatching priority in order to let work flow properly through the system
– “Emergency” TSO ID: only one TSO ID should be defined to SYSSTC
• All other TSO IDs should be grouped together, with no special high priority service class for system programmers or management
Monitor the SYSSTC service class’s CPU usage. If high importance and online systems experience delay caused by SYSSTC, adjust SYSSTC by moving heavy CPU users to a different service class.
Consider creating a special high velocity, high importance TSO service class for emergencies. If the system hangs, you can use the RESET command to assign your TSO user ID to this class to allow you to investigate. Note that you might still need to wait for WLM to identify that this service class is delayed and adjust its priority. Assigning the TSO user ID to SYSSTC is an alternative.
8. CICS & IMS Management
Management: CICS and IMS
WLM puts regions which process the same set of transactions into internal service classes
WLM creates up to 2^n - 1 internal service classes
Only create different service classes if the work really executes in different regions; otherwise the result is unpredictable
9. CICS & IMS – Goals (Response Time vs Velocity)
CICS and IMS – R.T. or Velocity Goal?
Which is the better way to manage online work?
Remember, WLM will set the dispatching priority for the region
– Need to have the CICS and IMS regions dispatched properly
– CICS and IMS have their own internal routines to decide which transaction to run within their regions
– If transactions 0101 and PRD1 both run in AOR1, CICS will decide which to dispatch, NOT Workload Manager
Velocity Goals for CICS and IMS
Velocity goals are acceptable for environments with only one partition, or sysplexes with similar sized partitions
– A sysplex with a 4-way and a 20-way may not be a good candidate
Can be used when the nature of the online transactions makes it unreasonable to classify them to transaction goals
– Vastly different types of transactions would skew response time distribution data
– Two transaction service classes in the same region will get the same dispatching priority
Velocity goals do need to be monitored and may need to be adjusted during any processor changes
– Processor upgrades, LPAR definition changes, etc.
Response Time Goals for CICS and IMS
3 major advantages of response time goals
– Easier to understand and can be set to a business SLA
– Normally no need to change when the environment is changed
– Can use the same goal across an entire parallel sysplex, regardless of individual partition size/speed
WLM needs to see at least three transactions complete within a 20-minute interval to get useful statistics for a response time goal
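To see whether a transaction service class delivers enough completions for a response time goal, completion timestamps can be bucketed into 20-minute intervals. The sketch below assumes fixed intervals starting at minute 0, which is a simplification for illustration and not a description of how WLM samples internally; the function name and timestamps are hypothetical.

```python
from collections import Counter


def sparse_intervals(completion_minutes: list[float], total_minutes: int,
                     interval: int = 20, min_count: int = 3) -> list[int]:
    """Return start minutes of intervals with fewer than min_count completed transactions."""
    counts = Counter(int(t // interval) for t in completion_minutes)
    n_intervals = -(-total_minutes // interval)  # ceiling division
    return [i * interval for i in range(n_intervals) if counts[i] < min_count]


# Hypothetical completion times (minutes since start) over one hour.
completions = [1, 5, 12, 25, 47, 50, 59]
print(sparse_intervals(completions, total_minutes=60))  # [20]
```

Intervals reported here are ones where a response time goal would have too few samples to be meaningful.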
10. CICS/IMS Special Handling & Best Practices
Response Time Goals for CICS and IMS
You may want to exclude test work from being managed towards response time goals
- Too few transactions, no constant utilization and/or no need to manage it in a sophisticated way
Execution Velocity Goals for CICS and IMS
Make sense for test environments
- Without a constant flow of transactions
- If test and production run on the same system or sysplex
  - Use response time goals for production
  - Use execution velocity goals for test
  - Exempt test regions from response time management (Option: REGION)
Problem
- CICS TORs need fast access to CPU to get work in and out of the system
- CICS TORs have to wait behind AORs
  - Same dispatch priorities
Solution
- De-couple TORs from response time management but ensure that response time management remains intact
  - Option BOTH
CICS and IMS: Option BOTH
- What to do
  - Separate CICS TORs and AORs into different service classes
  - TOR service class
    - Set to importance 1, define a “high” execution velocity goal; CPU Critical may be an option too
    - Exempt the TORs from being managed towards response time goals by using option BOTH
  - This ensures that the end-to-end context for CICS transactions remains intact
  - This ensures that CICS transactions (and the AORs) are still managed towards response time goals
  - This maintains all reporting features for CICS
- Why does it work
  - Typically TORs (as well as IMS CTL) only consume 5 to 10% of the CPU of all CICS/IMS work
11. Transaction & Address Space Behavior
Address space classification in transaction management mode
You always need to classify the address spaces through the STC or JES subsystem classification rules. The assigned service class goal is used only during initialization and termination of the address space, and when no transaction has run for one minute.
The CICS address space receives resources and has a dispatching priority. A CICS transaction runs with the resources and dispatching priority of the CICS region. So, we recommend that you have more Application Owning Regions (AORs) than service classes on each z/OS image to make workload management via WLM easier. This decreases the chance of transactions with different goals executing in the same address space. If that does happen, the transactions with less aggressive goals get a “free ride,” while WLM tries to honor the more aggressive and more important goals.
12. Multiple Periods & Monitoring
Notes on Multiple Periods
Workload Manager makes better decisions when there are more samples per service class period
Review the RMF Workload Activity Report for service class utilization by period
If one period of a multi-period service class is always much smaller than the other periods, consider consolidation
For example, a typical utilization pattern of a three-period service class:
– SCLAS Period 1 – APPL% = 71.1
– SCLAS Period 2 – APPL% = 0.37
– SCLAS Period 3 – APPL% = 138.0
In this case, period 2 should be combined with either period 1 or period 3
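The same check can be run over APPL% values taken from the RMF Workload Activity Report. In this sketch the 1% share threshold is an assumption chosen for illustration (the notes only say "much smaller"), and the function name is hypothetical.

```python
def small_periods(appl_pct_by_period: dict[int, float], min_share_pct: float = 1.0) -> list[int]:
    """Return periods whose share of the service class's total APPL% is below min_share_pct."""
    total = sum(appl_pct_by_period.values())
    if total <= 0:
        return []
    return [
        period
        for period, appl_pct in sorted(appl_pct_by_period.items())
        if 100.0 * appl_pct / total < min_share_pct
    ]


# APPL% values from the SCLAS example above.
print(small_periods({1: 71.1, 2: 0.37, 3: 138.0}))  # [2]
```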
13. Effect of a Diverse Mix of Transactions in CICS/IMS Regions
If there are too many service classes for CICS or IMS with response time goals, the management of these transactions can become a little unpredictable, because WLM manages the regions based on the mix of transactions that runs in these regions.
If this mix is too diverse, or even if there are just a few service classes with response time goals but some of them have low activity, the mix can lead to unpredictable results from a management point of view.
This is one of the reasons why you might want to consider whether response time goals for CICS and IMS are the best choice for your installation. Execution velocity goals might have their downsides, but they are much easier to correlate to what is running on the system.