Catterall Consulting: August 2009

I recently had the opportunity to work on an interesting challenge, involving the use of one DB2 system for both an online transaction processing (OLTP) and a business intelligence (BI) workload. Now, the system in question happened to be a DB2 data sharing group running on a parallel sysplex, and that opens up some interesting possibilities, but before getting to that I'd like to deal with the overall issue: can you successfully support OLTP and BI applications using the same mainframe DB2 system?

Some people would be pretty quick to answer that question with a negative, and the chief concern of those folks might be the unfavorable impact of BI queries on the performance of online transactions. Is that a valid concern? It certainly can be, depending on the extent to which one can or cannot eliminate areas of contention between the two workloads. Put another way, if OLTP- and BI-related SQL statements can execute without interfering with one another, use of one DB2 system for both workloads is a viable option. In this post, I'm going to briefly describe some sources of single-system OLTP and BI workload contention, and ways in which such contention can be either eliminated or at least largely mitigated. In a Part 2 entry that will follow in a few days, I'll zero in on the challenge of managing query parallelization in a single OLTP- and BI-supporting DB2 environment. On now to potential sources of OLTP-BI interference:

The disk subsystem. This, of course, has very much to do with whether or not OLTP- and BI-related SQL statements are targeting the same DB2 tables. You might be thinking, "Well of course they wouldn't - you'd have one set of tables for OLTP programs and another set for the query and reporting applications." I'll agree that this is the PREFERRED scenario versus having OLTP and BI users access the same tables. I'd certainly want to place the OLTP database objects and the BI objects on two different sets of disk volumes. That said, some mainframe DB2 shops do, in fact, use the same tables for OLTP and for query/reporting. It's not that they're crazy. They may have had to give query and reporting capabilities to a group of users in a quick and inexpensive way, and allowing online access to the OLTP tables might have been seen as the best near-term alternative. Working in these folks' favor is the fact that disk-level contention, while definitely not a non-issue, is not as severe a performance concern as it was back in the 1990s, before people had gigabytes of cache memory on their disk control units and before DB2 and System z had 64-bit addressing (buffer pools can be MUCH larger in DB2 V8 and V9 subsystems versus prior releases). More DB2 buffer pool and storage controller cache hits mean fewer accesses to spinning disk, and that means less contention. Still, if you use the same tables for OLTP and BI, keep an eye (via your DB2 monitor) on wait time per DB2 synchronous read I/O. I like to see a figure that's below 10 milliseconds.

Note that reduced disk contention is only one of the benefits associated with the use of separate tables for OLTP and BI applications. Physical table separation also provides an opportunity to make the BI tables more appropriate for a data warehouse environment, perhaps through database design changes (e.g., normalization/denormalization, or maybe a change to a dimensional design featuring so-called star schema table arrangements) and/or data transformations that might include replacement of codes with more user-meaningful data values.

The DB2 buffer pools. As with disk-level contention, reducing contention for buffer pool resources depends largely on the use of separate database objects for OLTP and BI purposes. Different tablespaces and indexes can be assigned to different buffer pools. If OLTP and BI applications have to use the same database objects (and I would hope that this would be a temporary set-up on the way towards physical separation), you will be primarily focused on keeping BI-related prefetch activity from pushing table and index pages accessed by OLTP programs out of the shared buffer pools. To this end, you might want to adjust VPSEQT (the threshold for the percentage of buffer pool space that can hold pages brought in via prefetch) down somewhat from the default of 80 for shared buffer pools. If you use query parallelism, you might also want to alter VPPSEQT (similar in concept to VPSEQT, but for pages brought in via queries that are split and executed in a parallel fashion) down somewhat from the default of 50 for shared pools.

In protecting OLTP transaction performance in a shared buffer pool environment, look to keep the rate of disk read I/Os at a reasonable level. I like it when the combined number of synchronous and asynchronous - i.e., prefetch - read I/Os is less than 100 per second. If the read I/O rate is high, you can generally reduce it by enlarging the size of the buffer pool in question (but don't over-burden server memory - the rate of demand paging from auxiliary storage, available via your z/OS monitor, should ideally be in the single digits or low double digits per second).

Locking. Once again, the best result is achieved through the use of separate tables for OLTP and BI. If dual-purpose tables are a necessity for a while, consider having the BI queries run with the UR (uncommitted read) isolation level (this can be accomplished by binding packages through which BI queries are executed with ISOLATION(UR), or by specifying WITH UR at the end of SELECT statements). UR results in essentially no locking. That, of course, means that queries could read uncommitted data changes that are subsequently rolled back because of an error situation encountered by a data-changing program, but this may not be a concern for BI users. BI queries often involve large-scale data aggregations, so a result obtained by a UR query is likely to be very close (if not identical) to one obtained with the default isolation level of cursor stability in effect, and very close is typically good enough for decision support applications.

If you can use separate tables for BI applications, put these (and their associated indexes) in databases other than those used for the OLTP tables (and here I'm using the term "database" in the very specific DB2 for z/OS sense). Databases have associated with them control blocks called database descriptors, or DBDs, and the S-lock taken on a DBD by a dynamic SQL statement can conflict with data definition language statements (e.g., CREATE, ALTER, and DROP) targeting objects in the same database. If you have to use one set of tables for OLTP and BI, you can reduce DBD locks associated with dynamic SQL statements (and BI-related queries tend to be dynamic) by taking advantage of DB2's dynamic statement caching capability (dynamic statement caching can also provide you with a wealth of information about your dynamic SQL workload, this by way of the SQL statement EXPLAIN STMTCACHE ALL).

Long-running dynamic queries can also interfere with the drain locking mechanism used by DB2 for z/OS utilities such as online REORG. A running SELECT statement will cause a read claim to be acquired and held by the issuing process, and that claim will be held until the process commits. In a BI environment, commits are typically taken - often automatically by a query tool - for each statement after it completes, but the read claim can be held for a long time in the case of a long-running query, and some BI queries can run for hours. If the same tables have to be used for OLTP and BI purposes, you might need to shut off query submission at some time (e.g., 7:00 PM), and then allow two or three hours for any long-running queries to complete. After that time you may need to cancel threads associated with still-executing BI queries so that utilities targeting dual-purpose tablespaces and indexes will be able to run unimpeded until BI activity starts up again at the beginning of the business day.

CPU. Ideally, there will be enough CPU capacity on the DB2 system to satisfy the requirements of both the OLTP and BI workloads. If there is not enough CPU to go around, you can try to free some up through tuning the performance of OLTP- and/or BI-related SQL statements; otherwise, you may need to adjust z/OS Workload Manager (WLM) settings to favor one workload over the other, or perhaps to give priority to SQL statements that run quickly (weather OLTP- or BI-related) versus those that take longer to complete.

Organizations running DB2 in data sharing mode on a parallel sysplex have an interesting option that's not available in a standalone DB2 environment: they can use one or more members of the data sharing group exclusively for BI work, while other members process the data requests of OLTP programs. This arrangement can virtually eliminate OLTP-BI workload contention with respect to server memory and server processing capacity. Combine that with the use of separate tables for OLTP and BI, and voila: you have the two different workloads executing on the same DB2 system (the data sharing group) in a most non-contentious way.

Whether you're using DB2 in standalone or in data sharing mode, an important aspect of managing CPU contention between OLTP and BI applications running on the same system is the management of DB2 query parallelization, particularly as it pertains to the BI queries. I'll cover that topic in the part 2 companion to this blog entry, which I'll post within the next few days. As we say here in Georgia, y'all come back.

I recently fielded a question, from a DB2 person who works for an insurance company, about the performance implications of defining large parameters for DB2 for z/OS stored procedures. At this company, DB2 stored procedures written in COBOL had been created with PARAMETER STYLE GENERAL. Now, for the first time, a stored procedure with really large parameters (VARCHAR(32700), to be precise) was to be developed, and the DB2 person was concerned that the entire 32,700 bytes of such a parameter would be passed between the calling program and the stored procedure, even when the actual value of the parameter was an empty string (i.e., a string of length zero). He was thinking that the only way to avoid passing 32,700 bytes to a called stored procedure in cases when no value was to be assigned to a parameter would be to create the stored procedure with PARAMETER STYLE GENERAL WITH NULLS, and have the calling program set the value of the parameter to NULL by way of a null indicator.

I'll tell you that I had been under this same impression for some time, based on my reading of material in various DB2 manuals; however, it turns out that passing an unnecessarily large number of bytes between a calling program and a stored procedure is not a concern when the stored procedure is defined with PARAMETER STYLE GENERAL. I like the way that Tom Miller, a stored procedures expert with IBM's DB2 for z/OS development organization, explains this. He pictures a VARCHAR data type as being a box, in which you have two bytes for a length indicator and then N bytes of what you could call the defined data area. When a program calls a stored procedure that has a VARCHAR-defined parameter, only "active" data in the parameter is passed to DB2. The "inactive" portion of the VARCHAR "box" - that being the part of the defined data area past the current data length (as specified in the 2-byte length portion of the "box") - is undefined and therefore ignored. So, if a program sets the length of a VARCHAR-defined parameter to zero and then calls the associated stored procedure, only an empty string will be passed to DB2 (in other words, the entire VARCHAR "box" will NOT be passed).

This being the case, the choice of PARAMETER STYLE GENERAL or PARAMETER STYLE GENERAL WITH NULLS comes down to addressing the needs of the application developer. PARAMETER STYLE GENERAL WITH NULLS can result in slightly higher overhead versus PARAMETER STYLE GENERAL (it costs a bit more to pass a null value than it does to pass a zero-length VARCHAR), but PARAMETER STYLE GENERAL WITH NULLS can accommodate the full range of possibilities with respect to the values that can be passed with parameters, including the null value. In this light, PARAMETER STYLE GENERAL can be seen as an application simplification choice for programs that don't need to deal with null values.

I have to tip my hat to the DB2 person who sent this question to me, because before I was able to get back to him with a response, he had already executed a test that demonstrated the actual behavior, in terms of bytes passed, for a COBOL stored procedure created with a VARCHAR parameter (specified as INOUT, meaning that it would be used as both an input and an output parameter) and PARAMETER STYLE GENERAL (the calling program, in the case of this test, was another COBOL program, and it invoked the stored procedure by way of an SQL CALL - NOT a COBOL subroutine CALL). The calling program set the VARCHAR parameter to a long string of G's, then set the length of the VARCHAR to zero and called the stored procedure. The stored procedure displayed the length of the parameter and its contents, then set the value of the parameter to a long string of asterisks, set the length of the parameter to 5, and returned to the calling program. The calling program then displayed the length and contents of the parameter. Sure enough, the stored procedure displayed a parameter length of 0 (as set by the calling program) and no content, and the calling program (following return from the stored procedure) displayed a parameter length of 5 (as set by the stored procedure) and content consisting of 5 asterisks followed by a lot of G's (the 5 asterisks returned by the stored procedure overlaid the first 5 of the G's that had initially been moved to the VARCHAR "box" in the calling program).

[Note that there is a caveat with respect to parameters in a REXX program. These never arrive with any undefined portion (with any "air", as the previously cited Tom Miller of IBM puts it), as there is no concept of that in REXX variables.]

I hope that this information will be helpful to your own DB2 stored procedure development efforts. As I've mentioned in other posts to this blog, I'm very big on DB2 stored procedures, and I want people to use them effectively and successfully.

Previous Posts

Archives

Wednesday, August 26, 2009

OLTP and BI on the Same DB2 for z/OS System (Part 1)

Friday, August 14, 2009

DB2 for z/OS Stored Procedures and Large Parameters