Robert's Blog

Wednesday, August 27, 2008

DB2 for z/OS Performance Management -- Data Warehouse vs. OLTP

Much of the consulting work I've done lately (and that I will be doing over the next several weeks) has been focused on DB2 for z/OS-based data warehouse performance tuning. As this subject is very much on my mind, I've decided to share some related thoughts in my blog.

Managing performance in a data warehouse environment is a different animal relative to the same task when the DB2 application is transactional in nature (e.g., an order processing or an online banking application), because these system types differ in several important ways:
  • SQL statement run time: In some OLTP (online transaction processing) environments, more than a thousand database-accessing transactions are executed every second, on average, during peak processing hours. Each of these transactions will typically contain multiple SQL statements, each of which will usually complete within a small fraction of a second of wall-clock time (while consuming an even smaller fraction of a second of CPU time). In a data warehouse, some report-generating SELECT statements -- particularly of the month-end or quarter-end variety -- will run for several minutes, and maybe for an hour or more.
  • Concurrent threads: In a busy OLTP system, there could be several hundred (perhaps a thousand or more) concurrently active DB2 threads. A data warehouse is more likely to have fewer concurrently active threads (fewer in-flight units of work that remain in flight for longer periods of time, as noted above).
  • The "come-from" environment: Although stored procedures are an increasingly popular choice for mainframe DB2-based OLTP applications, it remains true that a whole lot of transaction-oriented SQL gets to DB2 for z/OS through local (to DB2) subsystems such as CICS and IMS/TM. In a data warehouse environment, it is much more likely that queries will get to DB2 (and associated result sets will be returned to requesters) via DB2 Connect and the DB2 for z/OS Distributed Data Facility (aka DDF).
  • Database design: A DB2 database used for OLTP work will likely have a traditional third-normal-form (or close to it) design. Increasingly, DB2 data warehouse databases are dimensional in their design, with sets of related tables arranged logically in a star-schema fashion.
  • Update windows: For an OLTP application, there often is no window of time during which database updates are processed. Instead, database updates happen 24/7. For many data warehouses, virtually all data updates are processed during nightly -- and often massive -- ETL runs (ETL is short for extract/transform/load). Query access is often unavailable while the ETL process is running, making timely completion of said process (e.g., by 6:00 AM each weekday morning) a matter of considerable importance.
  • Query result sets: These are generally rather small in an OLTP environment -- usually less than 100 rows, and sometimes only one or two rows. For a data warehouse, result sets that feed reports may contain many thousands of rows -- perhaps hundreds of thousands or more.
  • Query complexity: SELECT statements in online transactions tend to be quite simple -- maybe one or two tables accessed, and little or no dynamic table-building or data value or data type transformation. Some SELECTs associated with a data warehouse might be more than a page long (if you print them out), often involving large-scale joins, on-the-fly table-building (via nested table or common table expressions), recursive SQL (great for navigating hierarchies of data), data value transformation (via CASE expressions), and data type transformation (via CAST specifications and/or scalar functions).
  • Response time versus throughput: Response time is, of course, not unimportant for OLTP applications, but the relatively simple nature of the transactions (plus indexes on the right DB2 table columns) often leads to sub-second run times. On top of that, a change in a transaction's elapsed time from, say, 0.85 seconds to 0.95 seconds (or the other way around) is likely to go unnoticed by system users. The focus of performance tuning efforts is often throughput: the number of transactions that can be run through the system in a given period of time. For a data warehouse, the aim of performance tuning tends to be reduced run times for individual queries (from an hour to under 10 minutes, for example, or from 10 minutes to less than one minute).
  • SQL you have to deal with, versus SQL you wrote (or at least reviewed and approved): In a data warehouse environment, SQL might be generated by a reporting or OLAP tool (OLAP stands for online analytical processing), and a DB2 DBA might not be able to change it. Result: it has to be dealt with as-is. Furthermore, some SELECTs may be relatively ad-hoc in nature -- either generated in real-time by a query tool or built dynamically by client-side code based on the selection by a user of some combination of presented search criteria (along with ORDER BY and GROUP BY options). Not every column can be indexed, and it's not always easy to anticipate the predicates that will come in with queries. In the OLTP world, on the other hand, application code reviews can ferret out poorly-written SQL statements (which can then be recoded) as well as provide information that can lead to smart indexing decisions.
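To make the query-complexity point above concrete, here is a sketch of the kind of SELECT a data warehouse might throw at DB2: a recursive common table expression to walk a hierarchy, a CASE expression for data value transformation, and a CAST specification for data type transformation. The table and column names are invented for illustration -- this is not from any particular system.

```sql
-- Hypothetical example: roll product categories up to their top-level
-- ancestor via a recursive common table expression, then aggregate sales.
-- All table and column names are made up for illustration.
WITH cattree (cat_id, top_cat) AS
  (SELECT cat_id, cat_id
     FROM category
    WHERE parent_id IS NULL          -- anchor: top-level categories
   UNION ALL
   SELECT c.cat_id, t.top_cat       -- recursion: walk down the hierarchy
     FROM category c, cattree t
    WHERE c.parent_id = t.cat_id)
SELECT t.top_cat,
       s.channel,
       CASE WHEN s.channel = 'W' THEN 'WEB'    -- data value transformation
            ELSE 'STORE'
       END AS channel_desc,
       SUM(CAST(s.qty AS DECIMAL(15,2)) * s.unit_price) AS revenue  -- type transformation
  FROM sales s, cattree t
 WHERE s.cat_id = t.cat_id
 GROUP BY t.top_cat, s.channel
 ORDER BY revenue DESC;
```

Multiply that by a few more joins and a nested table expression or two, and you get the page-long SELECTs I mentioned.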
Although DB2 for z/OS data warehouse performance monitoring and tuning is largely SQL statement-centered, system performance optimization is still important (a well-tuned SELECT might still run poorly if the overall DB2 system is hobbled by a resource bottleneck). I always analyze system performance when I'm working with a DB2 for z/OS-based data warehouse. I'll list here a few of the system-related things that I like to look at, and in a subsequent post (probably sometime next week) I'll write about SQL statement analysis and tuning. So, some key system-level metrics and checklist items:
  • Memory -- In a previous post, I encouraged people to leverage the 64-bit real and virtual storage addressing capability that is relatively new on the DB2 scene (introduced with DB2 for z/OS V8). This is especially important for data warehouse workloads, which tend to be I/O intensive (lots of large index and table scans). Bigger buffer pools are almost always better, but be smart about growing them (also covered in the cited post) and know when they're big enough (I might consider putting the brakes on buffer pool enlargement if the rate of demand paging to auxiliary storage for the z/OS LPAR on which DB2 is running gets into the low single digits per second).
  • CPU efficiency -- Also in a previous post, I touted the CPU efficiency gains that could be realized through page-fixing DB2 buffer pools. This is a pretty easy change that can yield a decent return (not spectacular, but not bad, either). Also of major importance is the "page-look efficiency" of a DB2 for z/OS-based data warehouse system. By this I mean the ratio of GETPAGEs (requests by DB2 to look at index and tablespace pages) to the number of rows that are returned to requesters. Bring that ratio down (often through physical database design changes such as adding or altering indexes, re-clustering data rows, creating materialized query tables, etc.), and you've improved the CPU efficiency of your data warehouse workload (the CPU consumption of a DB2 workload is very much dependent on the level of GETPAGE activity in the system).
  • Query parallelism -- If your data warehouse system has some CPU headroom (i.e., if it's not consistently running at a utilization rate of 90% or more), consider trading some of the CPU "white space" for improved response time -- especially for queries that scan LOTS of tablespace and/or index pages -- through query CPU parallelism (if you're not already using this DB2 feature). It's been around since Version 4 of the mainframe DB2 product (and so is very much field-tested), and it can work across multiple DB2 subsystems in a parallel sysplex if you're running DB2 in data sharing mode. Concerned about one query splitting a lot and swamping your system? Don't be. First of all, you can put a cap on the degree to which DB2 will parallelize a query (you could limit this degree to 5 or 10, for example). Second, I don't know of any operating system that is the equal of z/OS when it comes to dynamically shifting processor resources among tasks in response to workload shifts. If a query splits 20 ways and is taking up a large chunk of CPU capacity in a z/OS LPAR that otherwise would not be very busy, so what? That's what parallelism is about. If some work does come into the system while that 20-way-split query is humming along, z/OS will throttle back resources allocated to the split query and reallocate them to the newly-arrived tasks. And, it does this very quickly. So don't be afraid to split.
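If you want to try query parallelism, the knobs involved are, roughly: the DEGREE(ANY) bind option for static SQL, the CURRENT DEGREE special register for dynamic SQL, and a subsystem-level cap on the degree of parallelism. A minimal sketch (the table and column names below are invented for illustration):

```sql
-- Dynamic SQL: ask DB2 to consider parallelizing queries on this thread.
SET CURRENT DEGREE = 'ANY';

-- For static SQL, the equivalent is binding the plan or package with
-- DEGREE(ANY); the installation-wide cap on the parallel degree is set
-- via a subsystem parameter (so one query can't split without limit).

-- A large fact-table scan-and-aggregate like this is a good candidate
-- for CPU parallelism (order_fact is an invented table name):
SELECT region, SUM(order_amt) AS total_amt
  FROM order_fact
 GROUP BY region;
```

Check the degree DB2 actually chose via EXPLAIN output before and after, so you can see whether the trade of CPU "white space" for elapsed time is happening as intended.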
As mentioned, I'll cover SELECT statement analysis and tuning in my next post (or very soon thereafter, if I feel compelled to blog about something else between now and then).

Thursday, August 14, 2008

Some Interesting DB2 for z/OS Data Sharing Trends and Issues

Once again, I've let more time than usual pass since the last post to my blog, the cause (again) being a period of particular busyness on my part.

I've just finished teaching a DB2 data sharing implementation class for a large company in the health-care industry. The experience had me thinking a lot about the way data sharing worked, and how organizations used it, back when the functionality was introduced with DB2 Version 4 for OS/390 (predecessor to z/OS). I'll share some of those thoughts in this entry.

Prime motivation: from capacity to availability. When DB2 data sharing showed up in the mid-1990s, IBM had recently decided to shift from very expensive (and very powerful) bipolar chip sets in its mainframe "engines" to less expensive (and -- at first -- much slower) CMOS microprocessors (the same integrated circuit technology used in PCs and other so-called distributed systems servers). In order to support what were then the world's largest mainframe applications, big companies -- manufacturers, banks, brokerages, retailers, railroads, and package delivery firms, to name some of the represented industries -- had to harness the capacity of several of the new CMOS mainframes in a single-logical-image fashion. The mainframe cluster was (and is) called a parallel sysplex, and DB2 data sharing leveraged that shared-disk clustering architecture to enable multiple DB2 subsystems on multiple servers to concurrently access a single database in read/write mode.

Well, the CMOS-based processors in today's mainframes are way faster than the bipolar variety ever were. Not only that, but you can get way more of them in one server than you could back in the nineties -- and tons more memory, to boot: the current top of the mainframe line, the IBM z10, can be configured with up to 64 engines and 1.5 terabytes of central storage. Even for a huge DB2 data-serving workload, it's likely that few organizations would need processing capacity beyond what's available with one of these bad boys. Still, companies continue to implement parallel sysplexes and DB2 data sharing. Why? Availability, my friend. With a DB2 data sharing group, planned outages for DB2 maintenance (or z/OS or hardware maintenance) can be virtually eliminated, as you can apply software or hardware fixes with ZERO database downtime. DB2 data sharing on a parallel sysplex can also greatly reduce the impact of unplanned outages. A company with which I worked had a DB2 subsystem in a data sharing group fail (something that hadn't happened in a LONG time), and system users didn't even notice the failure: database access continued via the other DB2 subsystems in the group, while the failing DB2 member was automatically (and quickly) restarted by the operating system.

The primary motivation for implementing a DB2 data sharing group these days is the quest for ultra-high availability.

Plenty of DB2 subsystems in a group, but fewer hardware "footprints." A few years ago, it was not unusual to find three or more mainframes clustered in a parallel sysplex. With the processing capacity of individual mainframe "boxes" getting to be so large (through a combination of faster engines and more engines per server), organizations increasingly opt for two-mainframe sysplexes (also contributing to the decrease in hardware "footprints" within parallel sysplex configurations: the growing use of internal coupling facilities -- running in logical partitions within mainframe servers -- versus standalone external coupling facilities). Interestingly, as the number of physical boxes in companies' sysplexes has declined, the number of DB2 subsystems in the data sharing groups running on these parallel sysplexes has often stayed the same or even gone up. I know of an organization that runs a nine-way DB2 data sharing group on a two-mainframe parallel sysplex. Why so many? There are several reasons:
  • Having at least two DB2 subsystems per mainframe in the sysplex allows you to fully utilize the processing capacity of each mainframe even when you have a DB2 subsystem down for maintenance (recall that more and more organizations are using DB2 data sharing to enable the application of hardware and software maintenance without the need for a maintenance "window").
  • When a DB2 subsystem in a data sharing group fails, the X-type locks (aka modify locks) held by that subsystem at the time of the failure are retained until that subsystem can be restarted (usually automatically, either in-place or on another server in the parallel sysplex) to free them up (this is done to protect the integrity of the database). If the data sharing group has more members, the number of retained locks held by a given member in the event of a failure is likely to be smaller, reducing the impact of the failure event. Additionally, having the same workload spread across more members could speed up restart time for a failed member, as there might be somewhat less data change roll-forward and rollback work to do during restart.
  • Having more members in a data sharing group reduces the log-write load per member, as each member writes log records only for changes made by programs that execute on the member.
  • The cost of going from n-way to n+1-way data sharing (once you've gone to 2-way data sharing) is VERY small, so the overhead cost of having more DB2 subsystems in a data sharing group is typically pretty insignificant.
Binding programs with RELEASE(DEALLOCATE) is not the data sharing recommendation it once was. Prior to DB2 for z/OS Version 8, XES (cross-system extended services, the component of the operating system that handles interaction with coupling facility structures such as the global lock structure) would perceive that an IS lock on tablespace XYZ held by a program running on DB2A was incompatible with an IX lock requested for the same tablespace by a program running on DB2B, when in fact the two locks are compatible. The local lock managers associated with DB2A and DB2B (i.e., the IRLMs) would figure out that the two locks were not in conflict with each other, but only after some inter-system communication that drove up overhead. To avoid instances of such perceived-but-not-real global lock contention (called XES contention), people would bind programs with RELEASE(DEALLOCATE) to have them retain tablespace locks across commits. In a related move, people would seek to use transactional threads that persist through commits, such as CICS protected entry threads (batch threads automatically persist through commits, deallocating only at end-of-job).

DB2 Version 8 introduced a clever new global locking protocol that eliminates perceived IX-IS and IX-IX inter-system tablespace lock contention. One effect of this development is that the decision on whether to bind programs with RELEASE(DEALLOCATE) or RELEASE(COMMIT) is now pretty much unrelated to data sharing. Do what you'd do (or as you've done) in a non-data sharing DB2 environment.

The "double-failure" scenario is not so scary anymore. What I'm referring to here is the situation that can arise if the global lock structure used by a data sharing group is located in an internal coupling facility (ICF) that is part of a mainframe on which at least one DB2 member of the same data sharing group is running. If the whole box (the mainframe server) fails, both the lock structure and a related DB2 subsystem fail. The lock structure then can't be rebuilt, because the rebuild process needs information from all of the DB2 members in the group, and -- as mentioned -- at least one of those DB2 subsystems failed when the mainframe failed. Without a lock structure, the whole group will fail, and a group restart will be necessary to get everything back again.

Nowadays, with (as pointed out earlier) many organizations having fewer (though much more powerful) mainframe servers than they'd had some years ago, and with many companies strongly preferring to use ICFs (they are attractively priced versus standalone external CFs), folks are wanting to implement two-mainframe parallel sysplexes with an ICF in each mainframe. That set-up makes the "double-failure" scenario a possibility. Know what I say to folks leaning towards this configuration? I tell them to go ahead and to not sweat a "double failure," because 1) it's exceedingly unlikely (remember, the whole mainframe server would have to fail, and on top of that, it'd have to be the one with the ICF holding the lock structure), 2) even if it does happen, the group restart will clean everything up so that no committed database changes are lost, 3) group restart is a better-performing process than it was in the earlier years of data sharing, when processors were slower and the recovery code was less sophisticated than it is in current versions of DB2, and 4) given items 1-3 in this list, I can understand why organizations would balk at paying extra -- either in the form of an external CF or the overhead of system duplexing of the lock structure -- to avoid a situation that has a very low probability of occurrence and that does not (in my opinion) constitute a "disaster" even if it does occur.

So, the DB2 data sharing landscape has changed since DB2 Version 4. Here's something that hasn't changed: DB2 data sharing on a parallel sysplex is the most highly available and scalable data-serving platform on the market. It's great technology that just keeps getting better.