DB2 for z/OS Performance Management -- Data Warehouse vs. OLTP
Much of the consulting work I've done lately (and that I will be doing over the next several weeks) has been focused on DB2 for z/OS-based data warehouse performance tuning. As this subject is very much on my mind, I've decided to share some related thoughts in my blog.
Managing performance in a data warehouse environment is a different animal compared with managing performance for a transactional DB2 application (e.g., an order processing or online banking application), because these system types differ in several important ways:
- SQL statement run time: In some OLTP (online transaction processing) environments, more than a thousand database-accessing transactions are executed every second, on average, during peak processing hours. Each of these transactions will typically contain multiple SQL statements, each of which will usually complete within a small fraction of a second of wall-clock time (while consuming an even smaller fraction of a second of CPU time). In a data warehouse, some report-generating SELECT statements -- particularly of the month-end or quarter-end variety -- will run for several minutes, and maybe for an hour or more.
- Concurrent threads: In a busy OLTP system, there could be several hundred (perhaps a thousand or more) concurrently active DB2 threads. A data warehouse is more likely to have a smaller number of concurrently active threads -- fewer in-flight units of work, each remaining in flight for a longer period of time (as noted above).
- The "come-from" environment: Although stored procedures are an increasingly popular choice for mainframe DB2-based OLTP applications, it remains true that a whole lot of transaction-oriented SQL gets to DB2 for z/OS through local (to DB2) subsystems such as CICS and IMS/TM. In a data warehouse environment, it is much more likely that queries will get to DB2 (and associated result sets will be returned to requesters) via DB2 Connect and the DB2 for z/OS Distributed Data Facility (aka DDF).
- Database design: A DB2 database used for OLTP work will likely have a traditional third-normal-form (or close to it) design. Increasingly, DB2 data warehouse databases are dimensional in their design, with sets of related tables arranged logically in a star-schema fashion (a sketch of such a design follows this list).
- Update windows: For an OLTP application, there often is no window of time during which database updates are processed. Instead, database updates happen 24/7. For many data warehouses, virtually all data updates are processed during nightly -- and often massive -- ETL runs (ETL is short for extract/transform/load). Query access is often unavailable while the ETL process is running, making timely completion of said process (e.g., by 6:00 AM each weekday morning) a matter of considerable importance.
- Query result sets: These are generally rather small in an OLTP environment -- usually less than 100 rows, and sometimes only one or two rows. For a data warehouse, result sets that feed reports may contain many thousands of rows -- perhaps hundreds of thousands or more.
- Query complexity: SELECT statements in online transactions tend to be quite simple -- maybe one or two tables accessed, and little or no dynamic table-building or data value or data type transformation. Some SELECTs associated with a data warehouse might be more than a page long (if you print them out), often involving large-scale joins, on-the-fly table-building (via nested table or common table expressions), recursive SQL (great for navigating hierarchies of data), data value transformation (via CASE expressions), and data type transformation (via CAST specifications and/or scalar functions). An example of this kind of query follows this list.
- Response time versus throughput: Response time is, of course, not unimportant for OLTP applications, but the relatively simple nature of the transactions (plus indexes on the right DB2 table columns) often leads to sub-second run times. On top of that, a change in a transaction's elapsed time from, say, 0.85 seconds to 0.95 seconds (or the other way around) is likely to go unnoticed by system users. The focus of performance tuning efforts is often throughput: the number of transactions that can be run through the system in a given period of time. For a data warehouse, the aim of performance tuning tends to be reduced run times for individual queries (from an hour to under 10 minutes, for example, or from 10 minutes to less than one minute).
- SQL you have to deal with, versus SQL you wrote (or at least reviewed and approved): In a data warehouse environment, SQL might be generated by a reporting or OLAP tool (OLAP stands for online analytical processing), and a DB2 DBA might not be able to change it. Result: it has to be dealt with as-is. Furthermore, some SELECTs may be relatively ad-hoc in nature -- either generated in real time by a query tool or built dynamically by client-side code based on a user's selection of some combination of presented search criteria (along with ORDER BY and GROUP BY options). Not every column can be indexed, and it's not always easy to anticipate the predicates that will come in with queries. In the OLTP world, on the other hand, application code reviews can ferret out poorly written SQL statements (which can then be recoded) as well as provide information that can lead to smart indexing decisions.
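On the database design point: here is a minimal star-schema sketch. The table and column names (SALES_FACT, STORE_DIM, DATE_DIM, and so on) are hypothetical, made up for illustration rather than drawn from any real system:

    -- Dimension tables: relatively small and descriptive,
    -- one row per dimension member.
    CREATE TABLE DATE_DIM
      (DATE_KEY     INTEGER  NOT NULL,
       CALENDAR_DT  DATE     NOT NULL,
       FISCAL_QTR   CHAR(6)  NOT NULL);

    CREATE TABLE STORE_DIM
      (STORE_KEY    INTEGER      NOT NULL,
       STORE_NAME   VARCHAR(40)  NOT NULL,
       REGION       VARCHAR(20)  NOT NULL);

    -- Fact table: large, mostly numeric measures, with key
    -- columns pointing to the dimension tables (in practice,
    -- each key column would be backed by an index).
    CREATE TABLE SALES_FACT
      (DATE_KEY     INTEGER        NOT NULL,
       STORE_KEY    INTEGER        NOT NULL,
       UNITS_SOLD   INTEGER        NOT NULL,
       SALE_AMT     DECIMAL(11,2)  NOT NULL);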
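And on the query complexity point, here is the flavor of SELECT that a dimensional design tends to attract -- a sketch against the hypothetical tables above, not a statement from any real workload. A common table expression builds an interim result on the fly, a CASE expression transforms data values, and a CAST transforms a data type:

    -- Build a per-store summary on the fly, then join it
    -- to a dimension table for reporting.
    WITH STORE_TOTALS (STORE_KEY, TOTAL_SALES) AS
      (SELECT STORE_KEY, SUM(SALE_AMT)
       FROM SALES_FACT
       GROUP BY STORE_KEY)
    SELECT S.REGION,
           S.STORE_NAME,
           CASE WHEN T.TOTAL_SALES >= 1000000 THEN 'HIGH'
                WHEN T.TOTAL_SALES >=  100000 THEN 'MEDIUM'
                ELSE 'LOW'
           END AS VOLUME_BAND,                            -- data value transformation
           CAST(T.TOTAL_SALES AS INTEGER) AS SALES_WHOLE  -- data type transformation
    FROM STORE_TOTALS T
         INNER JOIN STORE_DIM S
            ON S.STORE_KEY = T.STORE_KEY
    ORDER BY S.REGION, T.TOTAL_SALES DESC;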
Given these differences, here are some thoughts on tuning DB2 for z/OS data warehouse performance:

- Memory -- In a previous post, I encouraged people to leverage the 64-bit real and virtual storage addressing capability that is relatively new on the DB2 scene (introduced with DB2 for z/OS V8). This is especially important for data warehouse workloads, which tend to be I/O intensive (lots of large index and table scans). Bigger buffer pools are almost always better, but be smart about growing them (also covered in the cited post) and know when they're big enough (I might consider putting the brakes on buffer pool enlargement if the rate of demand paging to auxiliary storage for the z/OS LPAR on which DB2 is running gets into the low single digits per second). A buffer pool command sketch follows this list.
- CPU efficiency -- Also in a previous post, I touted the CPU efficiency gains that can be realized through page-fixing DB2 buffer pools. This is a pretty easy change that can yield a decent return (not spectacular, but not bad, either). Also of major importance is the "page-look efficiency" of a DB2 for z/OS-based data warehouse system. By this I mean the ratio of GETPAGEs (requests by DB2 to look at index and tablespace pages) to the number of rows that are returned to requesters. Bring that ratio down (often through physical database design changes such as adding or altering indexes, re-clustering data rows, creating materialized query tables, etc.), and you've improved the CPU efficiency of your data warehouse workload (the CPU consumption of a DB2 workload is very much dependent on the level of GETPAGE activity in the system). A materialized query table sketch follows this list.
- Query parallelism -- If your data warehouse system has some CPU headroom (i.e., if it's not consistently running at a utilization rate of 90% or more), consider trading some of the CPU "white space" for improved response time -- especially for queries that scan LOTS of tablespace and/or index pages -- through query CPU parallelism (if you're not already using this DB2 feature). It's been around since Version 4 of the mainframe DB2 product (and so is very much field-tested), and it can work across multiple DB2 subsystems in a parallel sysplex if you're running DB2 in data sharing mode. Concerned about one query splitting a lot and swamping your system? Don't be. First of all, you can put a cap on the degree to which DB2 will parallelize a query (you could limit this degree to 5 or 10, for example). Second, I don't know of any operating system that is the equal of z/OS when it comes to dynamically shifting processor resources among tasks in response to workload shifts. If a query splits 20 ways and is taking up a large chunk of CPU capacity in a z/OS LPAR that otherwise would not be very busy, so what? That's what parallelism is about. If some work does come into the system while that 20-way-split query is humming along, z/OS will throttle back resources allocated to the split query and reallocate them to the newly-arrived tasks. And it does this very quickly. So don't be afraid to split. A sketch of the parallelism knobs follows this list.
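On the memory and page-fixing recommendations above: the knob for both is the -ALTER BUFFERPOOL command. A sketch, with a placeholder pool name (BP8) and size -- pick real values based on your LPAR's available real storage, and keep an eye on the demand paging rate as you grow a pool:

    -ALTER BUFFERPOOL(BP8) VPSIZE(200000)
    -ALTER BUFFERPOOL(BP8) PGFIX(YES)

The first command grows the pool to 200,000 buffers; the second requests page-fixing of the pool's buffers in real storage. Keep in mind that a PGFIX change takes effect the next time the pool is allocated (e.g., when DB2 is recycled), not immediately.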
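On materialized query tables as a GETPAGE-reducer: an MQT pre-computes and stores a query result (an expensive aggregation, say), so that DB2 can satisfy qualifying queries from the much smaller stored result instead of repeatedly scanning the base tables. A sketch, again using the hypothetical tables from earlier in this post:

    CREATE TABLE SALES_BY_REGION AS
      (SELECT S.REGION, SUM(F.SALE_AMT) AS TOTAL_SALES
       FROM SALES_FACT F
            INNER JOIN STORE_DIM S
               ON S.STORE_KEY = F.STORE_KEY
       GROUP BY S.REGION)
      DATA INITIALLY DEFERRED
      REFRESH DEFERRED
      MAINTAINED BY SYSTEM
      ENABLE QUERY OPTIMIZATION;

    -- Populate (and later re-populate) the MQT:
    REFRESH TABLE SALES_BY_REGION;

With ENABLE QUERY OPTIMIZATION in effect, DB2 can automatically rewrite qualifying dynamic queries to use the MQT (subject to the CURRENT REFRESH AGE special register), and the GETPAGE count for those queries drops accordingly.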
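And on turning parallelism on and capping the degree: for packages containing static SQL, parallelism is enabled via a bind option; for dynamic SQL, via the CURRENT DEGREE special register; and the degree can be capped subsystem-wide via the MAX DEGREE installation parameter (DSNZPARM PARAMDEG). A sketch, with a made-up package name:

    REBIND PACKAGE(DWCOLL.REPORTPKG) DEGREE(ANY)

    SET CURRENT DEGREE = 'ANY';

The REBIND makes the static SQL in the (hypothetical) DWCOLL.REPORTPKG package eligible for parallelism, and the SET statement does the same for dynamic SQL issued through a thread. Setting PARAMDEG to, say, 10 would cap the degree of parallelism for any one query at 10 (a value of 0 leaves the degree up to DB2).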