Robert's Blog

Tuesday, March 24, 2009

DB2 for z/OS: Overlooking Writes is Wrong

I recently got into an e-mail discussion about DB2 for z/OS I/O performance. The particular concern that sparked the discussion was the impact of synchronous remote disk mirroring on I/O response times ("synchronous remote disk mirroring" refers to a high-availability feature, provided by several vendors of enterprise-class disk subsystems, that keeps data on volumes at a local site in synch with volumes at another site, with the remote site typically being within 30-35 kilometers of the local site). The person who initiated the online exchange was focused on DB2 synchronous read performance, and for him I had a couple of nuggets of information: 1) synchronous remote disk mirroring will generally have little or no impact on DB2 read I/O performance, since only disk writes are replicated to the remote site; and 2) regardless of whether or not disk volumes are remotely mirrored, you don't want to base your assessment of a DB2 for z/OS subsystem's I/O performance solely on the read response times provided by your DB2 monitoring product. In the remainder of this blog post, I'm going to expand on that second point, because DB2 write I/Os do matter.

One type of DB2 write I/O that can obviously affect application performance is a write to the active log - this because it's a synchronous event with respect to commit processing (a commit operation cannot complete until associated records in the DB2 log buffers have been externalized to disk). The good news here is that you can easily check on an application's wait time due to log writes using a DB2 monitor (it's one of the fields that shows up among the "class 3 suspension" times in a DB2 monitor accounting report or online display of thread detail information). In most cases, this wait time will be quite small (it should be zero for a read-only transaction). DB2 active log data sets associated with high-volume production systems are generally placed on disk volumes fronted by non-volatile cache memory (non-volatile meaning that a battery backup will keep data that's been written to cache and not yet to spinning disk from being lost in the event of a power failure). When this is the case, the log write is considered to be complete, from a z/OS (and DB2) perspective, once the log records have been written to disk controller cache memory (they'll be asynchronously destaged to spinning disk by the disk controller at a later time). That's a very fast write. If you see that wait time due to log writes is more than a small percentage of total class 3 suspend time for an application process, it's possible that you have some device contention that needs to be resolved, perhaps by relocating some heavily-accessed data sets that might now be interfering with active log I/Os. Also, ensure that copy 1 and copy 2 of any given active log data set are not on the same disk volume (that's important for availability as well as performance).
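
To make the "small percentage of total class 3 suspend time" check concrete, here is a minimal sketch in Python. It assumes you have already pulled the log write wait time and the total class 3 suspension time (both in seconds) from your monitor's accounting report. The function names and the 5% cutoff are my own illustrative assumptions, not values from DB2 or any monitor product, so substitute whatever "small percentage" means in your shop.

```python
# Hypothetical helper names; the inputs are numbers you copy out of a
# DB2 monitor accounting report (class 3 suspension section).

ASSUMED_SHARE_THRESHOLD = 0.05  # assumption: treat "> 5%" as "more than a small percentage"


def suspension_share(component_seconds: float, total_class3_seconds: float) -> float:
    """Return one suspension component as a fraction of total class 3 time."""
    if component_seconds < 0 or total_class3_seconds < 0:
        raise ValueError("suspension times cannot be negative")
    if total_class3_seconds == 0:
        # No class 3 suspension at all, so nothing to attribute.
        return 0.0
    return component_seconds / total_class3_seconds


def log_write_wait_needs_attention(
    log_write_seconds: float,
    total_class3_seconds: float,
    threshold: float = ASSUMED_SHARE_THRESHOLD,
) -> bool:
    """True when log write wait exceeds the assumed share of class 3 time."""
    return suspension_share(log_write_seconds, total_class3_seconds) > threshold


if __name__ == "__main__":
    # Example figures are invented for illustration only.
    share = suspension_share(0.002, 0.050)
    print(f"log write share of class 3 time: {share:.1%}")
    print("needs attention:", log_write_wait_needs_attention(0.002, 0.050))
```

The same `suspension_share` calculation works for any other class 3 component. Only the numerator changes.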

What about the writing of updated pages to tablespaces and indexes on disk? Plenty of people think that these are a non-factor in terms of their effect on application performance. Folks think this way because the writes are deferred with respect to the actions (e.g., UPDATE, INSERT) that change the pages. The application processes that update DB2 data are not charged for the externalization of changed index and tablespace pages to disk (the DB2 database services address space, aka DBM1, bears this cost), and this leads people to suppose that "no one waits for a DB2 database write." Au contraire, mes amis. Application programs can indeed end up waiting for DB2 database writes, and this wait time is recorded in the "other write I/O" field in the "class 3 suspensions" section of a DB2 monitor accounting report or online display of thread detail information. Here's why you should not be surprised to see a non-zero value in this field: a DB2 application process cannot access a page in the buffer pool if said page is scheduled for write (i.e., if a write I/O operation that will externalize the page to disk is underway but has not yet completed). That write I/O operation could be related to DB2 checkpoint processing or to a buffer pool deferred write threshold being reached or to a DB2 data set being pseudo-closed, but in any case it can cause a delay in the execution of an application process that needs a page that is in the process of being externalized.

Now, "other write I/O" wait time will typically be a low value, since at any given time one would expect only a small percentage of the pages in the DB2 buffer pools to be scheduled for write (a tablespace or index page could of course be changed a number of times before being externalized to disk). If you do see a time for wait due to "other write I/O" that is more than a small percentage of overall class 3 suspension time for a DB2 application process, what could you do about it? For one thing, you could work to reduce contention within the disk subsystem. A good way to do that is to enlarge your DB2 buffer pool configuration (with more buffers, more page requests are satisfied from memory, so fewer read I/Os compete with writes for disk resources). If you can't do that (perhaps you don't have enough system memory on your z/OS system to support a larger buffer pool configuration), look at redistributing buffer space among your pools (i.e., allocate more buffers to pools with lots of I/O activity, and fewer to pools with less I/O activity). Also take a look at pseudo-close activity (check the number of data sets converted from read/write to read-only in the "open/close activity" section of a DB2 monitor statistics report or online display of subsystem activity). Pseudo-close is good for quicker restart in the event of a DB2 failure, but too much pseudo-close activity can mean a lot of page externalization. Personally, I like to see a number in the low double digits per minute for data sets converted from read/write to read-only during busy times. If you see a good bit more than that, consider raising the values of the ZPARM parameters PCLOSEN and PCLOSET somewhat.
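
The pseudo-close check above comes down to simple arithmetic: take the count of data sets converted from read/write to read-only in a statistics interval and divide by the interval length in minutes. Here is a minimal sketch in Python. The function names are hypothetical, and the cutoff of 20 per minute is only my stand-in for "low double digits", so adjust it to your own comfort level.

```python
# Hypothetical helpers; the inputs come from the "open/close activity"
# section of a DB2 monitor statistics report for a known interval.

ASSUMED_MAX_CONVERSIONS_PER_MINUTE = 20.0  # assumption standing in for "low double digits"


def pseudo_close_rate_per_minute(converted_rw_to_ro: int, interval_seconds: float) -> float:
    """Data sets converted from read/write to read-only, per minute."""
    if converted_rw_to_ro < 0:
        raise ValueError("conversion count cannot be negative")
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return converted_rw_to_ro / (interval_seconds / 60.0)


def pseudo_close_looks_high(
    converted_rw_to_ro: int,
    interval_seconds: float,
    limit_per_minute: float = ASSUMED_MAX_CONVERSIONS_PER_MINUTE,
) -> bool:
    """True when the conversion rate exceeds the assumed per-minute limit."""
    return pseudo_close_rate_per_minute(converted_rw_to_ro, interval_seconds) > limit_per_minute


if __name__ == "__main__":
    # Invented example: 1800 conversions reported for a 30-minute interval.
    rate = pseudo_close_rate_per_minute(1800, 30 * 60)
    print(f"{rate:.0f} conversions per minute")
    print("consider raising PCLOSEN/PCLOSET:", pseudo_close_looks_high(1800, 30 * 60))
```

Run the check against statistics from a busy period, since that is when the rate matters. A quiet overnight interval will understate it.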

If you need to, consider moving some data sets around within the disk subsystem so as to relieve I/O "hot spots." I don't like to get into "hand placement" of DB2 data sets on disk, preferring instead to let the operating system handle this (I like to define STOGROUPs with a VOLUMES specification of '*'), but sometimes manual intervention is needed.

DB2 synchronous read wait time gets the lion's share of attention when it comes to analyzing DB2 for z/OS I/O performance, and that's as it should be - this often accounts for a big chunk of overall in-DB2 time for an application process. That said, one should recognize that DB2 application programs can end up waiting for write I/Os to complete, too. Just keep an eye on the write-related suspension times (for log writes and for database writes, keeping in mind that the latter are often labeled as "other write I/O" by DB2 monitor products), and be prepared to act if these numbers are more than a small percentage of total class 3 suspend time for an application process.

