Robert's Blog


Thursday, December 10, 2009

DB2 for z/OS and the Disk Subsystem

Earlier this week, we in the Atlanta DB2 Users Group were treated to a day with Roger Miller, the ebullient IBMer who, more than anyone else, has been the "face" of the DB2 for z/OS development organization since the product debuted more than 25 years ago (Roger spends a great deal of time in front of DB2 users -- at customer sites, at the briefing center at IBM's Silicon Valley Lab, and at user group meetings). Roger provided us with six hours of DB2 information, laced with a little bit of Shakespeare (appropriate, as Roger's presentation style certainly brings to mind the line, "All the world's a stage," from Shakespeare's As You Like It). Some of the material that I found to be particularly interesting had to do with advances in disk I/O technology over the years, and the effect of those enhancements on DB2-related I/O performance. In this post, I'll share some of that information with you, and I'll include a few of my own observations on DB2 and the disk subsystem, based on my work with DB2 for z/OS-using organizations.

Faster and faster. Roger talked about the major gains seen in mainframe processor performance over the past 10 years (the "clock rate" of the engines in IBM's top-of-the-line z10 mainframe server is 4.4 GHz, versus 550 MHz for the G6 line in 1999), and he went on to point out that improvements in disk I/O performance have been just as dramatic. As I noted in a blog entry on DB2 I/O performance that I posted last year, a target of 20-30 milliseconds of wait time per synchronous read (i.e., for an on-demand read of a single 4K DB2 page into memory from the disk subsystem) was pretty much the norm well into the 1990s (this figure is usually obtained from a DB2 monitor accounting report or an online display of DB2 accounting data). Nowadays, 2 milliseconds of wait time per synchronous read is fairly common. There are a number of factors behind this order-of-magnitude improvement in disk I/O performance, none more important than the large increase in disk controller cache memory sizes since the mid-90s (32 MB of cache once seemed like a lot in the old days -- now you can get more than 300 GB of cache on a control unit), and the development of sophisticated algorithms to optimize the effective use of the cache resource (enterprise-class disk controllers run these algorithms on multiple high-performance microprocessors).

The cache impact on disk I/O performance was underscored by some numbers that Roger shared with our group: while the time required to retrieve a DB2 page from spinning disk will generally be between 4 and 8 milliseconds (a big improvement versus 1990s-era disk subsystems), a disk controller cache hit results in a wait time of 230 to 290 microseconds for a synchronous read (the low end of this range is for a system with the z10 High-Performance FICON I/O architecture, also known as zHPF).

For DB2 prefetch reads, the gains are even more impressive. A decade or so ago, reading 64 4K pages into a DB2 buffer pool in 90 milliseconds was thought to be good performance. These days, it's possible to get 64 pages into memory in 1.5 milliseconds -- a 60X improvement over the old standard.

Looking beyond cache, a fairly recent arrival on the enterprise storage scene is solid-state disk technology (SSD). Actually, SSD itself isn't all that new -- devices of this type have been in use for about 30 years. What's new is the cost-competitiveness of enterprise-class SSD systems -- still several times more expensive than a traditional spinning-disk system, on a cost-per-gigabyte basis, but close enough now to warrant serious consideration for certain performance-critical database objects that tend to be accessed in a random fashion (as opposed to a sequential processing pattern). Random access is the SSD sweet spot because the technology eliminates the seek time that elongates random I/O service times when pages are read from spinning disk; thus, the wait time for a DB2 synchronous read from an SSD device will likely be about 740-840 microseconds, versus the aforementioned 4-8 milliseconds for a read from spinning disk.

64-bit addressing means that DB2 buffer pools can be way bigger than before, and that's good for performance because no I/O operation -- even one that results in a cache hit -- will come close to the speed with which a page in memory can be accessed. That said, a large-scale DB2 application accessing a multi-terabyte database is likely to drive a lot of I/O activity, even if you have a really big buffer pool configuration. Advanced disk storage and I/O subsystem technology make those page reads much less of a drag on application performance than they otherwise would be.

Disk space utilization: don't overdo it. High-performance storage systems cost plenty of money, and people who purchase the devices don't want to have a lot of unused disk capacity on the floor. Some, however, push the target space-utilization threshold too far. See, a storage system isn't just about application performance -- it's also about application availability. Fill your disk volumes too full, and you're asking for DB2 data set space allocation failures (and associated application outages). How full is too full? I can tell you that in my own experience, DB2 data set space allocation failures tend to be a problem when an organization goes for a disk space utilization rate of 90% or more (and keep in mind that we're not just talking about data set extensions related to data-load operations -- we're also talking about utility-related disk space usage for things like sort work files and online REORG shadow data sets). At the other end of the spectrum, I've seen a situation in which an organization set a 60% threshold for disk volume space utilization. This company's DBAs liked that policy a lot (it really cut down on middle-of-the-night rousting of whoever was on-call), but I can't say that I'd advocate such a target -- it seems too low from a cost-efficiency perspective.

Where do I stand? I'm pretty comfortable with a disk space utilization target in the 70-80% range. I might lean towards the lower end of that range in a DB2 Version 9 environment, as partition-level online REORG jobs will reorganize non-partitioned indexes in their entirety, thereby necessitating more shadow data set space (a potentially compensating factor would be DB2 9 index compression, which can significantly reduce disk space requirements for index data sets). Something else to keep in mind: in addition to avoiding overly high utilization of disk space, another good practice for minimizing DB2 data set space allocation failures is to let DB2 determine the amount of space that will be requested when a data set has to be extended. This capability, sometimes called sliding-scale space allocation, was introduced with DB2 for z/OS Version 8. I blogged about it a few weeks ago, and I highly recommend its usage. People who have taken advantage of this functionality have expressed great satisfaction with the results: much less DBA time spent managing DB2 data set allocation processing, and a sharp drop in the occurrence of DB2 space allocation-related failures. Check it out.
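To make the sliding-scale point concrete, here's a sketch of the kind of DDL involved (the database, storage group, and tablespace names below are made up for illustration):

```sql
-- Hypothetical object names. Coding -1 for PRIQTY and SECQTY tells DB2
-- to calculate the primary quantity and the secondary extent sizes
-- itself, using the sliding-scale algorithm delivered in DB2 for z/OS
-- Version 8 (extents start small and grow as the data set grows).
CREATE TABLESPACE TSORDERS
  IN DBSALES
  USING STOGROUP SGPROD
  PRIQTY -1       -- let DB2 choose the primary allocation
  SECQTY -1       -- sliding-scale secondary allocation
  BUFFERPOOL BP1;
```

With the MGEXTSZ subsystem parameter set to YES, DB2 will apply sliding-scale sizing to secondary extents even for data sets whose SECQTY was explicitly coded, so you don't necessarily have to alter every existing object to benefit.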

4 Comments:

Anonymous said...

What do you think about the various page sizes for tablespaces? I've seen a few things about index page sizes but can't find much for tablespaces except that you can choose 4K, 8K, 16K or 32K. Any guidelines for when to go with the larger sizes? Thanks!

February 10, 2010 at 7:51 AM  
Robert Catterall said...

Generally speaking, I recommend going with the smallest page size that will accommodate a table's row length. So, if a tablespace will hold a table with 1000-byte rows, I'd go for a 4K page size. If the tablespace will hold a table with 7000-byte rows, I'd go for an 8K page size. Going with the smallest page that will accommodate a table's row length will usually allow for the best utilization of buffer pool, CPU, and disk subsystem resources.
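In DDL terms (object names here are made up for illustration), a tablespace's page size in DB2 for z/OS is implied by the buffer pool it's assigned to -- BP0 through BP49 are 4K pools, BP8K0-BP8K9 are 8K, BP16K0-BP16K9 are 16K, and BP32K-BP32K9 are 32K. So for the 7000-byte-row table mentioned above, you'd do something like:

```sql
-- Hypothetical names. Assigning the tablespace to an 8K buffer pool
-- gives it an 8K page size, so each page can hold one 7000-byte row
-- with far less wasted space than a 4K page would leave.
CREATE TABLESPACE TSWIDE
  IN DBSALES
  BUFFERPOOL BP8K0;   -- 8K buffer pool => 8K pages
```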

There are some other considerations around tablespace page size selection. A good reference in this regard is the DB2 for z/OS Administration Guide. The section in the Admin Guide on tablespace page sizing can be found in the DB2 Information Center at http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/topic/com.ibm.db29.doc.admin/db2z_pagesizerecommendations.htm

Hope this helps.

Robert

February 10, 2010 at 11:22 AM  
The nation said...

Keeping the page size low may consume more storage if a multiple of the row size doesn't come close to the page size. In some cases the I/O and CPU cycles will increase. Case-by-case design is the best way of doing it. Large page sizes are good for sequential scans where index access is limited.

February 27, 2010 at 3:34 PM  
Robert Catterall said...

Your point is well taken. Indeed, a 4K page size is not a very good choice if the row length is, say, 2500 bytes, because then you'll get just one row in each page and the rest of the space in the page (almost 40%) will be wasted (and the same is true for a compressed tablespace if the average length of a compressed row is 2500 bytes). In that case, if the page size is 8K then only 500 bytes (a little over 6%) per page will be wasted.

This is why I began my response to the previous comment with "generally speaking." There are exceptions to almost any rule of thumb. It comes down to the time you're willing (or able) to invest in optimizing resource utilization. The "case-by-case" approach is time-consuming, but it yields CPU and storage space efficiencies (and some elapsed time benefits, as well). If an investment of less analysis time (as would be the case if you went with general rules of thumb) yields resource efficiency and performance results that are "good enough" (as determined by your organization), it may be that the time saved could be invested in some other activity (maybe involving application design or new DB2 feature exploitation) that would deliver greater value to the organization. Ultimately, you do what makes sense in your situation.

February 27, 2010 at 7:27 PM  
