Robert's Blog

Friday, March 5, 2010

Another Note on DB2 for z/OS Buffer Pool Page-Fixing

In the summer of 2008, I posted a blog entry on page-fixing DB2 buffer pools, a feature introduced with DB2 for z/OS Version 8. A recent discussion I had with a client about buffer pool page-fixing brought to light two aspects of this performance tuning option that, I believe, are overlooked by some DB2 users. In this post I'll describe how you can make a quick initial assessment as to whether or not the memory resource of a mainframe system is sufficient to support buffer pool page-fixing, and I'll follow that with a look at the "bonus" performance impact that can be realized by buffer pool page-fixing in a DB2 data sharing environment.

Gauging the server memory situation. As pointed out in the aforementioned 2008 blog entry on the topic, page-fixing a buffer pool can reduce CPU consumption by eliminating the requests that DB2 would otherwise have to make of z/OS to fix in memory -- and to subsequently release -- a buffer for every read of a page from, or write of a page to, the disk subsystem. These page fix/page release operations are individually inexpensive, but the cumulative CPU cost can be significant when the I/Os associated with a pool number in the hundreds (or thousands) per second. The prospect of removing that portion of a DB2 workload's CPU utilization may have you thinking, "Why not?" Well, there's a reason why PGFIX(NO) is the default setting for a DB2 buffer pool, and it has to do with utilization of a mainframe server's (or z/OS LPAR's) memory resource.

With PGFIX(NO), the real storage page frames occupied by DB2 buffers are candidates for being stolen by z/OS, should the need arise. If something has to be read into memory from disk, and there is no available page frame to accommodate that read-in, z/OS will make one available by moving its contents to a page data set on auxiliary storage (if that relocated page is subsequently referenced by a process, it will be brought back into server memory from auxiliary storage -- this is known as demand paging). z/OS steals page frames according to a least-recently-used algorithm: the longer a page frame goes without being referenced, the closer it moves to the front of the steak queue. If a DB2 buffer goes a long time without being referenced, it could be paged out to auxiliary storage.

So, page-fixing a buffer pool in memory would preclude z/OS from considering the associated real storage page frames as candidates for stealing. The important question, then, is this: would some of those pages be stolen by z/OS if they weren't fixed in memory from the get-go? If so, then page-fixing that pool's buffers might not be such a great idea: in taking away some page frames that z/OS might otherwise steal, buffer pool page fixing could cause page-steal activity to increase for other subsystems and application processes in the z/OS LPAR. Not good.

Fortunately, there's a pretty easy way to get a feel for this: using either your DB2 monitor (an online display or a statistics report) or the output of the DB2 command -DISPLAY BUFFERPOOL DETAIL, look for fields labeled "PAGE-INS REQUIRED FOR READ" and "PAGE-INS REQUIRED FOR WRITE" (or something similar to that). What these fields mean: a page-in is required for a read if DB2 wants to read a page from disk into a particular buffer, and that buffer has been paged out to auxiliary storage (i.e., the page frame occupied by the buffer was stolen by z/OS). Similarly, a page-in is required for a write if DB2 needs to write the contents of a buffer to disk and the buffer is in auxiliary storage.

If, for a pool, the PAGE-INS REQUIRED FOR READ and PAGE-INS REQUIRED FOR WRITE fields both contain zeros, it is likely that the pool, from a memory perspective, is "V=R" anyway (that is to say, the amount of real storage occupied by the pool is probably very close to, if not the same as, its size in terms of virtual storage). In that case, going with PGFIX(YES) should deliver CPU savings without increasing pressure on the server memory resource, since the page frames being stolen are probably not those that are occupied by that pool's buffers. If you want an added measure of assurance on this score, issue a -DISPLAY BUFFERPOOL DETAIL(*) command. The (*) following the DETAIL keyword tells DB2 that you want statistics for the pool since the time it was last allocated. That might have been days, or even weeks, ago (the command output will tell you this), and if you see that the "PAGE-INS REQ" fields in the read and write parts of the command output contain zeros for that long period of time, it's a REALLY good bet that the pool's occupation of real storage won't increase appreciably if you go with PGFIX(YES). For even MORE assurance that the memory resource of the z/OS LPAR in which DB2 is running is not under a lot of pressure, check the "PAGE-INS REQUIRED" numbers for the lower-activity pools (those with fewer GETPAGE requests than others). If even these show zeros, you should be in really good shape, memory-wise.

With all this said, keep a couple of things in mind. First, even though your "PAGE-INS REQUIRED" numbers may give you a high degree of confidence that going to PGFIX(YES) for a buffer pool would be a good idea, make sure to coordinate this action with your z/OS systems programmer. That person has responsibility for seeing that z/OS system resources (such as server memory) are effectively managed and utilized, and you need to make sure that the two of you are on the same page (no pun intended) regarding buffer pool page-fixing. If you've done your homework, and you let the z/OS systems programmer do his (or her) homework (such as looking at z/OS monitor-generated system paging statistics), getting to agreement should not be a problem. Second, be selective in your use of the PGFIX(YES) buffer pool option. The greater the amount of I/O activity for a pool, the greater the benefit of PGFIX(YES). I'd recommend considering page-fixing for pools for which the rate of disk I/O activity is at least in the high double digits (writes plus reads) per second (and be sure to include prefetch reads when calculating the rate of disk I/O operations for a buffer pool). By staying with PGFIX(NO) for your lower-activity pools, you ensure that DB2 will make some buffer pool-associated page frames available to z/OS for page-out, should something cause the LPAR's memory resource to come under significant pressure.

And for you data sharing users... Just a couple of weeks ago, someone told me that he was under the impression that page-fixing buffer pools would have a negative performance impact in a DB2 data sharing environment. NOT SO. Assuming (as mentioned above) that your server memory resource is sufficient to accommodate page-fixing for one or more of your buffer pools, the resulting CPU efficiency benefit should be MORE pronounced for in a data sharing group versus a standalone DB2 system. How so? Simple: the buffer pool page fix/page release activity that occurs for DB2 reads to, and writes from, the disk subsystem with PGFIX(NO) in effect also occurs for writes of pages to, and reads of pages from, coupling facility group buffer pools. Like disk I/Os, page read and write actions involving a group buffer pool can number in the thousands per second. PGFIX(YES) eliminates the overhead of page fix/page release requests for disk I/Os AND for group buffer pool page reads and writes. So, if you're running DB2 in a data sharing configuration, you have another incentive to check out the page-fix option for your high-use buffer pools.


Blogger Robert Catterall said...

A friend of mine in IBM's DB2 for z/OS development organization rightly pointed out to me that in gauging the potential for CPU savings from a switch to PGFIX(YES) for a buffer pool, you should check the number of PAGES per second read into, and written out from, the buffer pool -- not the number of read and write I/O requests, as I suggested in my blog entry. What I had neglected to consider is the fact that, for a multi-page I/O operation (e.g., a prefetch read or a multi-page write), DB2 has to request the buffer fix and release actions FOR EACH PAGE INVOLVED IN THE I/O OPERATION. So, if 32 pages were read into a PGFIX(NO) buffer pool by way of a dynamic prefetch read, DB2 would have to make 32 requests of z/OS to fix in memory and then subsequently release the buffers to which the 32 pages brought in from disk are assigned. Thus it is that PGFIX(YES) can have a particularly beneficial effect for pools that have high levels of prefetch I/O activity.

March 11, 2010 at 7:10 PM  

Post a Comment

Subscribe to Post Comments [Atom]

<< Home