A Couple of Notes on DB2 Group Buffer Pools
I have recently done some work related to DB2 for z/OS data sharing, and that has me wanting to share with you a couple of items of information concerning group buffer pools (coupling facility structures used to cache changed pages of tablespaces and indexes that are targets of inter-DB2 read/write interest). First I'll provide some thoughts on group buffer pool sizing. After that, I'll get into the connection between local buffer pool page-fixing and group buffer pool read and write activity. [Lingo alert: GBP is short for group buffer pool, and "GBP-dependent" basically means that there is inter-DB2 read/write interest in a page set (i.e., a tablespace, an index, or a partition of either).]
How do you know if bigger is better? A lot of folks know that a group buffer pool should be at least large enough to prevent directory entry reclaims (reclaims are basically "steals" of in-use GBP directory entries to accommodate registration of pages of GBP-dependent page sets that have just been cached in members' local buffer pools, and you want to avoid them because they result in invalidation of "clean" pages cached in local buffer pools). The key to avoiding directory entry reclaims is to have enough directory entries in a GBP to register all the different pages that could be cached in the GBP and in the associated local buffer pools at any one time (you also want to make sure that there are no GBP write failures due to lack of storage, but there won't be if the GBPs are large enough to prevent directory entry reclaims). For a GBP associated with a 4K buffer pool, and with the default 5:1 ratio of directory entries to data entries, sizing to prevent directory entry reclaims is pretty simple: you add up the sizes of the associated local buffer pools across the members and divide that figure by three to get your minimum group buffer pool size. So, if there are two members in a data sharing group, and if BP1 has 6000 buffers on each member, directory entry reclaims will not occur if the size of GBP1 is at least 16,000 KB (the size of BP1 on each of the two DB2 members is 6000 X 4 KB = 24,000 KB, so the GBP1 size should be at least (2 X 24,000 KB) / 3, which is 16,000 KB). Let's say that your GBPs are all large enough to prevent directory entry reclaims (you can check on this via the output of the DB2 command -DISPLAY GROUPBUFFERPOOL GDETAIL). If you have enough memory in your coupling facility LPARs to make them larger still, should you? And if you do enlarge them, how do you know if you've done any good?
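(As a side note on the sizing arithmetic itself, here is a minimal sketch -- plain Python, with the hypothetical two-member, 6000-buffer BP1 example above hard-coded -- of the "sum the local pools and divide by three" rule for a 4K pool with the default 5:1 directory-to-data ratio. The function name and values are mine, for illustration only.)

PAGE_SIZE_KB = 4   # 4K buffer pool

def min_gbp_size_kb(member_buffer_counts):
    # Sum the members' local pool sizes (in KB) and divide by three --
    # the reclaim-avoidance rule of thumb described above.
    total_local_kb = sum(n * PAGE_SIZE_KB for n in member_buffer_counts)
    return total_local_kb / 3

# Two members, each with 6000 BP1 buffers (24,000 KB per member):
print(min_gbp_size_kb([6000, 6000]))   # 16000.0 KB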
Start by checking on the success rate for GBP reads caused by buffer invalidations (when a local buffer of DB2 member X holds a table or index page that is changed by a process running on DB2 member Y, the buffer in member X's local pool will be marked invalid and a subsequent request for that page will cause member X to request the current version of the page, first from the GBP and then, in case of a "not found" result, from the disk subsystem). Information about these GBP reads can be found in a DB2 monitor report or online display of GBP information, or in the output of a -DISPLAY GROUPBUFFERPOOL MDETAIL command. In a DB2 monitor report the fields of interest may be labeled as follows (field names can vary slightly from one monitor product to another -- note that "XI" is short for "cross-invalidation," which refers to buffer invalidation operations):
GROUP BP1..........................QUANTITY
---------------------------........--------
SYN.READS(XI)-DATA RETURNED............8000
SYN.READS(XI)-NO DATA RETURN...........2000
In -DISPLAY GROUPBUFFERPOOL MDETAIL output, you'd be looking for this:
DSNB773I - MEMBER DETAIL STATISTICS
.............SYNCHRONOUS READS
...............DUE TO BUFFER INVALIDATION
.................DATA RETURNED..................= 8000
.................DATA NOT RETURNED..............= 2000
The success rate, or "hit rate," for these GBP reads would be:
(reads with data returned) / ((reads with data returned) + (reads with data not returned))
Using the numbers from the example output above, the success rate for GBP reads due to buffer invalidation would be 8000 / (8000 + 2000) = 80%.
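(If you'd like to compute that from the raw counter values yourself, here is a trivial sketch in Python; the function name is mine, and the counts are the hypothetical ones from the example output above.)

def xi_read_hit_rate(data_returned, data_not_returned):
    # Hit rate for synchronous GBP reads driven by cross-invalidation (XI)
    return data_returned / (data_returned + data_not_returned)

print(xi_read_hit_rate(8000, 2000))   # 0.8 -- an 80% hit rate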
Here's why this ratio is useful: buffer invalidations occur when a GBP directory entry pointing to a buffer is reclaimed (not good, as previously mentioned), or when a page cached locally in one DB2 member's buffer pool is changed by a process running on another member of the data sharing group (these invalidations are good, in that they are required for the preservation of data coherency in a data sharing environment). If you don't have any buffer invalidations resulting from directory entry reclaims, invalidations are occurring because of page update activity. Because updated pages of GBP-dependent page sets are written to the associated GBP as part of commit processing, a DB2 member looking for an updated page in a GBP should have a reasonably good shot at finding it there, if the GBP is large enough to provide a decent page residency time.
So, if you make a GBP bigger and you see that the hit ratio for GBP reads due to buffer invalidation has gone up for the member DB2 subsystems, you've probably helped yourself out, performance-wise, because GBP checks for current versions of updated pages are more often resulting in "page found" situations. Getting a page from disk is fast, but getting it from the GBP is about two orders of magnitude faster (three orders of magnitude faster if the alternative is reading the page from spinning disk rather than from disk controller cache).
By the way, the hit ratio for GBP reads due to "page not in buffer pool" (labeled as such in -DISPLAY GROUPBUFFERPOOL MDETAIL output, and as something like SYN.READS(NF) in a DB2 monitor report or display) is not so useful for gauging the effect of a GBP size increase. These numbers reflect GBP reads that occur when a DB2 member is looking in the GBP for a page that it needs and does not have in a local buffer pool. That check has to be done before requesting the page from disk if the target page set is GBP-dependent, but a GBP "hit" for such a read is, generally speaking, not very likely.
One more thing: if you make a GBP bigger and you are duplexing your GBPs (and I hope that you are), be sure to enlarge the secondary GBP along with the primary GBP. If you aren't duplexing your GBPs (and why is that?), make sure that all your structures can still fit in one CF LPAR (in a two-CF configuration) after the target GBP has been made larger.
Buffer pool page-fixing: good for more than disk I/Os. Buffer pool page-fixing, introduced with DB2 for z/OS V8, is one of my favorite recent DB2 enhancements (I blogged about it in an entry posted in 2008). People tend to think of the performance benefit of buffer pool page-fixing as it relates to disk I/O activity. That benefit is definitely there, but so is the benefit -- and this is what lots of people don't think about -- associated with GBP read and write activity. See, every time DB2 writes a page to a GBP or reads a page from a GBP, the local buffer involved in the operation must be fixed in server memory (aka central storage). If the buffer is in a pool for which PGFIX(YES) has been specified, that's already been done; otherwise, DB2 has to ask z/OS to fix the buffer in memory for the duration of the GBP read or write operation and then to un-fix it afterwards. A single "fix" or "un-fix" request is inexpensive, CPU-wise, but there can be hundreds of page reads and writes per second for a GBP, and the cumulative cost of all that buffer fixing and un-fixing can end up being rather significant. So, if you are running DB2 in data sharing mode and you aren't yet taking advantage of buffer pool page-fixing, now you have another reason to give it serious consideration.
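(If you want to pursue it: page-fixing is specified at the individual buffer pool level -- for example, -ALTER BUFFERPOOL(BP1) PGFIX(YES) -- and keep in mind that a change to PGFIX takes effect the next time the pool is allocated, so plan for that.)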