Robert's Blog


Monday, February 22, 2010

A Couple of Notes on DB2 Group Buffer Pools

I have recently done some work related to DB2 for z/OS data sharing, and that has me wanting to share with you a couple of items of information concerning group buffer pools (coupling facility structures used to cache changed pages of tablespaces and indexes that are targets of inter-DB2 read/write interest). First I'll provide some thoughts on group buffer pool sizing. After that, I'll get into the connection between local buffer pool page-fixing and group buffer pool read and write activity. [Lingo alert: GBP is short for group buffer pool, and "GBP-dependent" basically means that there is inter-DB2 read/write interest in a page set (i.e., a tablespace, an index, or a partition of one of these).]

How do you know if bigger is better? A lot of folks know that a group buffer pool should be at least large enough to prevent directory entry reclaims (reclaims are basically "steals" of in-use GBP directory entries to accommodate the registration of pages newly cached in local buffer pools for GBP-dependent page sets, and you want to avoid them because they result in the invalidation of "clean" pages cached in local buffer pools). The key to avoiding directory entry reclaims is to have enough directory entries in a GBP to register all the different pages that could be cached in the GBP and in the associated local buffer pools at any one time (you also want to make sure that there are no GBP write failures due to lack of storage, but there won't be if the GBPs are large enough to prevent directory entry reclaims). For a GBP associated with a 4K buffer pool, and with the default 5:1 ratio of directory entries to data entries, sizing to prevent directory entry reclaims is pretty simple: add up the sizes of the associated local pools and divide that figure by three to get your minimum group buffer pool size. So, if there are two members in a data sharing group, and BP1 has 6000 buffers on each member, directory entry reclaims will not occur if the size of GBP1 is at least 16,000 KB (the size of BP1 on each of the two DB2 members is 6000 X 4 KB = 24,000 KB, so the GBP1 size should be at least (2 X 24,000 KB) / 3, which is 16,000 KB). Let's say that your GBPs are all large enough to prevent directory entry reclaims (you can check on this via the output of the DB2 command -DISPLAY GROUPBUFFERPOOL GDETAIL). If you have enough memory in your coupling facility LPARs to make them larger still, should you? And if you do enlarge them, how do you know whether you've done any good?
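To make that arithmetic concrete, here is a minimal sketch (in Python, used purely as a calculator) of the sizing rule of thumb just described, assuming 4K buffers and the default 5:1 ratio of directory entries to data entries; the buffer counts are simply the example values from the paragraph above.

# Rough minimum-GBP-size check for a 4K buffer pool with the default
# 5:1 directory-to-data-entry ratio, per the rule of thumb above.
# The buffer counts below are just the example values from this post.

def min_gbp_size_kb(local_pool_buffers, page_size_kb=4):
    """Minimum GBP size (KB) to avoid directory entry reclaims:
    the sum of the associated local pool sizes, divided by three."""
    total_local_kb = sum(n * page_size_kb for n in local_pool_buffers)
    return total_local_kb / 3

# Two-member data sharing group, BP1 = 6000 buffers on each member:
print(min_gbp_size_kb([6000, 6000]))   # -> 16000.0 KB, matching the example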

Start by checking on the success rate for GBP reads caused by buffer invalidations (when a local buffer of DB2 member X holds a table or index page that is changed by a process running on DB2 member Y, the buffer in member X's local pool will be marked invalid and a subsequent request for that page will cause member X to request the current version of the page, first from the GBP and then, in case of a "not found" result, from the disk subsystem). Information about these GBP reads can be found in a DB2 monitor report or online display of GBP information, or in the output of a -DISPLAY GROUPBUFFERPOOL MDETAIL command. In a DB2 monitor report the fields of interest may be labeled as follows (field names can vary slightly from one monitor product to another -- note that "XI" is short for "cross-invalidation," which refers to buffer invalidation operations):

GROUP BP1..........................QUANTITY
---------------------------........--------
SYN.READS(XI)-DATA RETURNED............8000
SYN.READS(XI)-NO DATA RETURN...........2000

In -DISPLAY GROUPBUFFERPOOL MDETAIL output, you'd be looking for this:

DSNB773I - MEMBER DETAIL STATISTICS
.............SYNCHRONOUS READS
...............DUE TO BUFFER INVALIDATION
.................DATA RETURNED..................= 8000
.................DATA NOT RETURNED..............= 2000

The success rate, or "hit rate," for these GBP reads would be:

(reads with data returned) / ((reads with data returned) + (reads with data not returned))

Using the numbers from the example output above, the success rate for GBP reads due to buffer invalidation would be 8000 / (8000 + 2000) = 80%.
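If it helps to see that calculation spelled out, here is a quick Python sketch of the same hit rate computation, using the example counter values shown above (the function and parameter names are just illustrative, not monitor field names).

# Hit rate for synchronous GBP reads driven by buffer invalidation
# (the "XI" reads), computed from the two counters shown above.

def xi_read_hit_rate(data_returned, data_not_returned):
    """(reads with data returned) / (all XI-driven synchronous GBP reads)."""
    total = data_returned + data_not_returned
    return data_returned / total if total else 0.0

print(xi_read_hit_rate(8000, 2000))   # -> 0.8, i.e. the 80% hit rate from the example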

Here's why this ratio is useful: buffer invalidations occur when a GBP directory entry pointing to a buffer is reclaimed (not good, as previously mentioned), or when a page cached locally in one DB2 member's buffer pool is changed by a process running on another member of the data sharing group (these invalidations are good, in that they are required for the preservation of data coherency in a data sharing environment). If you don't have any buffer invalidations resulting from directory entry reclaims, invalidations are occurring because of page update activity. Because updated pages of GBP-dependent page sets are written to the associated GBP as part of commit processing, a DB2 member looking for an updated page in a GBP should have a reasonably good shot at finding it there, if the GBP is large enough to provide a decent page residency time.

So, if you make a GBP bigger and you see that the hit ratio for GBP reads due to buffer invalidation has gone up for the member DB2 subsystems, you've probably helped yourself out, performance-wise, because GBP checks for current versions of updated pages are more often resulting in "page found" situations. Getting a page from disk is fast, but getting it from the GBP is about two orders of magnitude faster than a read satisfied from disk controller cache (and about three orders of magnitude faster than a read that has to go to spinning disk).

By the way, the hit ratio for GBP reads due to "page not in buffer pool" (labeled as such in -DISPLAY GROUPBUFFERPOOL MDETAIL output, and as something like SYN.READS(NF) in a DB2 monitor report or display) is not so useful in terms of gauging the effect of a GBP size increase. These numbers reflect GBP reads that occur when a DB2 member is looking in the GBP for a page it needs and doesn't have in a local buffer pool. This check has to be done prior to requesting the page from disk if the target page set is GBP-dependent, but a GBP "hit" for such a read is, generally speaking, not very likely.

One more thing: if you make a GBP bigger and you are duplexing your GBPs (and I hope that you are), be sure to enlarge the secondary GBP along with the primary GBP. If you aren't duplexing your GBPs (and why is that?), make sure that all your structures can still fit in one CF LPAR (in a two-CF configuration) after the target GBP has been made larger.

Buffer pool page-fixing: good for more than disk I/Os. Buffer pool page-fixing, introduced with DB2 for z/OS V8, is one of my favorite recent DB2 enhancements (I blogged about it in an entry posted in 2008). People tend to think of the performance benefit of buffer pool page-fixing as it relates to disk I/O activity. That benefit is definitely there, but so is the benefit -- and this is what lots of people don't think about -- associated with GBP read and write activity. See, every time DB2 writes a page to a GBP or reads a page from a GBP, the local buffer involved in the operation must be fixed in server memory (aka central storage). If the buffer is in a pool for which PGFIX(YES) has been specified, that's already been done; otherwise, DB2 will have to tell z/OS to fix the buffer in memory during the GBP read or write operation and then un-fix the buffer afterwards. A single "fix" or "un-fix" request is inexpensive, CPU-wise, but there can be hundreds of page reads and writes per second for a GBP, and the cumulative cost of all that buffer fixing and un-fixing can end up being rather significant. So, if you are running DB2 in data sharing mode and you aren't yet taking advantage of buffer pool page-fixing, now you have another reason to give it serious consideration.
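Just to illustrate how a small per-request cost can add up at GBP rates (and not to quote an actual DB2 cost figure), here is a back-of-the-envelope Python sketch; the per-fix/un-fix CPU cost and the GBP read/write rate used below are hypothetical placeholders, not measured numbers.

# Back-of-the-envelope illustration of fix/un-fix overhead for a busy GBP
# when PGFIX(YES) is NOT in effect. The per-request CPU cost below is a
# hypothetical placeholder for illustration only -- not a DB2 figure.

HYPOTHETICAL_MICROSEC_PER_FIX_OR_UNFIX = 10   # assumption, for illustration only

def fix_unfix_cpu_seconds_per_hour(gbp_reads_writes_per_sec):
    # Each GBP read or write needs one fix and one un-fix request
    # when the local pool is not page-fixed, so two requests per operation.
    requests_per_hour = gbp_reads_writes_per_sec * 2 * 3600
    return requests_per_hour * HYPOTHETICAL_MICROSEC_PER_FIX_OR_UNFIX / 1_000_000

# At a hypothetical 500 GBP reads+writes per second:
print(fix_unfix_cpu_seconds_per_hour(500))   # -> 36.0 CPU seconds per hour (illustrative)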

Tuesday, February 9, 2010

Good News on the Mainframe DB2 Data Warehousing Front

Last week, I attended a 1-day IBM System z "Technology Summit" education event in Atlanta. It was a multi-track program, and the DB2 for z/OS track ("Track 2") was excellent, in terms of both content and quality of presentation delivery (and it was FREE -- check out the remaining North American cities and dates for this event at http://www-01.ibm.com/software/os/systemz/summit/). The first presentation of the day, delivered by Jim Reed of IBM's Information Management software organization, focused on mainframe market trends and IBM's DB2 for z/OS product strategy. Jim's talk contained several nuggets of information that underscore the solid present and bright future of DB2 for z/OS as a platform for business intelligence applications. In this post, I'll share this information with you, along with some of my own observations on the topic.

BI important? How about most important? Jim started out with a reference to a recent IBM Global CIO survey which asked participants to identify their top priority. You know what's hot in IT circles these days: virtualization, mobility apps, regulatory compliance. So, what came out on top with regard to CIO priorities? Analytics and business intelligence. That's not very surprising, as far as I'm concerned. Having spent years optimizing efficiency and squeezing costs out of every facet of their operations, organizations are increasingly focused on optimizing performance. Are they offering the right mix of products and services to their customers? Are they selling to the right people? Are they delivering value in a way that separates them, in the eyes of their customers, from their competitors? Data warehouse systems are key drivers of success here, enabling companies to generate actionable intelligence from their data assets (and the breadth of these data assets keeps expanding, including now not just traditional point-of-sale and other business transactions, but e-mails, customer care interactions -- even company- and product-related comments posted on external-to-the-enterprise Web sites).

Big iron has big mo. At the same time that BI is heating up as an area of corporate endeavor, the mainframe -- long seen as a workhorse for run-the-business OLTP and batch workloads -- is growing in popularity as a platform for BI applications. Jim spoke of several factors that are putting wind in System z's sails with regard to data warehousing. He cited a Gartner report that spotlighted key BI issues with which companies are grappling now. This list of front-burner concerns included:
  • High availability. Data warehouses are more likely these days to get a "mission critical" designation. Many (including one I worked on just last month) are customer-facing systems, and a lot of those are the subject of service level agreements.
  • Mixed workload performance. This was identified as the number one performance issue for data warehouses. Mixed BI workloads, in which fast-running, OLTP-like queries vie with complex, data-intensive analytic processes for system resources, are becoming common as so-called "operational BI" gains prominence.
Then, of course, there's the matter of data protection, on which so much else depends. Jim mentioned that 33% of people recently surveyed indicated that they would QUIT doing business with a company if that company experienced a data security or privacy breach and was seen as being responsible for the incident.

So, to address these key issues, you'd probably want to build your data warehouse on a hardware/software platform known for high availability, sophisticated workload management capabilities, and strong, multi-layered data protection and access control. Hmmm. Sounds like mainframe DB2 to me. Keep in mind, too, that the well-known availability and workload management strengths of System z and z/OS and DB2 are made even stronger when DB2 is deployed in data sharing mode on a parallel sysplex mainframe cluster configuration.

Oh, and let's not forget that the legendary reliability of mainframe systems is not just a matter of advanced hardware and software technology (good as that stuff is) -- it also reflects the deep skills and robust processes (around change management, performance monitoring and tuning, capacity planning, business continuity, etc.) that typify the teams of professionals that support organizations' mainframe computing environments. As BI applications continue to move from "nice to have" to "must have" in the eyes of corporate leaders, it stands to reason that IT executives would want to house these essential systems on the server platform that exemplifies "rock solid," and to assign their care to the people in whom they have the utmost trust and confidence -- mainframe people.

One more trend driving BI workloads to System z is the increased frequency with which data warehouse databases are being updated. Not long ago, the "query by day, update at night" model predominated. Now, many BI application users demand that updates of source data values be reflected more quickly in the data warehouse database -- sometimes in a near-real-time manner. A lot of the source data that supplies data warehouses comes from mainframe databases and files, and locating the data warehouse close to that source data can facilitate round-the-clock updating.

Let's make a deal. The technical arguments for building a data warehouse on a mainframe platform are many and strong, but what about the financial angle? IBM has been pretty busy in this area of late. I already knew of DB2 Value Unit Edition pricing, which makes DB2 for z/OS available for a one-time charge for net new workloads of certain types, including data warehousing. I'll admit to not having known about IBM's System z Solution Edition series (announced in August of last year) before Jim talked about them during his presentation. Included in this set of offerings is the System z Solution Edition for Data Warehousing, a package of hardware, software (including DB2), and services that can help an organization to implement a mainframe-based data warehouse system in a cost-competitive way.

If your organization is serious about data warehousing, get serious about your data warehouse platform. Mainframes deliver the availability, mixed workload performance management, security, and -- yes -- total cost of ownership that can improve your chances of achieving BI success. Analyze that.