Robert's Blog

Thursday, August 14, 2008

Some Interesting DB2 for z/OS Data Sharing Trends and Issues

Once again, I've let more time than usual pass since the last post to my blog, the cause (again) being a period of particular busyness on my part.

I've just finished teaching a DB2 data sharing implementation class for a large company in the health-care industry. The experience had me thinking a lot about the way data sharing worked, and how organizations used it, back when the functionality was introduced with DB2 Version 4 for OS/390 (predecessor to z/OS). I'll share some of those thoughts in this entry.

Prime motivation: from capacity to availability. When DB2 data sharing showed up in the mid-1990s, IBM had recently decided to shift from very expensive (and very powerful) bipolar chip sets in its mainframe "engines" to less expensive (and -- at first -- much slower) CMOS microprocessors (the same integrated circuit technology used in PCs and other so-called distributed systems servers). In order to support what were then the world's largest mainframe applications, big companies -- manufacturers, banks, brokerages, retailers, railroads, and package delivery firms, to name some of the represented industries -- had to harness the capacity of several of the new CMOS mainframes in a single-logical-image fashion. The mainframe cluster was (and is) called a parallel sysplex, and DB2 data sharing leveraged that shared-disk clustering architecture to enable multiple DB2 subsystems on multiple servers to concurrently access a single database in read/write mode.

Well, the CMOS-based processors in today's mainframes are way faster than the bipolar variety ever were. Not only that, but you can get way more of them in one server than you could back in the nineties -- and tons more memory, to boot: the current top of the mainframe line, the IBM z10, can be configured with up to 64 engines and 1.5 terabytes of central storage. Even for a huge DB2 data-serving workload, it's likely that few organizations would need processing capacity beyond what's available with one of these bad boys. Still, companies continue to implement parallel sysplexes and DB2 data sharing. Why? Availability, my friend. With a DB2 data sharing group, planned outages for DB2 maintenance (or z/OS or hardware maintenance) can be virtually eliminated, as you can apply software or hardware fixes with ZERO database downtime. DB2 data sharing on a parallel sysplex can also greatly reduce the impact of unplanned outages. A company with which I worked had a DB2 subsystem in a data sharing group fail (something that hadn't happened in a LONG time), and system users didn't even notice the failure: database access continued via the other DB2 subsystems in the group, while the failing DB2 member was automatically (and quickly) restarted by the operating system.

The primary motivation for implementing a DB2 data sharing group these days is the quest for ultra-high availability.

Plenty of DB2 subsystems in a group, but fewer hardware "footprints." A few years ago, it was not unusual to find three or more mainframes clustered in a parallel sysplex. With the processing capacity of individual mainframe "boxes" getting to be so large (through a combination of faster engines and more engines per server), organizations increasingly opt for two-mainframe sysplexes. Also contributing to the decrease in hardware "footprints" within parallel sysplex configurations is the growing use of internal coupling facilities -- running in logical partitions within mainframe servers -- versus standalone external coupling facilities. Interestingly, as the number of physical boxes in companies' sysplexes has declined, the number of DB2 subsystems in the data sharing groups running on these parallel sysplexes has often stayed the same or even gone up. I know of an organization that runs a nine-way DB2 data sharing group on a two-mainframe parallel sysplex. Why so many? There are several reasons:
  • Having at least two DB2 subsystems per mainframe in the sysplex allows you to fully utilize the processing capacity of each mainframe even when you have a DB2 subsystem down for maintenance (recall that more and more organizations are using DB2 data sharing to enable the application of hardware and software maintenance without the need for a maintenance "window").
  • When a DB2 subsystem in a data sharing group fails, the X-type locks (aka modify locks) held by that subsystem at the time of the failure are retained until that subsystem can be restarted (usually automatically, either in-place or on another server in the parallel sysplex) to free them up (this is done to protect the integrity of the database). If the data sharing group has more members, the number of retained locks held by a given member in the event of a failure is likely to be smaller, reducing the impact of the failure event. Additionally, having the same workload spread across more members could speed up restart time for a failed member, as there might be somewhat less data change roll-forward and rollback work to do during restart.
  • Having more members in a data sharing group reduces the log-write load per member, as each member writes log records only for changes made by programs that execute on the member.
  • The cost of going from n-way to n+1-way data sharing (once you've gone to 2-way data sharing) is VERY small, so the overhead cost of having more DB2 subsystems in a data sharing group is typically pretty insignificant.
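
To put rough numbers on the retained-lock and log-write points in the list above, here is a back-of-the-envelope sketch. It assumes the workload is spread evenly across the members, which real groups only approximate, and the group-wide figures are hypothetical values picked purely for illustration.

```python
def per_member_share(group_total: float, members: int) -> float:
    """Even-split estimate of one member's share of a group-wide quantity."""
    if members < 1:
        raise ValueError("a data sharing group needs at least one member")
    return group_total / members


# Hypothetical group-wide figures, chosen only for illustration.
GROUP_MODIFY_LOCKS = 9000     # modify locks held across the group at one instant
GROUP_LOG_MB_PER_SEC = 18.0   # log write volume across the whole group

for n in (2, 4, 9):
    locks = per_member_share(GROUP_MODIFY_LOCKS, n)
    log_rate = per_member_share(GROUP_LOG_MB_PER_SEC, n)
    print(f"{n}-way: ~{locks:.0f} locks retained if one member fails, "
          f"~{log_rate:.1f} MB/s of log writes per member")
```

The point is only the shape of the relationship: under an even split, each added member shrinks both the set of locks that a single failure can leave retained and the log-write load any one member carries.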

Binding programs with RELEASE(DEALLOCATE) is not the data sharing recommendation it once was. Prior to DB2 for z/OS Version 8, XES (cross-system extended services, the component of the operating system that handles interaction with coupling facility structures such as the global lock structure) would perceive that an IS lock on tablespace XYZ held by a program running on DB2A is incompatible with an IX lock requested for the same tablespace by a program running on DB2B, when in fact the two locks are compatible. The local lock managers associated with DB2A and DB2B (i.e., the IRLMs) would figure out that the two locks are not in conflict with each other, but only after some inter-system communication that drove up overhead. In order to avoid incidences of such perceived-but-not-real global lock contention (called XES contention), people would bind programs with RELEASE(DEALLOCATE) to have them retain tablespace locks across commits. In a related move, people would seek to use more transactional threads that last through commits, such as CICS protected entry threads (batch threads automatically persist through commits, deallocating only at end-of-job).

DB2 Version 8 introduced a clever new global locking protocol that eliminates perceived IX-IS and IX-IX inter-system tablespace lock contention. One effect of this development is that the decision on whether to bind programs with RELEASE(DEALLOCATE) or RELEASE(COMMIT) is now pretty much unrelated to data sharing. Do what you'd do (or as you've done) in a non-data sharing DB2 environment.
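
For readers who like to see the mechanics, here is a simplified model of perceived-but-not-real contention. It rests on two assumptions that go beyond what I've described above and that you should check against IBM's documentation: that XES understands only two lock states (share and exclusive), and that the pre-V8 and V8 mappings of tablespace lock modes to those two states are as shown. Only four lock modes are modeled, and all of the names in the code are mine, not DB2's.

```python
from itertools import combinations_with_replacement

MODES = ("IS", "IX", "S", "X")

# Pairs of tablespace lock modes that really are compatible (what the IRLMs know).
COMPATIBLE = {
    frozenset(pair)
    for pair in [("IS", "IS"), ("IS", "IX"), ("IS", "S"), ("IX", "IX"), ("S", "S")]
}

# Assumed mappings of DB2 lock modes to the two states XES understands.
PRE_V8 = {"IS": "S", "S": "S", "IX": "X", "X": "X"}
V8 = {"IS": "S", "IX": "S", "S": "X", "X": "X"}


def really_compatible(a: str, b: str) -> bool:
    return frozenset((a, b)) in COMPATIBLE


def xes_sees_conflict(a: str, b: str, mapping: dict) -> bool:
    # XES flags contention whenever either side maps to exclusive.
    return "X" in (mapping[a], mapping[b])


def false_contention(mapping: dict) -> list:
    """Pairs that XES flags even though the locks are really compatible."""
    return [
        (a, b)
        for a, b in combinations_with_replacement(MODES, 2)
        if really_compatible(a, b) and xes_sees_conflict(a, b, mapping)
    ]


def missed_conflicts(mapping: dict) -> list:
    """Pairs that really conflict but that XES would wave through (should be none)."""
    return [
        (a, b)
        for a, b in combinations_with_replacement(MODES, 2)
        if not really_compatible(a, b) and not xes_sees_conflict(a, b, mapping)
    ]


print("pre-V8 false contention:", false_contention(PRE_V8))
print("V8 false contention:    ", false_contention(V8))
```

Under these assumed mappings, the IS-IX and IX-IX pairs drop out of the false-contention list with the V8 protocol, and neither mapping lets a genuine conflict slip past XES. The model also suggests the price: pairs involving an S tablespace lock would pick up some perceived contention instead, which matters little if intent locks dominate your workload.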

The "double-failure" scenario is not so scary anymore. What I'm referring to here is the situation that can arise if the global lock structure used by a data sharing group is located in an internal coupling facility (ICF) that is part of a mainframe on which at least one DB2 member of the same data sharing group is running. In that case, if the whole box (the mainframe server) fails, both the lock structure and a related DB2 subsystem fail. When that happens, the lock structure can't be rebuilt because the rebuild process needs information from all of the DB2 members in the group, and -- as mentioned -- at least one of those DB2 subsystems failed when the mainframe failed. Without a lock structure, the whole group will fail, and a group restart will be necessary to get everything back again.
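
The decision logic in that scenario can be sketched as a small function. This is a toy model, not anything DB2 or z/OS actually runs: the box and member names are hypothetical, and it looks only at the lock structure, ignoring the other coupling facility structures and any duplexing.

```python
def outcome_of_box_failure(failed_box: str,
                           lock_structure_box: str,
                           member_boxes: dict) -> str:
    """Classify what the loss of one physical box means for the group.

    member_boxes maps each DB2 member name to the box it runs on.
    """
    lost_structure = failed_box == lock_structure_box
    lost_members = [m for m, box in member_boxes.items() if box == failed_box]

    if lost_structure and lost_members:
        # Rebuild needs information from every member, and at least one is gone.
        return "group restart"
    if lost_structure:
        return "rebuild lock structure"
    if lost_members:
        return "restart failed members"
    return "no impact"


# Hypothetical two-mainframe sysplex with the lock structure in an ICF on CEC1.
members = {"DB2A": "CEC1", "DB2B": "CEC2"}
print(outcome_of_box_failure("CEC1", "CEC1", members))   # the "double failure"
print(outcome_of_box_failure("CEC2", "CEC1", members))

# Same members, but with the lock structure in a standalone external CF.
print(outcome_of_box_failure("CEC1", "CF01", members))
```

Note that only one of the two boxes in the ICF configuration produces the bad outcome: the one that holds both the lock structure and a DB2 member.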

Nowadays, with (as pointed out earlier) many organizations having fewer (though much more powerful) mainframe servers than they'd had some years ago, and with many companies strongly preferring to use ICFs (they are attractively priced versus standalone external CFs), folks are wanting to implement two-mainframe parallel sysplexes with an ICF in each mainframe. That set-up makes the "double-failure" scenario a possibility. Know what I say to folks leaning towards this configuration? I tell them to go ahead and to not sweat a "double failure," because 1) it's exceedingly unlikely (remember, the whole mainframe server would have to fail, and on top of that, it'd have to be the one with the ICF holding the lock structure), 2) even if it does happen, the group restart will clean everything up so that no committed database changes are lost, 3) group restart is a better-performing process than it was in the earlier years of data sharing, when processors were slower and the recovery code was less sophisticated than it is in current versions of DB2, and 4) given items 1-3 in this list, I can understand why organizations would balk at paying extra -- either in the form of an external CF or the overhead of system duplexing of the lock structure -- to avoid a situation that has a very low probability of occurrence and that does not (in my opinion) constitute a "disaster" even if it does occur.

So, the DB2 data sharing landscape has changed since DB2 Version 4. Here's something that hasn't changed: DB2 data sharing on a parallel sysplex is the most highly available and scalable data-serving platform on the market. It's great technology that just keeps getting better.


Blogger Rick Butler said...

Hi Robert, very interesting article, thanks.
Here is a link to a relevant recent (July 17) Redpaper:

Title: "DB2 9 for z/OS Data Sharing: Distributed Load Balancing and Fault Tolerant Configuration"

This IBM® Redpaper provides information about exploiting client load balancing and fail over capabilities across a DB2® data sharing group or a subset of the group members.
The information provided will help network specialists and data base administrators in appropriately configuring data sharing groups as TCP/IP servers.
We show how to use the functions of DB2 sysplex workload balancing functions, supported by DB2 Connect™ Server and the IBM DB2 Driver for JDBC and SQLJ, in conjunction with the functions of TCP/IP Sysplex Distributor, configured with Dynamic Virtual IP address (Dynamic VIPA or DVIPA) and automatic VIPA takeover, to provide superior load distribution and fail-over capability for transactions and connections across the members of a data sharing group. The two functions work together to ensure the highest availability possible for remote applications accessing a DB2 location in a sysplex.

August 20, 2008 at 1:53 PM  
Blogger Robert Catterall said...

Looks like an interesting Redpaper. Thanks for the link and the abstract, Rick.

August 25, 2008 at 3:17 PM  
