Robert's Blog


Monday, January 26, 2009

DB2 Data Sharing and Row-Level Locking

As a member of IBM's DB2 National Technical Support team in the 1990s, I worked with data sharing technology before DB2 Version 4 (which introduced the feature) was generally available. An important part of my job at the time was to boost people's understanding of data sharing, and that included clearing up related misperceptions. Some of those misperceptions persist to this day, most notably the one about row-level locking in a data sharing environment. Just last week I was reminded that a number of people, including some who work for organizations that run DB2 for z/OS in data sharing mode, believe that row-level locking cannot be used in a data sharing system. I'm not talking about people who think that you shouldn't use row-level locking with DB2 data sharing. I'm talking about people who feel that you can't use row-level locking in that environment, even if you want to - that a DB2 data sharing system somehow can't process a request to globally lock table data at a row level. This is most definitely not the case, and I hope to make that clear via this blog entry.

First, a declaration: there is NO kind of lock request that DB2 can make on a program's behalf in a standalone-subsystem environment that it can't handle in a data sharing group on a parallel sysplex ("parallel sysplex" being IBM's term for the mainframe-cluster technology that provides the infrastructure on which DB2 data sharing operates). Tablespace-level locks? Table-level? Page-level? Row-level? All are available whether DB2 is running in data sharing or non-data sharing mode. To put it another way, at the logical level (i.e., from an application programming perspective) there is no difference between locking in a data sharing environment and locking in a standalone DB2 system.

This is not to say that DB2 data sharing doesn't carry with it any changes with regard to data locking. In fact data sharing introduced many locking changes, but these basically amounted to additions to the lock-related operations seen in a standalone DB2 environment. To the "regular" locks that exist in both data sharing and non-data sharing systems (and which are known as "logical locks" in data sharing-speak), DB2 data sharing adds what are called physical locks. This is where the row-level locking connection comes in.

When there is inter-DB2 read/write interest in database objects (i.e., when multiple members of a DB2 data sharing group have a given database object open for read/write - versus read-only - access), DB2 will issue page-level physical lock requests for pages that cannot be locked at the logical level. These include index pages, space map pages, and pages of tablespaces defined with LOCKSIZE ROW. Thus, an important point can be made: while concomitant physical lock (aka P-lock) activity makes row-level locking more expensive in a data sharing environment than it otherwise would be (as will be explained momentarily), row-level locking does not introduce page P-locking to a data sharing system; rather, it simply adds to the page P-lock activity that would be there anyway because of P-lock requests related to index and space map pages.

So, about the overhead cost of row-level locking in a DB2 data sharing environment: let's suppose that a data sharing group has two members, DB2A and DB2B. Tablespace TS1 in the shared database is defined with LOCKSIZE PAGE, and tablespace TS2 with LOCKSIZE ROW. Suppose further that both tablespaces are group buffer pool-dependent (meaning, basically, that both DB2A and DB2B have read/write interest in both of these database objects). Program ProgA1 on DB2A issues an UPDATE for row R1 in table T1 in tablespace TS1, and program ProgA2 (also connected to DB2A) does the same for row R2 in table T2 in tablespace TS2. Right after that, programs ProgB1 and ProgB2, running on DB2B, issue updates for rows R11 in table T1 and R22 in table T2, respectively. Rows R1 and R11 are on the same page, P1, in table T1, and rows R2 and R22 are on the same page, P2, in table T2. In the case of program ProgB1, DB2B requests for it an X-type logical lock on page P1 of table T1. It can't immediately get that lock, because ProgA1 already has it (assuming that ProgA1 hasn't yet issued a COMMIT). ProgB1 waits, and gets the requested X-type logical lock on page P1 after ProgA1 commits its update operation. Meanwhile, DB2B requests for ProgB2 an X-type logical lock on row R22 in table T2, and ProgB2 gets this lock because ProgA2 has an X-type logical lock on a different row, R2, that happens to be (as previously mentioned) on the same data page. Because row-level locking is in effect for tablespace TS2, DB2B also requests an X-type P-lock (physical lock) on page P2 in table T2. It doesn't initially get that P-lock, because DB2A holds the X-type P-lock on page P2; however, DB2A transfers the X-type P-lock on page P2 to DB2B (this can be done for P-locks, even before a program has issued a COMMIT). As part of this page P-lock transfer, DB2A sends page P2 (including the updated row R2) to DB2B by way of the group buffer pool associated with tablespace TS2. When DB2B gets this page in its local buffer pool, the update of row R22 by ProgB2 proceeds.
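For reference, here is a minimal sketch of the objects and statements behind that scenario. All of the names are hypothetical (taken from the example above), the DDL is abbreviated, and the predicates are made up purely for illustration:

   -- Abbreviated, hypothetical DDL: page-level vs. row-level logical locking
   CREATE TABLESPACE TS1 IN SHRDB LOCKSIZE PAGE;
   CREATE TABLESPACE TS2 IN SHRDB LOCKSIZE ROW;

   -- Issued on member DB2A
   UPDATE T1 SET STATUS = 'A' WHERE ID = 1;    -- ProgA1 updates row R1 (page P1 in TS1)
   UPDATE T2 SET STATUS = 'A' WHERE ID = 2;    -- ProgA2 updates row R2 (page P2 in TS2)

   -- Issued on member DB2B shortly thereafter
   UPDATE T1 SET STATUS = 'B' WHERE ID = 11;   -- ProgB1 waits on the page P1 logical lock held by ProgA1
   UPDATE T2 SET STATUS = 'B' WHERE ID = 22;   -- ProgB2 gets its row lock on R22 right away, but DB2B must
                                               -- also obtain the page P2 P-lock (and page P2 itself) from DB2A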

With row-level locking in effect, there is some additional CPU consumption, versus the page-level locking scenario, owing to extra global lock requests (page-level P-locks in addition to row-level logical locks) and the on-demand transfer of page P2 from DB2A to DB2B (in the page-level locking scenario, the page updated on DB2A will get to DB2B as part of a process related to the committing of the data change made on DB2A). This brings up another important point: in many cases, row-level locking can be used in a non-data sharing environment with virtually no increase in CPU cost versus page-level locking; however, there will almost always be an extra CPU cost to be paid for row-level locking in a data sharing system. So, can you use row-level locking with DB2 data sharing? Yes. Should you use row-level locking with DB2 data sharing? That depends on whether the extra CPU cost so incurred provides a justifying benefit by eliminating or at least mitigating a concurrency problem.

How would you make that call? Here's what I would do: in a data sharing environment, I'd have all my tablespaces defined with LOCKSIZE PAGE (or LOCKSIZE ANY, which makes page-level locking the first option for DB2 but enables automatic escalation to a less-granular locking scope in some situations). Absent any serious contention problems (i.e., if your DB2 monitor does not show a lot of timeouts or deadlocks), that's where I'd leave things. If I did have a serious locking problem, I'd trace it to the target tablespaces (DB2 provides you with the information to do this). With the contention-challenged tablespaces identified, I'd consider going with row-level locking for those tablespaces, and I'd probably effect that change if the programs getting the deadlocks and/or timeouts couldn't be readily modified so as to resolve the concurrency issues. If using row-level locking for those tablespaces caused only a small increase in CPU overhead for programs accessing data in the tablespaces, I'd consider the problem solved (DB2 accounting trace data displayed via your monitor will give you cost-comparison information). If the overhead of row-level locking turned out to be too high, I'd either stay with it and keep the tablespaces from becoming group buffer pool dependent (there are ways to ensure this), or go back to page-level locking and try other approaches to resolving the concurrency problem (one of which could be the choice of a different clustering key for associated tables).
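To make that concrete, the switch itself is just an ALTER of the identified tablespace. Here's a minimal sketch using hypothetical database and tablespace names:

   -- Hypothetical names; change the contention-challenged tablespace to row-level locking
   ALTER TABLESPACE SHRDB.TS2 LOCKSIZE ROW;

   -- And the way back, if the CPU overhead turns out to be too high
   ALTER TABLESPACE SHRDB.TS2 LOCKSIZE PAGE;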

OK, quite a few words to make a simple point: in a DB2 data sharing environment, you can use row-level locking, but you should only use it if you need to in order to resolve a concurrency problem. If you do have a concurrency problem in a data sharing system, keep in mind that going with row-level locking is probably one of several mitigating actions that could be taken. Go with it if it's your best option.

Tuesday, January 20, 2009

DB2 9 for z/OS - Good News on the Stats Front

Some 30-odd years ago, a team of IBM researchers (working on what would become DB2) invented cost-based SQL statement optimization (thank you, Patricia Selinger et al.). This development ended up being a very big part of DB2's appeal in the marketplace, because it meant that the DBMS, given several alternative routes by which data qualified by a query could be retrieved, was able to select the access path that would cost the least in terms of system resources.

The part of DB2 code that performs access path selection is called the optimizer, and it bases its decisions largely on the database statistics that are generated through the RUNSTATS utility and stored in the DB2 catalog. In the early years, the optimizer was handicapped in that it did not have information regarding the distribution of values within a column of a table, and so had to assume an even distribution (e.g., if a table had 1 million rows and an associated column had 1000 distinct values, DB2 would assume 1000 duplicates of each of the 1000 distinct values). This assumption could lead to sub-optimal access path selection decisions because in many cases a non-unique column has a very non-uniform distribution of data values (consider the previous example: 900,000 of the 1 million rows could contain one value of the column, with the other 999 distinct values spread across the remaining 100,000 rows). This "blind spot" regarding column data value distribution could lead to the calculation of inaccurate "filter factors" for query predicates, and that could lead to the selection of something other than the actual least-cost access path for a query.
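Here's a quick, hypothetical illustration of that filter-factor effect, using the numbers from the example above (the table, column, and value names are made up):

   -- BIG_TAB has 1,000,000 rows; COL1 has 1,000 distinct values, but value 'X' appears in 900,000 rows
   SELECT * FROM BIG_TAB WHERE COL1 = 'X';

   -- Uniform-distribution assumption: filter factor = 1/1000 = 0.001, i.e., about 1,000 qualifying rows,
   --   which makes an index on COL1 look like the cheapest route to the data.
   -- Actual distribution: 900,000 of the 1,000,000 rows qualify (filter factor 0.9), so a tablespace scan
   --   would almost certainly have been the lower-cost access path.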

As I recall, it was in the mid-1990s that DB2 and RUNSTATS (for the mainframe and for the Linux, UNIX, and Windows - aka LUW - platforms) were enhanced to provide the optimizer with information about the distribution of data values within a column. For both mainframe DB2 and DB2 for LUW, one could have RUNSTATS generate statistics pertaining to the percentage of a table's rows containing each of the "n" most frequently occurring of a column's values. Subsequent enhancements for the mainframe DB2 platform enabled one to also generate, through RUNSTATS, values for the percentage of a table's rows containing each of the "n" least frequently occurring values in a column, or stats for both the "n" most- and "n" least-frequently occurring of a column's values.

These frequency statistics were a great help to the optimizer, enabling DB2 to select better access paths for many queries referencing columns with non-uniform value distributions. [A number of mainframe DB2-using organizations are realizing the benefit of frequency-distribution statistics without even knowing it. When the DB2 for z/OS RUNSTATS utility is executed with the INDEX keyword on the control card, frequency-distribution statistics will automatically be generated for the top 10 most frequently-occurring values of the first column of the key of the targeted index. It's not uncommon for people to code a RUNSTATS TABLESPACE control card that specifies TABLE(ALL) INDEX(ALL), and in that case the utility will generate frequency distribution stats for the top 10 most frequently-occurring values of the first column of the key of every index on every table in the target tablespace.]
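For reference, a control card of the kind just described might look like the line below (the database and tablespace names are hypothetical). With INDEX(ALL) specified, the utility gathers the top-10 frequency statistics for the first key column of every index on every table in the tablespace, as noted above:

   RUNSTATS TABLESPACE MYDB.MYTS TABLE(ALL) INDEX(ALL)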

Still, there was room for improvement. Having statistics on the frequency of occurrence of the top 10, or 15, or 20 values in a column is great, but if a column has 100,000 values that can still lead, in some cases, to less-than-desired accuracy with respect to predicate filter-factor calculations and, by extension, sub-optimal access path selection. Collecting frequency-distribution statistics for the top, say, 10,000 of those 100,000 distinct values would effectively address that problem, but at a too-high cost in terms of RUNSTATS CPU consumption. A better solution was delivered in DB2 for z/OS V9 via the new HISTOGRAM option of RUNSTATS (DB2 for LUW users got histogram stats, generated via the WITH DISTRIBUTION option of the RUNSTATS command, some years ago).

When the RUNSTATS utility is executed with the HISTOGRAM option, DB2 will collect value frequency distribution statistics for a column (or group of columns) based on a certain number of quantiles (either user-specified or determined by DB2). Each quantile is associated with a range of column values, and each range includes about the same number of table rows. For a given quantile, DB2 knows the upper and lower values that bound the quantile, the number of distinct values within that range, and the number of rows in the value range. [This data is stored in catalog tables such as SYSCOLDIST and SYSCOLDISTSTATS (the latter used for partition-level statistics), and in new tables like SYSKEYTGTDIST and SYSKEYTGTDISTSTATS, which are used for so-called extended indexes (new with DB2 for z/OS V9, and something I'll probably cover in a future post to this blog).] Suppose, for example, that a table has 100,000 rows, and that histogram data for a certain column is gathered for 10 quantiles. Each quantile will represent about 10,000 rows. Let's say that quantile 1 is bounded by values 'AAA' and 'CCC' (assume CHAR for the data type) and contains 100 distinct values of the column. Quantile 2 is bounded by values 'DDD' and 'HHH' and contains 10 distinct values, and so on. In that case, DB2 would know that in the first (based on column value range) 10,000 rows of the table there are 100 distinct values of the column between 'AAA' and 'CCC' (an average of 100 duplicates per value), while in the second set of 10,000 rows there are 10 distinct values between 'DDD' and 'HHH' (an average of 1000 duplicates per value).
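If you want to eyeball the histogram data that RUNSTATS produces, a catalog query along these lines should show the quantiles recorded for a given table's columns. The schema and table names are hypothetical, and the column names reflect my recollection of the V9 SYSCOLDIST layout, so check them against your catalog before relying on this:

   SELECT NAME, QUANTILENO, LOWVALUE, HIGHVALUE, CARDF, FREQUENCYF
   FROM SYSIBM.SYSCOLDIST
   WHERE TBOWNER = 'MYSCHEMA'
     AND TBNAME  = 'MYTABLE'
     AND TYPE    = 'H'          -- 'H' indicating histogram statistics
   ORDER BY NAME, QUANTILENO;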

A big appeal of histogram statistics is that they are based on ALL values of a column (or a column group), whereas frequency statistics are based on a subset of values (usually a relatively small subset - this by practical necessity). This translates into richer catalog statistics, and with richer (i.e., more complete in terms of descriptiveness) stats to work with, DB2 is better able to optimize access path selection for queries.

So, what would I do in a DB2 for z/OS V9 environment? I'd probably start collecting histogram stats via RUNSTATS. If you currently code TABLE(ALL) INDEX(ALL) on your RUNSTATS TABLESPACE control card, you could just add HISTOGRAM to that. This would cause RUNSTATS to gather histogram statistics on the first column of the key of each index on every table in the tablespace, with DB2 determining the optimal number of quantiles on a table-by-table basis. Later, you could extend the gathering of histogram stats to other columns (or groups of columns) as you work to tune the performance of individual queries. Generating histogram statistics is likely to cause RUNSTATS to consume somewhat more CPU time than would be the case if you were requesting "top 10 values" frequency distribution stats, but that cost could be more than offset by the savings resulting from improvements in query access path selection.
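As a sketch of that control card change (hypothetical object names, and the exact placement of the HISTOGRAM keyword, along with the optional NUMQUANTILES specification, should be verified against the V9 RUNSTATS documentation):

   RUNSTATS TABLESPACE MYDB.MYTS
     TABLE(ALL)
     INDEX(ALL) HISTOGRAM

Leaving NUMQUANTILES off would let DB2 determine the number of quantiles, in line with the table-by-table determination mentioned above.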

Monday, January 12, 2009

DB2 for z/OS: Looking Again at RELEASE(DEALLOCATE) vs. RELEASE(COMMIT)

Quite a few years ago (probably 15 or more), I sat in a hotel meeting room during an IT conference and listened to a presentation on mainframe DB2 performance delivered by IBM's Akira Shibamiya (still a DB2 performance guru extraordinaire). It was a typical Akira presentation, full of useful tips for improving the CPU efficiency of DB2 application programs. One of the tips offered: to reduce an SQL-issuing program's CPU consumption (by how much, I'll get to near the end of this blog entry), specify RELEASE(DEALLOCATE) instead of RELEASE(COMMIT) when binding the associated package (after consideration of some cautionary notes, such as the fact that tablespace locks are held until thread deallocation when RELEASE(DEALLOCATE) is used - a problem if a program explicitly requests an exclusive tablespace lock or if lock escalation occurs).

Good stuff. A nice addition to the performance-tuner's tool kit.

A couple of years or so after the aforementioned presentation event, IBM introduced DB2 data sharing on the mainframe platform. This led to renewed discussion of the RELEASE option of BIND PACKAGE among people implementing or evaluating the shared-data scale-out clustering technology. The reason: RELEASE(DEALLOCATE), used in conjunction with DB2 threads that would persist across commit operations (such as CICS protected entry threads), could significantly reduce a type of global lock contention called XES contention (as described in an earlier post to this blog). Less global lock contention meant less data sharing overhead, so RELEASE(DEALLOCATE) came to be quite widely used at data sharing sites.

Fast-forward to the present, and the matter of RELEASE(DEALLOCATE) versus RELEASE(COMMIT) has again come up for review. The impetus is two-fold: 1) some people who have been using RELEASE(DEALLOCATE) extensively (often for the purpose of reducing the incidence of XES global lock contention in a DB2 data sharing environment) are seeing related incidents of program failures caused by EDM pool-full conditions, and 2) the new global locking protocol introduced with DB2 for z/OS V8 removed the XES-contention-suppression incentive to use the RELEASE(DEALLOCATE) bind option (also covered in the previously-cited blog post).

About EDM pool-full: when a package is executed, it's loaded into DB2's EDM pool (EDM pool space used for sections of packages executed via threads is labeled PT - short for package table - in some monitor displays and reports). If there's not enough space in the pool for the package, the program will fail with a -904 SQL code (resource unavailable, with a code indicating that the unavailable resource is EDM pool space). Such failures due to pool-full conditions are also recorded in the EDM Pool section of a DB2 monitor statistics report or display. These failures tend to be rare, as there are usually enough free (empty) pages in the pool to accommodate new package load requests, and even when free pages are lacking there are typically occupied pages that can be stolen for the new load because the package sections in the occupied pages are associated with packages that were acquired for threads but which were subsequently released. This is where RELEASE(DEALLOCATE) can make a difference, because that bind specification indicates that the package, once acquired for execution via a thread, is not to be released until the thread is deallocated (as opposed to being released at a commit point). If a thread persists for a long time, packages executed via the thread and bound with RELEASE(DEALLOCATE) will be considered by DB2 to be in-use for a long time, and the EDM pool pages occupied by those packages will be non-stealable for a long time. That can lead to a pool-full situation if space in the EDM pool becomes tight.

So, how is it that an organization using RELEASE(DEALLOCATE) can experience a rash of EDM pool-full-related program failures seemingly "out of the blue," only to have the problem go away by itself a short time later? Here's how: RELEASE(DEALLOCATE) works just like RELEASE(COMMIT) when a thread is deallocated at commit time. That's what will happen to a non-protected CICS-DB2 thread if it is not reused, and such a thread will not be reused unless a transaction associated with the same DB2ENTRY CICS resource as the about-to-commit (and complete) transaction is queued up and waiting for a thread. If there are enough threads defined for the DB2ENTRY resource, or if there are enough pool threads and the resource definition indicates that a transaction should overflow to the pool if an entry thread is not available, transactions won't queue waiting for threads, and non-protected threads won't be reused (they'll instead be deallocated, freeing EDM pool space occupied by packages bound with RELEASE(DEALLOCATE) and executed by way of the thread). Folks usually have enough threads defined to prevent transaction queuing, but over time, if the transaction arrival rate increases and the CICS-DB2 thread specifications aren't updated, you could reach a point at which thread reuse suddenly jumps during a peak processing period. If the reused (and therefore persistent) CICS-DB2 threads use packages bound with RELEASE(DEALLOCATE), the EDM pool-full situation described above (brought on by a decline in the number of stealable EDM pool pages) can occur.
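For context, the thread behavior just described is governed by the DB2ENTRY definition for the transactions in question. What follows is only a rough sketch with made-up names and values (the attribute names are as I recall them from CICS resource definition, so verify before use): THREADLIMIT caps the number of entry threads, THREADWAIT(POOL) sends overflow transactions to the pool rather than queuing them, and PROTECTNUM controls protected (persistent) entry threads.

   DEFINE DB2ENTRY(PAYR1) GROUP(DB2RDO)
          PLAN(PAYRPLAN)
          THREADLIMIT(5)
          THREADWAIT(POOL)
          PROTECTNUM(0)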

If this has happened at your site, you have several options available for eliminating the problem:
  • You could increase the size of your EDM pool. With DB2 V8 (and beyond) 64-bit addressing, DB2 storage pools can be a lot larger than they used to be. EDM pools in the range of 200 MB to 400 MB (not counting space used for dynamic statement caching) are no big deal, as far as I'm concerned. DB2 for z/OS V9 makes even more space available for package sections by moving parts of the sections above the 2 GB "bar" in virtual storage (you can read about this in the IBM "red book" titled "DB2 9 for z/OS Performance Topics").
  • You could increase the number of CICS entry and/or pool threads to cut down on thread reuse. A DB2 monitor accounting report or online display of thread detail information will provide you with thread re-use information.
  • You could bind packages with RELEASE(COMMIT) instead of RELEASE(DEALLOCATE).
Regarding that third option: I've already mentioned that there is no longer (as of DB2 V8) a need to use RELEASE(DEALLOCATE) to cut down on XES-type global lock contention. That being the case, should you consider using RELEASE(COMMIT) for all of your DB2 packages? That would certainly be an easy path to take. If you did this, how much more CPU resource would be used by programs that had previously been bound with RELEASE(DEALLOCATE)? Overall, not much, but this will vary by program. If a program issues a lot of SQL statements or even a small number of "big" SQL statements (with respect to CPU consumption) between commits, the CPU-efficiency difference between RELEASE(DEALLOCATE) and RELEASE(COMMIT) is likely to be very small. At the other end, the CPU-efficiency difference could be quite pronounced for a program that issues only one or two very lightweight SQL statements between commits. For most SQL-issuing programs executed via threads that persist across commit operations (such as those associated with batch jobs or protected CICS-DB2 threads), I'd expect to see an increase in CPU time in the low- to mid-single digits of percent following a change to RELEASE(COMMIT) from RELEASE(DEALLOCATE) [keep in mind that when a thread terminates at commit, RELEASE(DEALLOCATE) and RELEASE(COMMIT) are functionally equivalent].
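In practice, the change for a given package is just a rebind with the other RELEASE value; here's a minimal sketch with hypothetical collection and package names:

   REBIND PACKAGE(COLL1.PROG1) RELEASE(COMMIT)

   REBIND PACKAGE(COLL1.HIVOL1) RELEASE(DEALLOCATE)

The second form would be the one reserved for high-use packages executed via persistent threads, per the guidance in the next paragraph.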

Some folks would conclude that a few extra percentage points of CPU consumption is an acceptable trade for an easily-administered program bind procedure, and they'd go with RELEASE(COMMIT) for everything. Others, in very high-volume environments in which CPU consumption gets a lot of attention, would decide to go for optimal efficiency via RELEASE(DEALLOCATE). Even these folks, however, should use RELEASE(DEALLOCATE) only on a selective basis - I would NOT recommend binding all programs with that specification. The best candidates would be high-use programs executed via persistent threads (again, meaning threads that persist across commits).

Shout-out: I've had many a discussion about RELEASE(DEALLOCATE) versus RELEASE(COMMIT) with my friend Anthony Ciabattoni, lead DB2 for z/OS systems programmer at CheckFree (part of Fiserv Corporation). Anthony will be sharing some of his experience with DB2 data sharing in an IBM Webcast to be broadcast on February 24. Should be a good one, in no small part because Anthony is as energetic as he is knowledgeable. You can get more information at this url: http://www-01.ibm.com/software/os/systemz/webcast/feb24/.

Monday, January 5, 2009

Databases in the clouds?

A few years ago, I put together a presentation in which I included a simplified diagram of an application system. It was an end-to-end depiction, with users (external to the application-owning organization) on the one side, the company's IT infrastructure on the other side, and in between, a cloud, representing the Internet, through which said users interacted with the application.

For years, people have used a cloud in diagrams to represent the Internet - something that is physically implemented on hardware (routers, servers, switches, communication links) that is outside the control of an enterprise but which plays an absolutely vital role in the end-to-end functioning of applications that are the lifeblood of a great many organizations. Lately, the cloud in more and more application system diagrams has shifted: instead of being something that's located between users and a server infrastructure, the cloud is the server infrastructure.

The move to so-called "cloud computing" is a natural progression in the ongoing adoption of Internet technologies. If a business leader is willing to trust the Internet, as a communications conduit, to provide the reliability and security (provided through things such as encryption and virtual private networks) needed to make an application viable, is it such a stretch to entrust the application itself - and its associated data - to a server infrastructure that is owned, operated, and managed by an extra-enterprise entity? In increasing numbers, companies are choosing to go this route.

While the buzz around cloud computing has recently become quite loud, the concept is not all that new. It's been the foundation of the business plan for Salesforce.com (and other such companies) since day one: "Hey, prospective customer. Instead of buying the code for a CRM (customer relationship management) application, why not let us run the application code at our data center, and you just pay for the value delivered by the application service?" Popular Web-based e-mail applications such as Google's Gmail are another example of cloud computing in action. I use one of those e-mail offerings, and the e-mails in my in-basket are data, and that data is not stored on my PC - it's stored at some data center located who-knows-where.

So why all the current buzz around a concept that's been around (and operational) for some time? I believe that cloud computing is much on people's minds these days because companies are now giving serious thought to having applications that they have developed (as opposed to packages they've bought) run in a cloud-provider's server infrastructure (and a fair number of organizations have in fact already pulled that trigger). The appeal of cloud computing is fairly straightforward, and it has a lot to do with saving money. Think about it: cloud computing providers want to sell server and data storage capacity the way utility companies sell electricity. Most folks who can connect their home to a reliable electrical grid would prefer to do so, versus generating the electricity themselves.

Do hurdles remain? Sure. One is performance. If your application is characterized by relatively complex transactions that must complete (from the user's perspective) in less than one second, can an in-the-cloud server infrastructure deliver the goods? What about quality of service? If you have service level agreements (SLAs) with your customers, can a cloud computing provider deliver the quality to you that will enable you to deliver the level of service quality to which you've committed? Then there's data protection. Storing e-mails in the cloud is one thing. How about data pertaining to your customer accounts or to information that you consider to be intellectual property? Can a cloud computing provider lock that down to the extent that you can with your own server infrastructure?

My feeling is that these challenges will be surmounted by cloud computing providers (and indeed, they already have been to the satisfaction of some organizations, at least in regard to particular applications that are already hosted in clouds), leading to accelerated use of cloud computing services. What does that mean to DB2 people? Well, to a lot of DB2 people it won't mean much, because a whole lot of large legacy applications built on DB2 will continue to be hosted by the organizations that developed and own them - newer applications, I believe, are more likely than legacy apps to be thought of as cloud candidates, at least for the next few years. As for those newer applications, it will be interesting to see what cloud computing providers offer, and what potential cloud consumers want. Perhaps some organizations will develop cloud-based apps using non-DBMS-specific data request interfaces (e.g., JDBC) and will let the cloud computing provider make the DBMS choice. In that case, I'm thinking that DB2 should be well positioned as a DBMS-in-the-cloud, given its core strengths with regard to reliability, availability, manageability, and scalability - all traits that should appeal to companies that aim to provide compute power along the lines of a utility (and don't forget DB2's best-of-breed data compression capabilities). To the extent that cloud computing providers choose DB2 (when the DBMS choice is theirs), it could be that a lot of DB2 DBAs and systems programmers end up working for the cloud outfits.

Other organizations might specifically request the use of DB2 in a cloud-based server infrastructure. In such cases, the cloud computing consuming organization might employ its DB2 people pretty much as it does now - it's just that these people would work with servers (probably virtual machines) that are physically located on the premises of another company (and that shouldn't be a big deal - I know lots of DB2 people who work with servers located hundreds of miles away from their work location).

In any case, DB2 folks shouldn't see cloud computing as a threat. Sure, it's a different paradigm, but cloud computing could very well end up being an environment in which DB2 (and the people who work with it) thrive. Change is opportunity, if you're ready for it.

Note: as pointed out by Leon Katsnelson in a recent entry on his FreeDB2 Web site, IBM decided to partner with Morph Labs to make a cloud-hosted DB2 system available to people interested in taking on "The XML Challenge."