DB2 for z/OS Physical Database Design: Changing, but Still Important
Last week I gave a presentation at a meeting of SIRDUG, a regional DB2 users group based in Charlotte, North Carolina. Also on the program was David Simpson, a Senior Technical Advisor with Themis Education. David is a reliably excellent presenter, and he certainly didn't disappoint on this occasion. His presentation topic, "Physical Database Design for the 21rst Century," was of particular interest to me, and it made me think of the big changes that have occurred in the area of mainframe DB2 physical database design since the Version 1 days back in the 1980s. There have been times during my 22 years of DB2-related work when I actually fretted about some product enhancements, worrying that physical database design - something on which I'd spent a lot of time - would become relatively unimportant, and wondering if people's need for the DB2 help that I could provide would diminish as a result. One of those worry-times came when type 2 indexes came along with DB2 Version 4. No more index-level locking? No more need for my help in avoiding index lock contention! I also spent plenty of time, back in the day, painstakingly placing DB2 tablespace and index data sets on specific disk volumes to minimize I/O contention. That stopped when people started associating these data sets with DB2 STOGROUPs defined with VOLUMES('*') - effectively letting the operating system handle data set placement.
Thankfully, my concerns were misplaced, for as DB2 for z/OS enhancements eliminated some physical database design challenges that had formerly consumed much of my time, other new features created new opportunities to optimize database performance and availability at the physical level. David highlighted a number of these modern DB2 physical database design-related features in the course of his presentation. One example: the elimination in DB2 V9 of the "BUILD2" phase of online REORG. In prior releases of DB2, an online REORG of a subset of a partitioned tablespace's partitions included a phase, called BUILD2, in which RIDs (row IDs) of relocated rows (this as a result of the REORG) were updated accordingly in non-partitioning indexes defined on the table. During BUILD2, the affected index entries are unavailable, and this had the effect of making data in the REORGed partitions unavailable with respect to operations (such as INSERT) that would modify index entries, and operations involving access to table rows by way of the index entries. BUILD2 made online REORG somewhat less than ideally online for some DB2-using organizations. With DB2 V9, BUILD2 goes away because a REORG of even a subset of a partitioned tablespace's partitions will result in a REORG of all the underlying table's non-partitioned indexes in their entirety. This change, while definitely a boost for data availability, will cause DBAs to think again about the way they index partitioned tables and how they manage the organization of data in those tables.
David also talked about the variety of table-partitioning schemes available in the DB2 for z/OS V9 environment, reminding us that not so long ago, one spoke simply of a partitioned table, with no need to qualify the word "partitioned." With DB2 V9, if you mention a partitioned table, I'll ask you to be more specific. Do you mean an index-controlled range-partitioned table? A table-controlled range-partitioned table? A partition-by-range table in a universal tablespace? A partition-by-growth table (another flavor of universal tablespace)? David made a very interesting point about partition-by-growth (PBG) tables (about which I blogged last week): would that not be a good choice even for small tables? I've never thought of partitioning a small table before, but David's question was a very good one. What's the down side to having a small PBG table that ends up occupying only one partition.
The index compression feature of DB2 V9 is another of the new physical database design considerations that David covered in his presentation. Unlike tablespace compression, index compression is not dictionary-based. Instead, it functions on the basis of DB2's ability to squeeze index entries from 8K, 16K, or 32K index pages (these index page sizes are also new with DB2 V9) into 4K pages on disk. To help DBAs make the best choice of index page size for the purpose of index compression, the DB2 utility DSN1COMP has been enhanced to provide estimates of index compression space savings based on different page sizes.
David covered the new DB2 V9 index-on-expression capability, as well. This feature can be used to make a predicate like WHERE UPPER (LASTNAME, 'En_US') = 'SMITH' stage 1 and indexable: you just create an index on UPPER(LASTNAME, 'En_US') on the table containing the LASTNAME column. Voila: names are stored in mixed case in the column, but they can be efficiently searched using an all-uppercase comparison value.
And there was more: XML tablespaces, reordered row format, histogram statistics. I recommend that you attend David's presentation at the IDUG North American Conference in May if you want to get the whole scoop. My main take-away was this: while new DB2 features and functions continue to make formerly time-consuming challenges non-issues, in no way does this cause a DBA's role to have ever-diminishing importance. Quite the contrary: a modern DB2 DBA can deliver more value to an organization - even in a seemingly "old-school" area such as physical database design. Recent product enhancements will enable companies to reach news heights of DB2 availability, scalability, and performance, but realizing these potential benefits will depend on a DBA's ability to recognize opportunities and to effectively put new DB2 capabilities to work. Knowledge that was once valuable to a DBA may be unimportant in a contemporary context, but there are always new things to learn, and new opportunities to turn DB2 knowledge into business value. That, more than anything, is what I like about this business.
DB2 Partition-By-Growth Tables: Very Nice, But Not For Everything
Not long ago, I was doing some work at a site that had DB2 for z/OS V8 running in production (as is true, I think, of most mainframe DB2-using enterprises at present - with a tip of the hat to those organizations that have installed or migrated to V9). A DBA came by with an idea for table partitioning that he wanted to bounce off me. What he wanted to achieve by way of this scheme was minimization of long-term administrative hassle for a large table. His proposal: define the table with an identity column (with the default start-at-1 and increment-by-1 options in effect), and specify that identity column as the partitioning key (this would utilize the table-controlled - as opposed to index-controlled - partitioning capability introduced with DB2 V8). He would decide ahead of time how many rows he wanted in a partition and specify partition limit key values accordingly. He'd set up an initial number of partitions for the table (enough for several weeks' worth of inserts, or maybe a few months), and he would periodically add more partitions (with progressively greater limit key values), via ALTER TABLE ADD PARTITION, to stay well ahead of the advancing insert activity. He even talked of automating this periodic partition-add process. This proposed set-up would allow the table to become very large (resting easy with the knowledge that the table could have up to 4096 partitions, and that the identity column/partitioning key value could go up to one less than 10 to the 31rst power), with populated partitions of about equal size (good), and with the benefit of partition-level utility operations - all in all, about as close to a set-it-and-forget-it approach as you could have for a big table in a DB2 V8 environment.
My initial response to the idea was, "Too bad you guys aren't yet on DB2 Version 9. Partition-by-growth would take care of all that for you." The DBA's eyes widened a little. "Partition by what?" he asked. I went on to explain that he was not alone in wanting a really easy way (administratively speaking) to handle really big tables, and that IBM had responded to that recognized need with a new type of table organization that reduces DBA oversight requirements even more than would his partition-by-identity plan: you just tell DB2 how large you want a table's partitions to be (via the DSSIZE specification for the associated tablespace), and that's that. There's one partition to begin with, and when that one gets full DB2 defines another one and starts using it, and this continues as needed to accommodate INSERT and/or LOAD activity (the MAXPARTITIONS value for the tablespace sets the limit on the number of partitions that can be defined for the table). No muss, no fuss - you get an object that's as easy to manage as a segmented tablespace (with the favorable space management characteristics of segmented, since a partition-by-growth table uses the new universal tablespace technology delivered with DB2 V9), with the advantage of very large size (a segmented tablespace can't grow beyond 64 GB) and partition-level utility operations (except for LOAD, which has to run at the tablespace level for a partition-by-growth table).
So, plenty of DB2 for z/OS DBAs are keen (or will be) to use partition-by-growth (PBG) tables. I don't want to rain on that parade, but you should consider that PBG is not the right solution for every large-table situation. In particular, in a data warehouse environment or otherwise when a large table's rows will be often retrieved in large bunches (versus singly or in small sets, as is often true for online transaction processing systems), a goal of performance optimization will likely lead toward the use of partition-by-range (the other flavor of DB2 V9 universal tablespace) for large tables. Here's why:
So, by all means, look for opportunities to make good use of DB2 V9 partition-by-growth tables, but keep in mind that there could be situations in which you'd want to expend more more physical database design and administration effort in return for query performance improvements and efficiency gains. In such cases, partition-by-range will likely be your partitioning solution of choice. Simple is good, but more sophisticated approaches to large-table management have their place in the DB2 DBA tool kit, as well.
- Partition-by-range (PBR) tables enable multi-dimensional clustering. Yes, I know - multidimensional clustering is, in the strictest sense of the term, a feature of DB2 for Linux, UNIX, and Windows (LUW); however, a similar effect can be achieved with PBR tables (and indeed, with DB2 for z/OS V8 table-controlled partitioning) by partitioning on one key (i.e., one data dimension) and clustering on another. So, for example, a table could be partitioned by date and clustered within partitions by customer ID. This type of arrangement can lead to both query parallelism (a user might want data for a given customer ID over the past 6 months, and if each partition holds data for a particular week, the target rows will be spread over 20-plus partitions) and locality of reference (within partitions, the rows for the specified customer ID will be located near each other). That can be a powerful performance combination, and it's not possible with PBG tables because for such tables there is no notion of a partitioning key.
- PBR tables allow for data rotation. Suppose you want to keep a year's worth of data in a table, and you want to partition that data by week, but you don't want more than 53 partitions (we tend to think of a year as being 52 weeks, but of course fifty-two 7-day periods doesn't quite get you there). With a PBR table (and again, with DB2 V8 table-controlled partitioning), you can use actual date values as partition key limits and keep a rolling 53 weeks of data by taking advantage of ALTER TABLE ROTATE PARTITION functionality. Again, can't do that with PBG.
- PBR tables enable page-range screening. When a query predicate references the limit key of a PBR table, DB2 can avoid accessing partitions that would not contain qualifying rows - a potentially huge performance benefit for SELECTs that target large tables (page-range screening, also known as limited partition scan, has been around for a while, but DB2 for z/OS V9 further leverages this optimization technique by extending its use to table join predicates and to non-matching page-range predicates).
DB2 for z/OS: Static SQL and the EDM Pool
A lot of people responsible for the management of DB2 for z/OS systems regularly look over DB2 monitor statistics reports or online displays to verify that all is well or to identify (and subsequently resolve) anything that is amiss. This is good. Some of these folks are a little confused by what they see in the part in these reports pertaining to the EDM pool (long name: environmental descriptor manager pool), particularly the fields related to static SQL execution. This is not good. If you're going to make the best use of DB2 statistics trace data (nicely formatted by your monitor product), you need to understand the activity behind the numbers. If SKPTs and CTs and PTs and related hit ratios seem rather opaque to you, relax - you're not alone. I hope that the information provided in this blog entry will clear things up for you.
[Note: for consistency's sake, when referring to a program bound directly into a plan I'm going to assume that said plan was bound with ACQUIRE(USE) as opposed to ACQUIRE(ALLOCATE). ACQUIRE(USE) is the default for BIND PLAN. ACQUIRE is not an option for BIND PACKAGE - all packages function as though bound with ACQUIRE(USE).]
OK, so you start with a program that contains embedded (aka static) SQL statements. You precompile this program, and one of the outputs of that operation is a DBRM (database request module). For the program to function properly (from a DB2 perspective), the SQL statements contained in the DBRM have to be turned into a "control structure" that DB2 will use at statement execution time. This is accomplished by way of the BIND command. If the DBRM is bound directly into a DB2 plan (quite uncommon these days), the aforementioned control structure will be placed in a table in the DB2 directory (database DSNDB01) called the skeleton cursor table. If the DBRM is bound into a package (the norm), the resultant control structure goes into the skeleton package table, also in the directory.
The static SQL-containing program, having been precompiled, is subsequently compiled and linked and is ready to run. At execution time, it "issues" (in effect) an SQL statement by making a call to DB2 which includes the program name and a statement number and an identifier called a consistency token (these calls are inserted into the source code at precompile time, and the consistency token is also inserted into the DBRM and the bind-produced control structure). DB2 will respond to this call by looking in the EDM pool in the database services address space (aka DBM1) to see if the requisite sections of the control structure are there (some control structure sections contain header and directory information, and other sections correspond to SQL statements to be executed). Where DB2 looks is in the part of the EDM pool used for SKPTs (skeleton package table sections, if we're talking about a package) or SKCTs (skeleton cursor table sections, for programs bound directly into a plan). If the needed section or sections are not found, they are loaded into the EDM pool from the DB2 directory (via buffer pool BP0). In addition, copies of the sections are placed in the part of the EDM pool used for PTs (package table sections) or CTs (cursor table sections). Basically, the PT part of the EDM pool holds copies of control structure sections that are in the SKPT part of the EDM pool, with the PT copies being specific to a thread through which the static SQL-containing program is executing (the thread is the program's connection to the DB2 for z/OS subsystem). This also describes the relationship between the SKCT and CT parts of the EDM pool.
DB2 uses 4-KB chunks of EDM pool space for package and plan control structure sections. When a static SQL-containing program commits a unit of work, PT or CT pages that it has used are released in the EDM pool, thereby making them stealable, if the package or plan was bound with RELEASE(COMMIT). If the package or plan was bound with RELEASE(DEALLOCATE), the associated PT or CT pages in the EDM pool will remain non-stealable until the thread through which the program was executed is deallocated (i.e., terminated). An SKPT or SKCT page in the EDM pool will become stealable when a DB2-maintained "use count" for the SKPT or SKCT section in the page reaches zero (this use count is decremented each time the section is copied to the PT or CT part of the EDM pool on behalf of a thread).
On now to the DB2 monitor statistics report (and your monitor probably allows you to view the same information in an online display of subsystem-level information). You will see in the EDM pool section of the report an indication of the number of pages in the pool that hold SKPT and SKCT sections, and the number of pages that hold thread-related copies of package table (PT) and cursor table (CT) sections (DBDs, or database descriptors, also go into the EDM pool, but starting with DB2 for z/OS V8 these are loaded into a separate part of the pool above the 2 GB "bar" in the DBM1 address space). Additionally, you will see the number of free pages in the pool. These are pages that weren't holding anything when the data on which the report (or online display) is based was generated. [This and the other values mentioned thus far are "snapshot" (i.e., point-in-time) values.] Generally speaking, you shouldn't worry too much if the number of free pages is a relatively small percentage of the pages in the EDM pool (e.g., 10-20%), because in most cases a majority of the "in-use" pages are stealable (note that if your monitor reports a percentage of non-stealable pages in use, this probably applies only to non-stealable pages that hold PT or CT sections, as opposed to non-stealable SKPT and SKCT sections).
A little further down in the report, you'll see the hit ratios that pertain to requests for package and cursor table sections. A "not found" situation occurs when a section of a package or plan needed for the execution of a static SQL statement embedded in a running program is not found in the part of the EDM pool that holds sections of SKPTs (for packages) or SKCTs (for programs bound directly into plans). When that happens, the required section is loaded into the SKPT or SKCT part of the EDM pool from the skeleton package table or skeleton cursor table in the DB2 directory. That section is also copied to the PT or CT part of the EDM pool for the requesting thread (that being the thread through which the program is being executed). Because most programs at most DB2 for z/OS sites are bound into packages as opposed to being bound directly into plans, it is likely that you will see many more PT requests than CT requests (and more pages "held by PT" versus "held by CT"). Unless your EDM pool is pretty small (and these days, I tend to see pools that have 100-200 MB or more of space for SKPTs, SKCTs, PTs and CTs), your PT hit ratio (PT requests minus PT not found occurrences, divided by PT requests) will probably be very high (98% or more is not unusual). That's good (throughput is slightly impeded when package or plan sections have to be read into the EDM pool from the directory on disk). If the volume of CT requests on your system is low, you might see a relatively low CT hit ratio (e.g., under 70%). Is that bad? Probably not. Obviously, the first time a static SQL-containing program is executed after a DB2 start-up, a PT or CT "not found" situation will result because the needed section or sections have not yet been placed in the SKPT or SKCT part of the EDM pool (they're loaded on-demand). If programs bound directly into plans are rarely executed, there will be little in the way of repeated requests for CT sections that have previously been loaded into the SKCT part of the EDM pool, and since those "re-requests" are what drive a hit ratio higher, the CT hit ratio might be low. If it is low, don't worry about it if the CT request activity level is small compared to the number of PT requests on your system.
Finally, a word on "FAILS DUE TO POOL FULL." This happens when a requested package or plan section cannot be loaded into the EDM pool because the pool was full (the result is a program abend). That means no free pages and no stealable in-use pages. This tends to be quite a rare occurrence. When it does happen, it can caused by packages bound with RELEASE(DEALLOCATE) being executed via threads that persist for a long time before terminating. CICS-DB2 protected threads can be of this variety, persisting for hours before terminating thanks to thread re-use. If not many programs are bound with RELEASE(DEALLOCATE), the impact on the number of non-stealable PT (package section-holding) pages shouldn't be too significant. If lots of packages are bound with RELEASE(DEALLOCATE) and they are executed by threads that stick around for a long time, the part of the EDM pool used for PT sections can grow substantially, and lot of this space could be non-stealable. You can use RELEASE(DEALLOCATE) when binding programs (it provides a CPU efficiency benefit, especially for "small" programs that are executed by way of persistent threads), but try not to overdo it: the best candidates are programs that are frequently executed and are associated with threads that persist through commits. You can find out more about RELEASE(DEALLOCATE) versus RELEASE(COMMIT) in a blog entry I posted a few weeks ago.
I hope now that SKPT and SKCT and PT and CT are not abstract letter-combinations for you. Knowing what they're all about, you can be that much better at monitoring the health of your DB2 for z/OS subsystem.
Another Advantage of DB2 for z/OS V9 Native SQL Procedures
In an entry posted to this blog last November, I wrote about some of the benefits of using native SQL procedures (which one can create and deploy with DB2 for z/OS V9 in New Function Mode) versus external SQL procedures (these get turned into C language executables under the covers via the program preparation process). In particular, I focused on reduced computing costs (native SQL procedures are more CPU-efficient than external SQL procedures, and they can utilize zIIP engine cycles when invoked through the DB2 Distributed Data Facility) and enhanced functionality (native SQL procedures support nested compound statements which, among other things, can enable more-sophisticated condition-handling logic). Recently I've been impressed by yet another advantage of native SQL procedures: they are easier to create and manage than their external counterparts (though there is an "eye of the beholder" aspect of this assessment, as I'll explain momentarily).
Because an external SQL procedure will end up becoming (as previously mentioned) a C language program, resources outside of DB2 are involved in the program preparation process. At many mainframe DB2 sites, people use DSNTPSMP, a REXX language stored procedure that comes with DB2, to prepare external SQL procedures for execution. DSNTPSMP handles the necessary create, precompile, compile, link edit, and bind tasks, and doing those things requires access to work file, source code, DBRM, and load module data sets that are a) outside of DB2 and b) protected by RACF (or by an equivalent mainframe-based security management product). By design, DSNTPSMP is created (as part of a DB2 subsystem installation or migration) with a SECURITY specification of USER (SECURITY, one of the options on a CREATE PROCEDURE statement, indicates how access by the stored procedure to resources outside of DB2 is to be handled). This means that the authorization ID of a user who invokes DSNTPSMP has to be permitted by RACF (or equivalent) to access the aforementioned non-DB2 resources. Users invoking DSNTPSMP must also have the right level of DB2 authorization, including the EXECUTE privilege on DSNTPSMP and access privileges to some DB2 catalog tables (e.g., SELECT on SYSROUTINES and SELECT, INSERT, UPDATE, and DELETE on SYSROUTINES_SRC). Managing these authorizations can be simplified through the use of secondary authorization IDs (i.e., granting privileges to a RACF - or equivalent - group and then connecting individual IDs to said group), but for secondary auth IDs to work correctly any and all libraries containing DB2 authorization exits have to be identified in STEPLIB in the JCL of the WLM environment in which DSNTPSMP runs, and sometimes people overlook that.
The rather involved nature of external SQL procedure preparation is in contrast to the straightforward preparation of a native SQL procedure. External (to DB2) resources are not required to prepare or to execute a native SQL procedure. No load libraries: the prepared procedure is just a package (i.e., a DB2 "runtime structure" based on the SQL statements to be executed), and it's stored in the DB2 directory. No WLM application execution environment: when a native SQL procedure is called it executes in the DB2 database services address space (aka DBM1). This lack of external dependencies makes lifecycle processes simpler for native SQL procedures relative to external procedures, and in my book simple is generally good.
Now, about the "eye of the beholder" aspect of my characterization of lifecycle processes (create, maintain, etc.) as being simpler for native SQL procedures than for external SQL procedures: it's true that external procedure preparation and management is more complex for external SQL procedures, but once these processes and the attendant infrastructure are established, the environment should be stable and straightforward from an administrative perspective - as many users of DB2 for z/OS external SQL procedures will attest. I would encourage people who are currently using external SQL procedures to use native SQL procedures once they have DB2 for z/OS V9 running in New Function Mode, but that should be a deliberate move as opposed to a pell-mell rush. I like the approach advocated by Tom Miller, Senior Software Engineer at IBM's Silicon Valley Lab and a leader in the area of DB2 for z/OS stored procedure functionality. Tom's advice for current users of external SQL procedures who are looking to use DB2 V9 native SQL procedures:
External SQL procedures are good. Native SQL procedures are better. Users of the former should prepare to upgrade to the latter. Your investment of time in planning and testing will set the stage for SQL procedures that are more efficient, more flexible, and easier to manage.
- Get familiar and comfortable with the different lifecycle processes associated with internal versus external SQL procedures.
- Consider converting external SQL procedures to native SQL procedures when upgrading existing DB2 applications.
- As appropriate, choose native SQL procedures for new DB2 application development.