DB2 for z/OS Physical Database Design: Changing, but Still Important
Last week I gave a presentation at a meeting of SIRDUG, a regional DB2 users group based in Charlotte, North Carolina. Also on the program was David Simpson, a Senior Technical Advisor with Themis Education. David is a reliably excellent presenter, and he certainly didn't disappoint on this occasion. His presentation topic, "Physical Database Design for the 21st Century," was of particular interest to me, and it made me think of the big changes that have occurred in the area of mainframe DB2 physical database design since the Version 1 days back in the 1980s.
There have been times during my 22 years of DB2-related work when I actually fretted about some product enhancements, worrying that physical database design - something on which I'd spent a lot of time - would become relatively unimportant, and wondering if people's need for the DB2 help that I could provide would diminish as a result. One of those worry-times came when type 2 indexes arrived with DB2 Version 4. No more index-level locking? No more need for my help in avoiding index lock contention! I also spent plenty of time, back in the day, painstakingly placing DB2 tablespace and index data sets on specific disk volumes to minimize I/O contention. That stopped when people started associating these data sets with DB2 STOGROUPs defined with VOLUMES('*') - effectively letting the operating system handle data set placement.
Thankfully, my concerns were misplaced, for as DB2 for z/OS enhancements eliminated some physical database design challenges that had formerly consumed much of my time, other new features created new opportunities to optimize database performance and availability at the physical level. David highlighted a number of these modern DB2 physical database design-related features in the course of his presentation. One example: the elimination in DB2 V9 of the "BUILD2" phase of online REORG. In prior releases of DB2, an online REORG of a subset of a partitioned tablespace's partitions included a phase, called BUILD2, in which the RIDs (row IDs) of rows relocated by the REORG were updated in the non-partitioned indexes defined on the table. During BUILD2, the affected index entries were unavailable. That made data in the REORGed partitions unavailable to operations (such as INSERT) that would modify those index entries, and to operations that accessed table rows by way of those index entries. BUILD2 made online REORG somewhat less than ideally online for some DB2-using organizations. With DB2 V9, BUILD2 goes away because a REORG of even a subset of a partitioned tablespace's partitions will result in a REORG of all the underlying table's non-partitioned indexes in their entirety. This change, while definitely a boost for data availability, will cause DBAs to think again about the way they index partitioned tables and how they manage the organization of data in those tables.
David also talked about the variety of table-partitioning schemes available in the DB2 for z/OS V9 environment, reminding us that not so long ago, one spoke simply of a partitioned table, with no need to qualify the word "partitioned." With DB2 V9, if you mention a partitioned table, I'll ask you to be more specific. Do you mean an index-controlled range-partitioned table? A table-controlled range-partitioned table? A partition-by-range table in a universal tablespace? A partition-by-growth table (another flavor of universal tablespace)? David made a very interesting point about partition-by-growth (PBG) tables (about which I blogged last week): would that not be a good choice even for small tables? I've never thought of partitioning a small table before, but David's question was a very good one. What's the downside to having a small PBG table that ends up occupying only one partition?
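For readers who haven't yet created a universal tablespace, here is a minimal sketch of how the two flavors differ in DDL. The database and tablespace names are hypothetical, and the MAXPARTITIONS, NUMPARTS, and SEGSIZE values are arbitrary choices for illustration, not recommendations:

```sql
-- Hypothetical database and tablespace names throughout.

-- Partition-by-growth: MAXPARTITIONS caps how many partitions DB2 may add.
-- A small table would simply stay in its first partition.
CREATE TABLESPACE TSSMALL IN DBAPP01
  MAXPARTITIONS 1
  SEGSIZE 32;

-- Partition-by-range universal tablespace: NUMPARTS together with SEGSIZE.
CREATE TABLESPACE TSRANGE IN DBAPP01
  NUMPARTS 4
  SEGSIZE 32;
```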
The index compression feature of DB2 V9 is another of the new physical database design considerations that David covered in his presentation. Unlike tablespace compression, index compression is not dictionary-based. Instead, it functions on the basis of DB2's ability to squeeze index entries from 8K, 16K, or 32K index pages (these index page sizes are also new with DB2 V9) into 4K pages on disk. To help DBAs make the best choice of index page size for the purpose of index compression, the DB2 utility DSN1COMP has been enhanced to provide estimates of index compression space savings based on different page sizes.
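As a rough sketch of what this looks like in DDL, a compressed index is assigned to one of the larger-page buffer pools and created with the COMPRESS YES clause. The table, column, and index names here are hypothetical, and BP16K0 is just one possible choice of buffer pool; the DSN1COMP estimates would be the basis for picking the page size in practice:

```sql
-- Hypothetical table, column, and index names.
CREATE INDEX IX_ORDHIST_CUST
  ON ORDER_HIST (CUST_ID, ORDER_DATE)
  BUFFERPOOL BP16K0   -- index pages are 16K in the buffer pool
  COMPRESS YES;       -- index pages are stored compressed on disk
```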
David covered the new DB2 V9 index-on-expression capability, as well. This feature can be used to make a predicate like WHERE UPPER(LASTNAME, 'En_US') = 'SMITH' stage 1 and indexable: you just create an index on UPPER(LASTNAME, 'En_US') on the table containing the LASTNAME column. Voilà: names are stored in mixed case in the column, but they can be efficiently searched using an all-uppercase comparison value.
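A minimal sketch of the idea, assuming a hypothetical EMP table with EMPNO and LASTNAME columns (the table, the EMPNO column, and the index name are made up for illustration). The key point is that the expression in the index definition matches the expression in the predicate:

```sql
-- Hypothetical table and index names.
-- The indexed expression is the same one used in the predicate below.
CREATE INDEX IX_EMP_UPPER_LNAME
  ON EMP (UPPER(LASTNAME, 'En_US'));

-- LASTNAME is stored in mixed case; the search value is all uppercase.
SELECT EMPNO, LASTNAME
  FROM EMP
 WHERE UPPER(LASTNAME, 'En_US') = 'SMITH';
```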
And there was more: XML tablespaces, reordered row format, histogram statistics. I recommend that you attend David's presentation at the IDUG North American Conference in May if you want to get the whole scoop. My main take-away was this: while new DB2 features and functions continue to make formerly time-consuming challenges non-issues, in no way does this cause a DBA's role to have ever-diminishing importance. Quite the contrary: a modern DB2 DBA can deliver more value to an organization - even in a seemingly "old-school" area such as physical database design. Recent product enhancements will enable companies to reach new heights of DB2 availability, scalability, and performance, but realizing these potential benefits will depend on a DBA's ability to recognize opportunities and to effectively put new DB2 capabilities to work. Knowledge that was once valuable to a DBA may be unimportant in a contemporary context, but there are always new things to learn, and new opportunities to turn DB2 knowledge into business value. That, more than anything, is what I like about this business.
3 Comments:
Robert,
Good article. This is a topic that I have been struggling with quite a bit lately, especially in regard to XML data. Non-XML partitioned tablespaces allowed you to assign partitions to specific volumes through a STOGROUP, allowing you to move 'less active' data to slower devices. The ability to alter the STOGROUP on a partition, along with IDCAMS commands, made this a relatively quick process. With XML tablespace limitations, this doesn't appear to be a viable option for XML data (or at least not an easy one), yet the size and importance of XML data still make it an important need. Have you seen any good writeups on this topic, or is there maybe a place you can point me to?
- Roland.DePratti@cigna.com
Thanks for the positive feedback on the post, Roland (and apologies for the delayed response).
The DB2 9 for z/OS XML Guide indicates, in Chapter 2, on page 21, that you can alter the STOGROUP attribute of the XML tablespace that is implicitly created when you create a table containing an XML column (you can access this manual in HTML or PDF form at the "DB2 Library" page on IBM's Web site: http://www-01.ibm.com/support/docview.wss?rs=64&uid=swg27011656).
If the implicitly-created XML tablespace is a partition-by-range universal tablespace, as it will be if the base table is in either a "classic" partitioned tablespace or a partition-by-range universal tablespace, you should be able to alter the STOGROUP attribute at the partition level.
nice post