Robert's Blog


Sunday, September 20, 2009

Of DB2 for z/OS Indexes, PartitioneDUH and Otherwise

Last week I taught a DB2 for z/OS database administration class for an organization that fairly recently completed the migration of their production environment to DB2 V8 in Conversion Mode. That being the case, a good bit of class time was spent in discussion of new features and functions introduced with DB2 V8. One topic of particular interest to the class attendees was table-controlled partitioning. Our conversations on this subject became rather amusing, for reasons I'll describe momentarily. First, a little background.

For years and years, there was only one way to partition a mainframe DB2 table: you created a tablespace with the NUMPARTS clause (through which you specified the number of partitions into which a table's rows would be divided), created a table in the tablespace, and created on the table a partitioning index. The index controlled the partitioning of the data, as it was at this level that you defined the partitioning key (comprised of one or more columns) and the limit values for each partition (e.g., with a partitioning key of CUSTOMER_NUMBER, partition 1 might hold rows with a CUSTOMER_NUMBER value of '05000000' or less, while partition 2 would hold rows with CUSTOMER_NUMBER values larger than '0500000' and less than '1000000', and so on).

The table-controlled partitioning feature of DB2 V8 was a big improvement over index-controlled partitioning, largely because it provided users with the ability to partition data with one key while clustering data within partitions by way of another key. With index-controlled partitioning, the partitioning index was also the clustering index, like it or not. With table-controlled partitioning, assignment of rows to partitions based on a key value is a table-specified thing, and clustering is an index-defined thing. So, if you want rows to be divided among partitions by date (perhaps one month per partition), you specify that on the CREATE TABLE statement, and if you want rows within a partition (i.e., for a given month) to be ordered by CUSTOMER_NUMBER, you accomplish that by defining an index, with the CLUSTER clause, on the CUSTOMER_NUMBER column of the table. That kind of set-up is often ideal for a data warehouse database, allowing a query searching across several months to be parallelized by DB2, while providing good locality of reference for the split queries within partitions if a certain customer number or set of customer numbers is targeted.

Table-controlled partitioning also delivered very useful features such as dynamic addition of a partition to a table, and a ROTATE PARTITION FIRST TO LAST option of ALTER TABLE that enables the use of a fixed number of partitions for, say, a rolling 52 weeks of data (every week, data in the "oldest week" partition is purged and archived, and the empty partition is adjusted to receive data for the upcoming week). Still, with all this goodness there comes the task of learning to speak of indexes on table-controlled partitioned tables in a new way. Formerly, a partitioned tablespace had one partitioning index - that being the one by which the partitioning key and partition limit key values were specified. The partitioning index was also the only one on the table that could itself be partitioned (i.e., with index entries spread across index partitions that matched up with the corresponding partitions of the tablespace). With a table-controlled partitioned table, any index that starts with the partitioning key column (or columns) is referred to as a partitioning index, and any index that is defined with the PARTITIONED clause (whether or not it starts with the partitioning key) will have its entries divided among index partitions that correspond to the table's partitions.

Thus, the amusing (to me) discussions in the aforementioned class about indexes defined on table-controlled partitioned tables: to make ourselves clear, we very strongly emphasized the last consonant sound of the type of index about which we were speaking, as in:

DBA: Is a partitioninGUH index necessarily partitioneDUH?

Me: No. If a partitioninGUH index is defined without the partitioneDUH clause, it will not be partitioneDUH.

DBA: Why would someone create a partitioninGUH index without making it partitioneDUH?

Me: Good question. I'd think that you'd always want a partitioninGUH index to be partitioneDUH.

This kind of exchange would crack me up (I'm easily amused). Anyway, the discussions were much appreciated on my part, as an interactive class is a fun class for me. It's also rewarding to help people explore the possible uses of new DB2 features (think about the new flexibility of partitioning in a DB2 V8 or V9 environment, especially in light of the fact that you can now specify up to 4096 partitions for a single table - talk about slicing and dicing). Note, if you're on or moving to DB2 for z/OS V9, that all of this partitioninGUH and partitioneDUH talk is relevant to the new partition-by-range universal tablespaces, but not to partition-by-growth universal tablespaces, as there is no sense of a partitioning key with respect to the latter. I wrote about partition-by-growth tablespaces in a previous entry, and you can check that out if you're interested in learning more about this new type of database object.

That's all for now. Thanks for stopping by the bloGUH.

2 Comments:

Anonymous Anonymous said...

Mr. Catterall , thank so very much for always interesting and educating articles !!!

September 21, 2009 at 7:59 AM  
Blogger Robert Catterall said...

My pleasure - thanks for the positive feedback!

September 21, 2009 at 8:19 PM  

Post a Comment

Subscribe to Post Comments [Atom]

<< Home