Robert's Blog

Wednesday, November 5, 2008

A Couple of Oldie-But-Goodie DB2 Tablespace Questions

Some DB2 for z/OS questions first asked years ago came to be moot due to advances in product technology and functionality. It used to be, for example, that people wanted to know how to reduce index page lock contention, but that was before type 2 indexes (delivered in DB2 V4) eliminated locking on index pages. Similarly, folks used to look for ways to get more than 64 GB of data into a table before that limit went to 16 TB and then to 128 TB (this for a non-LOB table - the size limit is staggeringly large for a table with lots of LOB columns).

On the other hand, there are some questions that were good 15 or more years ago and are still good today. Two that fall into that category were put to me a few weeks ago:
  1. How many tables should be created in a given tablespace?
  2. Which tables in a database should be partitioned?
The first of these questions is, I think, the more interesting of the two. Consider some of the factors that can influence decisions regarding the ratio of tables to tablespaces:
  • Size matters, and sometimes it's all that matters. If a table is going to occupy more than 64 GB of space on disk, the associated tablespace is going to have to be partitioned, and that means that only the one table can be assigned to the tablespace.
  • Production versus non-production databases. Generally speaking, non-production DB2 environments (i.e., those used for test and development purposes) differ from their production counterparts in that they have more tables (there will often be several different versions of each table, in different schemas used at different stages of an application's development) and fewer rows (tables in a performance test database may be close to production-sized, but other non-production tables may have row counts that are 75% or more below production levels). Both these characteristics of non-production DB2 systems - more tables, and smaller tables - make a larger tables-to-tablespaces ratio attractive from a database administration perspective. That's what segmented tablespaces are for.
  • DBD contention, or lack thereof. Every DB2 database (and here I'm using the term "database" in the technical DB2 for z/OS sense, versus the generic notion of a set of logically-related tables) has associated with it a control block (a chunk of system-used information) called a database descriptor, or DBD. Often, we don't think much about DBDs, but some applications are characterized by considerable DDL (data-definition language) activity - related, perhaps, to dynamic view creation for data security purposes - and dynamic (versus static) SQL (particularly common in a data warehouse/business intelligence system). In that kind of environment, one needs to give some thought to the potential for DBD contention associated with concurrently active DDL operations and (sometimes) with dynamic SQL DML (data manipulation language) statements executing at the same time as DDL statements. This might lead you to want more databases (in the DB2 for z/OS technical sense of the word) in your database (generic term). That, in turn, could mean more tablespaces (a given tablespace is associated with one and only one DB2 database, again using the term "database" in the DB2 for z/OS technical sense), and if you have more tablespaces for the same number of tables then of course you'll have fewer tables per tablespace.
  • Utilities. Keep in mind that DB2 utilities tend to operate at the tablespace level. Depending on your situation, that could cause you to want a higher tables-to-tablespaces ratio (nice to back up a bunch of tables by image copying one tablespace) or a lower ratio (you might need the ability to recover an individual table to a prior point in time while leaving data in other tables at currency).
Now, how about the partitioning question? Some things to keep in mind in this regard:
  • Again, size matters. As noted above, you have to partition a table if it's going to hold more than 64 GB of data. That said, partitioning can make lots of sense for tables that are merely large (say, 500 MB or more), as opposed to being huge. Some utilities can operate at the partition level, and it can be beneficial to have the ability to reorganize data (for example) in a large table a partition at a time, rather than REORGing the entire tablespace. The same goes for backup and recovery.
  • Parallelism. While it's not impossible to get some DB2 for z/OS query-splitting activity when tables are not partitioned, parallelism is, to a very significant degree, driven by partitioning: the more partitions you have, the more parallel query processing you're likely to get (assuming that you have packages bound with DEGREE(ANY) and/or you set the value of the CURRENT DEGREE special register to ANY for dynamic SQL statements). Keep in mind that query parallelism isn't just for data warehouse workloads. It can also substantially reduce run times for read-intensive batch jobs running in operational (i.e., non-BI) systems. Also keep in mind that DB2 query parallelism is a great way to utilize the processing capacity of zIIP engines on a mainframe.
More good news on the partitioning front: the new universal tablespace introduced with DB2 for z/OS V9 gives you all the data size and utility-granularity and query-parallelism benefits of traditional partitioned tablespaces, with the quick mass-delete and good "insert-into-the-middle" performance associated with segmented tablespaces.

More or fewer tables per tablespace? Partition or don't partition? Still good questions, these - and they are questions that don't have one-size-fits-all answers. Know your options, think about the needs of your organization (and your own priorities from a database administration perspective), and make the decisions that are right for your environment.


Post a Comment

Subscribe to Post Comments [Atom]

<< Home