Catterall Consulting: DB2 Notes from Warsaw (Day 2)

A little rainy in the morning on this, the second day of the 2008 International DB2 Users Group European Conference, then just some high clouds after that. Following are a few interesting (to me, at least) nuggets of information from my day of technical sessions and hallway conversations (it's been said that some of the most useful knowledge that one gains from attending a conference such as this one comes from the so-called "coffee track"):

The rules of table partitioning really did change a lot with DB2 for z/OS Version 8. Kurt Struyf, a Senior Consultant with Competence Partners in Belgium, delivered a very good presentation on the table-based (versus index-based) partitioning feature that was delivered with DB2 for z/OS Version 8. He went over some of the benefits of table-based partitioning, such as the ability to partition a table's data using one key while clustering data within partitions using another key (with index-based partitioning, the clustering and partitioning indexes are one and the same). He talked about data-partitioned secondary indexes (DPSIs), noting that while they can be good for availability (enabling the avoidance of the BUILD2 phase of online REORG, which in some cases can result in a multi-minute loss of data availability when a subset of a tablespace's partitions are reorganized online), they do not provide an application performance advantage. Kurt also covered the increase in the number of partitions that can be defined for a tablespace (now up to 4096), the dynamic adding of a new partition to an existing partitioned tablespace, and the partition-swap capability that enables one to replace data in a partition so that what had been the first partiton now logically becomes the last (e.g., the partition that had held data for 1997 becomes the receptacle for 2008 data). The presentation also laid out nomeclature changes that came with DB2 V8: a partitioning index now is one that does not control partitioning, but which begins with the column (or columns) of the partitioning key of a table-based partitioned table (and a table-based partitioned table can have several partitioning indexes). Indexes defined on a table-based partitioned table that are not partitioning indexes are called secondary indexes, and these can be partitioned (with entries in separate files that correspond to the table's partitions) or not (Kurt recommended that partitioning indexes be made partitioned indexes, as well). The session concluded with a description of changes in the output of the DISPLAY DATABASE command that support table-based partitioned tables, and with examples of various recovery and reorganization scenarios involving table-based partitioned tables. Kurt's presentation, and others delivered here this week, will be available to IDUG premier-level members on IDUG's Web site (www.idug.org) within a couple of months, and will be available on the site to basic-level members nine months after that.

DB2 for z/OS dynamic statement caching is NOT just about avoiding the CPU cost of statement preparation. I had an interesting between-sessions discussion with Thomas Baumann, a long-time (and VERY knowledgeable) DB2 professional who works for Swiss Mobiliar, a large insurance company. Thomas described how Swiss Mobiliar uses the information in the dynamic statement cache (extracted via EXPLAIN STMTCACHE ALL and EXPLAIN STMTCACHE STMTID) to monitor and tune dynamic SQL statements that run on the Company's production DB2 for z/OS system, and it seemed to me that while dynamic statement caching was initially touted as a way to avoid the CPU cost of re-preparing previously prepared dynamic SQL statements, the primary benefit of the technology to Swiss Mobiliar is in the area of enhanced performance management of the dynamic SQL workload.

.NET is a great application server platform to pair with a DB2 database (whether DB2 for z/OS or DB2 for Linux/UNIX/Windows). Frank Petersen of Bankdata (a large supplier of IT services to 15 Danish banks) gave an excellent presentation on the various ways in which a .NET-based application can work very well together with a DB2 database. He compared and contrasted the Java and .NET application environments (both have their strong points), and described the ways in which a .NET program can access DB2 data (focusing primarily on the the ODBC, ADO.NET, and DB2 .NET data provider interfaces). Going above and beyond a presenter's call of duty, Frank passed out to all session attendees a CD-ROM containing the source of a simple .NET-DB2 application program he had written, thereby providing a great starting point for further exploration of this programming technology. Nick Manu, who works in Belgium for insurance giant AXA, was sitting next to me in this session, and he talked about the tremendous performance results he'd seen in working with DB2 as a back-end database for programs running on Linux/UNIX/Windows-based application servers.

What should DB2 professionals be doing with their time and talents? Jan Henderyckx, leader of the consulting firm BrainWare, delivered a thought-provoking presentation in which he urged DB2 professionals to focus on increasing the value that they deliver to their employing organizations by spending more of their time and creative energy on higher-value data-related activities, as opposed to just administering a DBMS. To this end, Jan suggested that people work to develop a catalog of the database services that should be made available within their respective enterprises, along with risk assessments associated with NOT having those services, together with potential organizational benefits that could be realized if the services were to be established. Among the services promoted by Jan were data governance (including not only data quality but data "lineage," which has to do with where data came from and how it got to where it is), data movement (which includes data replication), metadata management (emphasizing the need for metadata to deliver "actionability" - in other words, to actually be useful to people within the organization), and SOA (with an interesting description of the concept of an enterprise information platform). A challenge mentioned by Jan: getting to a unified approach to data service management and delivery can be complicated by that fact that at many companies various data-related roles are spread across multiple departments within an IT organization. He suggested that virtual teams of data professionals can be effective if the actual organizational structure can't be changed to bring data-related people together within a department.

The DB2 optimizer: 25 years old (more if you count its developmental years before the general availability of DB2) and getting better all the time. Terry Purcell, optimizer guru with the DB2 for z/OS development organization at IBM's Silicon Valley Lab, gave a presentation on recent enhancements to the DB2 for z/OS optimizer. He began by laying out IBM's ongoing priorities as they pertain to optimizer development: 1) improve access path selection (largely a matter of making SQL statement access path cost estimation ever more accurate and comprehensive), 2) provide more access path choices, and 3) provide better access path stability. Terry then reviewed some of the optimizer-related enhancements delivered with DB2 for z/OS V8 (including materialized query tables, improved star-join processing, backward index scans, and the ability to change a table's clustering index key), and some of the V8 SQL enhancements (such as upping the SQL statement size limitation to 2 MB, the ability to GROUP BY an expression, and support for common table expressions and recursive SQL). After that Terry covered DB2 V9 optimizer-related enhancements (examples include plan stability, which is basically the ability to back up and restore access paths; histogram statistics for SQL statement optimzation; and the ability to create an index on an expression), along with V9 SQL-related enhancements (e.g., the MERGE statement, also known as "upsert"; OLAP improvements; and the SQL extensions INTERSECT and EXCEPT). The last part of the presentation covered two of Terry's personal-favorite V9 optimizer enhancements. The first of these is a sort-avoidance technique that can kick in when a statement has a large initial result set and contains an ORDER BY and a FETCH FIRST n ROWS (with "n" being a relatively small value). Rather than sort the large initial result set (say, a million or more rows) just to return the first three rows of the ordered set (if FETCH FIRST 3 ROWS is specified), DB2 V9 will start going through the values in the ORDER BY column, comparing them in good-sized chunks at a time and all the while retaining the three smallest values found (again, assuming a specification of FETCH FIRST 3 ROWS - the number could be any relatively small values). In the end, the rows with the 3 smallest values of the ORDER BY column are identified in a much more efficient manner versus a full-blown sort operation. Terry concluded with a review of the new SYSINDEXSPACESTATS.LASTUSED table, which is populated by the DB2 real-time statistics function and which indicates the last time an index was used (and I mean used for data location or for referential integrity purposes - not just accessed because of a row INSERT or DELETE that drives the insert or deletion of an entry in an index leaf page). This should be especially useful when it comes to figuring out which indexes could be dropped without hurting application performance - if you see that a given index hasn't been used in three months, for example, it might be a good candidate for dropping (which would save disk space and reduce the cost of insert and delete and some update operations targeting the associated table). People with dynamic SQL workloads will especially like this feature, since the SYSPACKDEP catalog table just shows indexes used by static SQL statements.

Tomorrow afternoon I'll deliver my presentation (on what I call ultra-availability), and I'll post another blog entry with things learned and discussed during the day.

Previous Posts

Tuesday, October 14, 2008

DB2 Notes from Warsaw (Day 2)

0 Comments:

Post a Comment