Catterall Consulting: DB2 9 for z/OS: Rx for Database Design Headaches

I spent most of this past week teaching a DB2 9 for z/OS Transition class. In such a situation, it's fun to get other people's take on the new features delivered with this latest release of DB2 on the mainframe platform. One of the students had an interesting comment regarding the ability to create an index on a key expression (that being an expression that reference's at least one of a table's columns and which returns a scalar value): "This is going to help me deal with some application performance problems that are database design-related." He went on to tell a tale familiar to many experienced DB2 DBAs: an application was migrated to DB2 from a non-relational database management system, but in moving the data there was no attempt to redesign the database to take advantage of relational technology. Instead, the records in the legacy database files were just plopped into DB2 tables that were designed according to the record layouts of the old system. Among other things, this led to a number of situations in which predicates of application program queries had to be coded with scalar functions in order to generate the required result sets. In particular, there were plenty of predicates of the form:

WHERE SUBSTR(COL, start, length) = 'some value'

Oops. That's a non-indexable predicate in a DB2 V8 environment. What are you going to do about that? Ask end users to endure tablespace scans? Store the predicate-referenced data twice in the target table - once in its original column form and again in a column that contains the desired substring information? That second option's no fun: more disk space consumption, and required modification of inserting programs and programs that update the column in question (or creation of triggers to maintain the "substring" column, knowing that the triggered actions will increase overhead for SQL statements that cause a trigger to fire).

Enter DB2 9. When you're running this DB2 release in New Function Mode, you can create an index on an expression. That index might look like this:

CREATE INDEX SUBSTRIX
ON TABLE_ABC
(SUBSTR(COLn, 2, 5))
USING STOGROUP XYZ
PRIQTY 64
SECQTY 64
BUFFERPOOL BP1;

And guess what? When a query comes along with a predicate of the form:

WHERE SUBSTR(COLn, 2, 5) = 'some value'

That predicate is now stage 1 and indexable. Now, there's no free lunch. Indexes defined on expressions will cause overhead to increase somewhat for insert operations and for updates that would change an expression-generated key value. Some utilities, such as LOAD and REBUILD INDEX, will also end up consuming a little more CPU time. Still, with the potential for orders-of-magnitude performance improvement for some queries containing predicates that had previously been non-indexable, the index-on-expression feature of DB2 9 for z/OS is, I think, going to end up being a big draw for many shops.

Another DB2 9 feature that can help ease the pain of database design-related problems is the new "instead-of" trigger category (added to the existing UPDATE, INSERT and DELETE trigger types). You may have a situation in which a view has been created to make life easier for programmers who have to code SELECTs for certain data retrieval operations. Trouble is, that view might be read-only - it might, for example, be based on a join of two or more tables. What then? Do you tell the programmers that they should target the view for reads, and the underlying tables for data-change operations? That would certainly fly in the face of the "make life easier" rationale behind the creation of the view. Should you dispense with the view and denormalize the database design to provide a single-table SELECT-result that matches what one can get from the view? Sure, if you want to increase disk space consumption, change update/insert/delete SQL statements accordingly, and decrease flexibility with respect to the design of future applications that might need to access the database.

A better solution: go to DB2 9 (if you're not already there), get to New Function Mode, and define INSTEAD OF triggers that will enable programmers to both read from, and change data in, the view that had formerly been read-only.

Here's an illustrative example of what I'm talking about: suppose you have an EMPLOYEE table and a DEPARTMENT table. They both have a DEPTNO column, but only the DEPARTMENT table contains department names (as should be the case for a third-normal-form database design). If you want to make it really easy for programmers to retrieve department names along with department numbers for employees, you can create a view based on a join of the EMPLOYEE and DEPARTMENT tables; however, the resulting view would be read-only absent an INSTEAD OF UPDATE trigger on the view.

Here's what I mean:

CREATE VIEW EMP_DEPT
AS SELECT E.EMPNO, E.FIRSTNME, E.LASTNAME, E.DEPTNO, D.DEPTNAME
FROM EMPLOYEE E, DEPTARTMENT D
WHERE E.DEPTNO = D.DEPTNO;

If you subsequently issue the following SQL statement to change the first name of employee 000100 to "CHUCK" (perhaps from "CHARLES")

UPDATE EMP_DEPT
SET FIRSTNME = 'CHUCK'
WHERE EMPNO = '123789';

You'll get a -151 SQL error code because the view EMP_DEPT is read-only (thanks to the fact that the SELECT defining the view has more than one table in the FROM-list.

You can take care of this problem with an INSTEAD OF UPDATE trigger like this one:

CREATE TRIGGER OK_UPDTE INSTEAD OF UPDATE
ON EMP_DEPT
REFERENCING NEW AS N OLD AS O
FOR EACH ROW MODE DB2SQL
UPDATE EMPLOYEE E
SET E.EMPNO = N.EMPNO,
E.FIRSTNME = N.FIRSTNME,
E.LASTNAME = N.LASTNAME,
E.DEPTNO = N.DEPTNO
WHERE E.EMPNO = O.EMPNO;

With this trigger in place, the update statement above that previously got the -151 SQL code will execute successfully. Ta-da! Now the programmers won't have to reference different objects in their SELECT and UPDATE SQL statements - they'll just target the EMP_DEPT view in either case. Problem solved, with no need to change the underlying database design.

So, if you find yourself dealing with predicates that are non-indexable because the database design necessitates the coding of column scalar functions in the predicates, you have a very attractive remedy in the form of DB2 9 indexes on expressions. Similarly, if you want to put views on top of a database design to make some data-retrieval operations easier, and you don't want to have to direct programmers to the underlying tables (versus the views) for data-change operations, DB2 9 INSTEAD OF triggers may be just what the doctor ordered. Look for opportunities to put these features to work in your shop.

Previous Posts

Friday, May 8, 2009

DB2 9 for z/OS: Rx for Database Design Headaches

0 Comments:

Post a Comment