Catterall Consulting: Thoughts on DB2 Triggers

I was in the Upper Midwest of the USA for most of last week, presenting at three regional DB2 user group meetings - in Minneapolis, Chicago, and Milwaukee - on three consecutive days. One of the presentations I gave in each city covered DB2 for z/OS data warehouse performance. In that presentation, I provided some guidelines on the average number of indexes defined per table in a data warehouse database (I wrote of this in an entry posted to this blog last year). Following the meeting in Milwaukee, one of the attendees asked me if I had any recommendations pertaining to the number of triggers defined on a table. I don't, because trigger usage scenarios and environments vary so widely, but the question sparked an interesting discussion about DB2 triggers that covered a variety of sub-topics. By way of this entry, I'll commit these DB2 trigger thoughts of mine to paper (electronically speaking).

[Super-brief level-set: by way of a trigger defined on a DB2 table, one can cause an SQL-expressed action to be taken automatically in response to an update, delete, or insert targeting the base table. For example, one could use a trigger defined on table A to cause an insert into table A to drive an update of a column in table B.]

First, concerning that question on the number of triggers defined on a table, the answer is very much of the "it depends" variety. I recall a presentation, delivered at a DB2 user group meeting several years ago, in which a developer described a new application that his company had implemented entirely by way of triggers. The number of triggers created for that application was fairly large, and I'm thinking that quite a few triggers were defined on certain individual tables. The application was successfully put into production, and everything worked fine, so having a lot of triggers is not necessarily a bad thing. On the other hand, there are situations in which triggers can affect application performance in an undesirable way. In that regard, the story has gotten better in recent years, certainly on the mainframe platform. Triggers were introduced with DB2 Version 6 for z/OS (the functionality had previously been delivered for DB2 on Linux, UNIX, and Windows servers), and in that and the subsequent release the presence of a trigger defined with UPDATE OF COL5 on a table increased the CPU cost of any UPDATE statement targeting the table, even if the statement did not change data in column COL5. That trigger cost was eliminated in DB2 for z/OS Version 8, so that the aforementioned trigger would affect the performance only of UPDATE statements that changed data in COL5.

So, continuing with this example, how would the performance of a COL5-changing UPDATE statement be impacted by the UPDATE OF COL5 trigger? That would depend, of course, on the nature of the triggered action (i.e., the SQL statement executed as a result of the trigger being "fired" by the UPDATE). If the triggered action is an update of one row in one table, identified by a unique, indexed column referenced in a predicate, the impact of the trigger on the performance of COL5-changing UPDATE statements is likely to be minimal. If, on the other hand, the triggered action were more involved (and keep in mind that it could be a call to a stored procedure), the affect of the trigger on COL5-changing UPDATE statements would be more noticeable. The key here is to keep in mind that the action taken when a trigger is fired is synchronous with respect to an SQL statement that causes the trigger to fire. In other words, the trigger-firing SQL statement isn't finished until the triggered action is finished. This means that there are performance implications for "downstream" triggers that might be fired as a result of the initial trigger being fired (a trigger defined with UPDATE OF COL5 on table ABC could drive an update of COL7 on table XYZ, and that triggered action would fire a trigger if one were defined with UPDATE OF COL7 on table XYZ).

Does this synchronous business mean that triggers with more complex triggered actions are a performance no-no? Not necessarily. One way to have that cake and eat it, too, is to have the trigger place information of interest (e.g., certain column values) on a WebSphere MQ queue (a trigger can certainly do this - the triggered action has to be an SQL statement, and DB2 provides built-in functions, such as MQSEND, that can be used to send data to a designated MQ location). Once that's done, the statement that fired the trigger can complete execution. Asynchronously, with respect to the trigger-firing statement, the data sent to the MQ queue by the trigger can be processed as needed, perhaps by a DB2 stored procedure invoked by the MQ listener (the MQ listener function can automatically take an action, such as calling a stored procedure, when a message lands on a queue).

How do triggers stack up, in terms of CPU efficiency, with other means of getting database action X accomplished as a result of action Y being performed? Suppose that you have a need to capture "before" and "after" values in certain columns of table ABC when those columns are updated by programs. If program PROG1 updates the columns of interest in table ABC, you could request that the program be modified to insert into table XYZ "before" and "after" values following the table ABC updates. This approach might well be the most CPU-efficient way to address your requirement, but it could prove to be impractical for at least a couple of reasons. For one thing, who would code the requested PROG1 functional enhancement? Will that person - likely engaged now in some other high-priority application development effort - be available to change PROG1 to your liking within the next year? Maybe not. Then there's potential problem number two: what if the table ABC columns for which you want to capture changes are updated by multiple programs besides PROG1? Are you going to try to get change-capture functionality added to all of those programs? How long will that take? You could opt to use a vendor tool to detect and capture changes made to the specified columns of table ABC, but if such a tool isn't currently part of your IT infrastructure, how long will it take to acquire it and how much will it cost?

You could certainly determine that a trigger on table ABC defined with UPDATE OF [the columns of interest] would be the right way to go, offering a quickly implementable solution that would have a modest CPU cost and a very low dollar cost (or euro cost or whatever-currency cost ). And, consider this: if programs that update the table ABC columns in which you are interested are so response-time sensitive that even adding a fairly simple trigger to the mix raises performance concerns, having that trigger defined on a data warehouse table (or operational data store table) to which table ABC changes are propagated might do the trick for you.

Flexibility, agility, and economy - that's what DB2 triggers offer. They should definitely be solution candidates when you have a need for timely implementation of incremental database application functionality.

4 Comments:

Anonymous said...: Hi Robert, Why not just implement the column update using your data propagator tool, like DPROPR?; May 29, 2015 at 2:49 PM
Robert Catterall said...: (Sorry about the delayed response)

A data replication product (IBM's is now called IBM InfoSphere Data Replication) could be used to cause a change to a column in table A to be reflected in a column of table B. A trigger can be appropriate when the volume of trigger-firing data changes targeting a table is not very high.

Robert; June 8, 2015 at 8:30 PM
Unknown said...: Hi Robert, we need to do reporting on the existing db2 tables.
We are thinking to create a triggers for existing prod tables(DB2K) to capture all the changes and store the data in (DB2L). If we take this option, we have some questions.
What would be the impacts on DB2K on performance wise.
What will happen if we want to unload data form DB2L, while inserting data in DB2K?
What happens if DB2L has deadlocks or some other issues, would it impact DB2K?
It would be great help for me if you could throw some light on the above concerns.; May 10, 2016 at 4:47 PM
Robert Catterall said...: I would not go with this approach unless I had no other choice. The triggers on all the tables would increase the CPU consumption of the DB2 workload, and in case of a problem with an insert on the remote DB2 subsystem the insert (or update or whatever) on the "home" system could fail. What information would you be trying to capture, using triggers, that is not provided by the real-time statistics tables (http://robertsdb2blog.blogspot.com/2015/06/are-you-using-db2-for-zos-real-time.html) or by DB2 accounting and/or statistics trace records?

Robert; May 11, 2016 at 4:45 PM

Previous Posts

Monday, June 8, 2009

Thoughts on DB2 Triggers

4 Comments:

Post a Comment