<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-6806654330436722244</id><updated>2012-01-23T22:51:54.160-08:00</updated><title type='text'>Catterall Consulting</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default?start-index=101&amp;max-results=100'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>121</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-5803288204887880629</id><published>2010-08-05T19:04:00.000-07:00</published><updated>2010-08-05T19:14:06.357-07:00</updated><title type='text'>Back at IBM</title><content type='html'>&lt;span style="font-family: arial;"&gt;It's been longer than usual since my last post 
to this blog. I've been busier than usual, largely due to a career change. On August 2, I rejoined IBM (I previously worked for the Company from 1982 to 2000). I'm still doing technical DB2 work, and I'll continue to blog about DB2, but new entries will be posted to my new blog, called, simply, "Robert's DB2 blog." It can be viewed at http://www.robertsdb2blog.blogspot.com/.&lt;br /&gt;&lt;br /&gt;Though I will no longer be posting to this blog, I'll continue to respond to comments left by readers of entries in this blog.&lt;br /&gt;&lt;br /&gt;To those who've been regular readers, I thank you for your time. I hope that you'll visit my new blog.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-5803288204887880629?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/5803288204887880629/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/08/back-at-ibm.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5803288204887880629'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5803288204887880629'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/08/back-at-ibm.html' title='Back at IBM'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-649541144988681032</id><published>2010-07-12T22:08:00.000-07:00</published><updated>2010-07-12T22:25:28.732-07:00</updated><title type='text'>EXPLAINing DB2 for z/OS: Don't Overlook MATCHCOLS</title><content type='html'>&lt;span style="font-family: arial;"&gt;Do you use EXPLAIN output when you're analyzing the performance of a query in a DB2 for z/OS environment? I hope so. If you do, you might be one of those who likes to retrieve access plan information by querying the PLAN_TABLE into which DB2 inserts rows when a SQL statement is EXPLAINed (either dynamically via the EXPLAIN statement or, for embedded static SQL, as part of the program bind process when EXPLAIN(YES) is included in the BIND command); or, you might prefer to view EXPLAIN data in the form of an access plan graph generated by a tool such as IBM's Optimization Service Center for DB2 for z/OS (free), Data Studio (also free), or Optimization Expert (have to pay for that one, but you get additional functionality). In any case, there are probably a few things that you really focus in on. 
These might include tablespace scans (indicated by an "R" in the ACCESSTYPE column of PLAN_TABLE -- something that you generally don't want to see), index-only access (that would be a "Y" in the INDEXONLY column of PLAN_TABLE -- often desirable, but you can go overboard in your pursuit of it), and sort activity (a "Y" in any of the SORTC or SORTN columns of PLAN_TABLE -- you might try to get DB2 to use an index in lieu of doing a sort).&lt;br /&gt;&lt;br /&gt;What about MATCHCOLS, the column of PLAN_TABLE that indicates the number of predicates that match on columns of the key of an index used to access data for a query (in a visual EXPLAIN access plan graph, this value would be referred to as "matching columns" for an index scan)? Do you scope that out? If so, do you just look for a non-zero value and move on? Here's a tip: pay attention to MATCHCOLS. Making that value larger is one of the very best things that you can do to reduce the run time and CPU cost of a SELECT statement. The performance difference between MATCHCOLS = 1 and MATCHCOLS = 2 or 3 can be huge. The reason? It's all about cardinality. You could have an index key based on three of a table's columns, and if the table has 10 million rows and the cardinality (i.e., the number of distinct values) of the first key column is 5, MATCHCOLS = 1 is an indication that you're going to get very little in the way of result set filtering at the index level (expect lots of GETPAGEs, a relatively high CPU cost, and extended run time). If the cardinality of the first two columns of the index key is 500,000, MATCHCOLS = 2 should mean much quicker execution, and if the full-key cardinality of the index is 5 million, a SELECT statement with MATCHCOLS = 3 should really fly. 
A larger MATCHCOLS value generally means more index-level filtering, and that is typically very good for performance.&lt;br /&gt;&lt;br /&gt;Step one in boosting the value of MATCHCOLS for a query is to determine how high the value &lt;span style="font-style: italic;"&gt;could&lt;/span&gt; be. Obviously, if a SELECT statement has only one predicate then any effort to make MATCHCOLS greater than 1 will be a waste of time. So, the upper bound on the value of MATCHCOLS in a given situation is the number of predicates present in a query (or, more precisely, in a subselect, if the statement contains multiple SELECTs -- think UNIONs, nested table expressions, subquery predicates, etc.); but, that's an oversimplification, because not all predicates can match on a column of an index key -- COL &lt;&gt; 2 is an example of a non-indexable predicate (my favorite source of information on predicate indexability is the &lt;a href="http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/topic/com.ibm.db29.doc.perf/db2z_summarypredicateprocessing.htm"&gt;"Summary of predicate processing"&lt;/a&gt; section of the DB2 9 for z/OS &lt;span style="font-style: italic;"&gt;Performance Monitoring and Tuning Guide&lt;/span&gt;). Even that's not the whole story, however, as the position of columns in an index key -- and not just the number of indexable predicates in a query -- has an impact on the MATCHCOLS value.&lt;br /&gt;&lt;br /&gt;This last point bears some explaining. Suppose you have a query with these predicates:&lt;br /&gt;&lt;br /&gt;COL_X = 'ABC'&lt;br /&gt;COL_Y = 5&lt;br /&gt;COL_Z &gt; 10&lt;br /&gt;&lt;br /&gt;All three of the predicates are indexable. If the three columns referenced in the predicates are part of an index key, will the MATCHCOLS value for the query be 3? Maybe, maybe not. If the index key is COL_X | COL_Y | COL_Z, MATCHCOLS will indeed be 3 (assuming the index provides the low-cost path to the data retrieved by the query). 
If, on the other hand, the index key is COL_Z | COL_Y | COL_X, MATCHCOLS will be 1. Why? Because once a key column is matched for a range predicate, other columns that follow in the index key won't be matches for other predicates in the query (the basic rule, which I'll amend momentarily, is that index column matching, which proceeds from the highest- to the lowest-order column of an index key, stops following the first match for a predicate that is not of the "equals" variety). If the index is on COL_X | COL_Y | COL_H | COL_Z, MATCHCOLS will be 2, because index key column matching stops if you have to "skip over" a non-predicate-referenced column (in this case, COL_H). So, one of the things that you can do to get a higher versus a lower MATCHCOLS value for a query is to arrange columns in an index key so as to maximize matching (that's fine if you're defining a new index on a table, but be careful about replacing an index on a set of columns in one order with an index on the columns in a different order -- you could improve the performance of one query while causing others to run more slowly).&lt;br /&gt;&lt;br /&gt;Now, about the aforementioned amendment to the basic rule that states that index key column predicate matching stops after the first match for something other than an "equals" predicate: there is at least one special case of a predicate that is not strictly "equals" in nature but which is treated that way from an index key column matching perspective. I'm talking about an "in-list" predicate, such as COL_D IN ('A01', 'B02', 'C03'). 
Thus, if a query contained that predicate and two others, COL_E = 'Y' AND COL_F &gt; 8 (assuming that the three columns belong to one table), and if the target table had an index defined on COL_D | COL_E | COL_F, the expected MATCHCOLS value for the query would be 3.&lt;br /&gt;&lt;br /&gt;Something else about indexes and MATCHCOLS: I mentioned earlier that cardinality is a big part of the story, but for cardinality to mean anything to DB2, DB2 has to &lt;span style="font-style: italic;"&gt;know&lt;/span&gt; about it. DB2 gets index key cardinality information from the catalog, so it pays to ensure that catalog statistics are accurate through regular execution of the RUNSTATS utility; but, there's more: to provide DB2 with really complete information on index key cardinality, specify the KEYCARD option on the RUNSTATS utility control statement. When the utility is run sans KEYCARD, cardinality statistics will be generated for the first column of an index key and for the full index key, but not for anything (if anything) in-between. In other words, with an index on COL_A | COL_B | COL_C | COL_D, an execution of RUNSTATS for the index without the KEYCARD option will cause the catalog to be updated with cardinality statistics for COL_A (the first of the key columns) and for the entire four-column key, but not for COL_A | COL_B and not for COL_A | COL_B | COL_C. With KEYCARD specified, the utility will get cardinality statistics for these intermediate partial-key combinations.&lt;br /&gt;&lt;br /&gt;A final note on this topic: it can make sense to include a predicate-referenced column in an index key, even if the predicate in question is non-indexable (meaning that the presence of the column in the key cannot help to make the MATCHCOLS number larger). This is so because the column could still be used by DB2 for something called index screening. 
So, if you have a non-indexable predicate such as one of the COL_J &lt;&gt; 'M' form, it might be a good idea to include COL_J in an index key after some predicate-matching columns, especially if COL_J is relatively short and the underlying table is large. DB2 can use the COL_J information in the index to filter results, just not as efficiently as it could if COL_J were referenced by an indexable predicate.&lt;br /&gt;&lt;br /&gt;Bottom line: MATCHCOLS matters. Tune on!&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-649541144988681032?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/649541144988681032/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/07/explaining-db2-for-zos-dont-overlook.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/649541144988681032'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/649541144988681032'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/07/explaining-db2-for-zos-dont-overlook.html' title='EXPLAINing DB2 for z/OS: Don&apos;t Overlook MATCHCOLS'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-8979457237373003472</id><published>2010-06-29T20:57:00.000-07:00</published><updated>2010-06-29T21:31:07.499-07:00</updated><title type='text'>Using DB2 for z/OS Real-Time Statistics for Smarter Database Management</title><content type='html'>&lt;span style="font-family: arial;"&gt;Unless you are somehow beyond the reach of advertising, you're probably familiar with IBM's "smarter planet" campaign. It's all about leveraging analytics -- the purposeful analysis of timely, relevant information -- to improve decision-making outcomes. If you administer a mainframe DB2 database, you can work smarter by taking advantage of a DB2 feature that, while familiar to many DBAs in an "I've heard of that" sense, is under-exploited to a surprising degree. I'm talking about real-time statistics (aka RTS). Understand real-time stats -- what they are, where you find them, and how you can use them -- and you're on your way to enhancing the efficiency and effectiveness of your DB2 database administration efforts.&lt;br /&gt;&lt;br /&gt;I'm sure that most everyone who reads this blog entry is familiar with the database statistics found in the DB2 catalog. These statistics are updated by way of the RUNSTATS utility, and they can be useful for things like identifying tablespaces and indexes in need of reorganization. Still, from a "work smarter" perspective, they are less than ideal. For one thing, they are only updated when you run the RUNSTATS utility (or when you gather and update statistics as part of a REORG or a LOAD utility operation -- more on that in a moment). How often do you do that? Maybe not too frequently, if you have a whole lot of tablespaces in your database. Suppose you run RUNSTATS, on average, once a month for a given tablespace. 
Could that tablespace end up getting pretty disorganized in the middle of one of those one-month periods between RUNSTATS jobs? Yes, and in that case you wouldn't be aware of the disorganization situation for a couple of weeks after the fact -- not so good.&lt;br /&gt;&lt;br /&gt;As for updating catalog stats via REORG and/or LOAD, that's all well and good, but consider this: when you do that, the stats gathered will reflect perfectly organized objects (assuming, for LOAD, that the rows in the input file are in clustering-key sequence). They won't show you how the organization of a tablespace and its indexes may be deteriorating over time.&lt;br /&gt;&lt;br /&gt;Then there's the matter of dynamic cache invalidation. ANY time you run the RUNSTATS utility -- no matter what options are specified -- you invalidate SQL statements in the dynamic statement cache. For a while thereafter, you can expect some extra CPU consumption as the statement cache gets repopulated through the full-prepare of dynamic queries that otherwise might have resulted in cache hits.&lt;br /&gt;&lt;br /&gt;So, there's goodness in getting frequently updated catalog statistics to help you determine when objects need to be reorganized, but running RUNSTATS frequently will cost you CPU time, both directly (the cost of RUNSTATS execution) and indirectly (the CPU cost of repopulating the dynamic statement cache following a RUNSTATS job). You could avoid these CPU costs by not using catalog stats to guide your REORG actions, relying instead on a time-based strategy (e.g., REORG every tablespace and associated indexes at least once every four weeks), but that might lead to REORG operations that are needlessly frequent for some tablespaces that remain well-organized for long stretches of time, and too-infrequent REORGs for objects that relatively quickly lose clusteredness. And I haven't even talked about tablespace backups. 
Getting a full image copy of every tablespace at least once a week, with daily incremental copies in-between, is a solid approach to recovery preparedness, but what if you're running daily incremental image copy jobs for objects that haven't changed since the last copy? How could you get smarter about that? And what about RUNSTATS itself? How can you get stats to help you make better decisions about updating catalog statistics?&lt;br /&gt;&lt;br /&gt;Enter real-time statistics. This is the name of an item of functionality that was introduced with DB2 for OS/390 Version 7. That was almost 10 years ago, and while the feature has been effectively leveraged by some DBAs for years, it remains on the edge of many other DBAs' radar screens, largely for two reasons:&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;You (used to) have to create the real-time statistics objects yourself&lt;/span&gt;. I'm referring to the real-time statistics database (DSNRTSDB), the real-time stats tablespace (DSNRTSTS), two tables (SYSIBM.TABLESPACESTATS and SYSIBM.INDEXSPACESTATS), and a unique index on each of the tables. Instructions for creating these objects were provided in the DB2 Administration Guide, but some folks just didn't have the time or the inclination to bother with this. 
Happily, with DB2 9 for z/OS the real-time statistics objects became part of the DB2 catalog -- they are there for you like all the other catalog tables (if your DB2 subsystem is at the Version 8 level and the real-time statistics objects have already been created, when you migrate to DB2 9 any records in the user-created RTS tables will be automatically copied to the RTS tables in the catalog).&lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;People had this idea that real-time statistics drive up CPU overhead in a DB2 environment&lt;/span&gt;. They really don't. You see, DB2 is always updating the real-time statistics counters anyway, whether or not you make any use of them. What we know as real-time statistics involves the periodic externalization of these counters, and that's a pretty low-cost operation (&lt;/span&gt;&lt;span style="font-family: arial;"&gt;the default RTS externalization interval is 30 minutes, and you can adjust that by way of the STATSINT parameter of ZPARM)&lt;/span&gt;&lt;span style="font-family: arial;"&gt;.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-family: arial;"&gt;So, if you are already on DB2 9, take a few minutes and check out the data in the SYSIBM.SYSTABLESPACESTATS and SYSIBM.SYSINDEXSPACESTATS catalog tables (in a pre-9 DB2 environment, the names of the user-defined RTS tables are -- as previously mentioned -- SYSIBM.TABLESPACESTATS and SYSIBM.INDEXSPACESTATS). You'll see that the column names are pretty intuitive (Hmmm, wonder what you'll find in the EXTENTS column of SYSTABLESPACESTATS? Or how about TOTALENTRIES in SYSINDEXSPACESTATS?). The theme is "news you can use," and a primary aim is to help you get to a needs-based strategy with regard to the execution of utilities such as REORG, RUNSTATS, and COPY, versus running these using only time-based criteria. 
To this end, RTS provides valuable information such as the total number of rows added to a tablespace since it was last reorganized (REORGINSERTS), the number of rows inserted out of clustering sequence since the last REORG (REORGUNCLUSTINS), the number of updates since the last RUNSTATS execution for a tablespace (STATSUPDATES), the number of data-change operations since a tablespace was last image-copied (COPYCHANGES), and the number of index leaf pages that are far from where they should be due to page splits that have occurred since the last time the index was reorganized or rebuilt (REORGLEAFFAR). Note, too, that in addition to the utility-related numbers, RTS provides, in a DB2 9 system, a column, called LASTUSED (in SYSINDEXSPACESTATS), that can help you identify indexes that are just taking up space (i.e., that aren't being used to speed up queries or searched updates or deletes, or to enforce referential integrity constraints).&lt;br /&gt;&lt;br /&gt;How will you leverage RTS? You have several options. You can process them using a DB2-supplied stored procedure (DSNACCOR for DB2 Version 8, and the enhanced DSNACCOX delivered with DB2 9). You might find that DB2 tools installed on your system -- from IBM and from other companies -- can take advantage of real-time statistics data (check with your tools vendors). DBAs who know a thing or two about the REXX programming language have found that they can write their own utility-automation routines thanks to RTS. And of course you can write queries that access the RTS tables and return actionable information. 
I encourage you to be creative here, but to get the juices flowing, here's an RTS query that I've used to find highly disorganized nonpartitioned tablespaces (this particular query was run in a DB2 Version 8 system -- it should work fine in a DB2 9 subsystem if you change TABLESPACESTATS to SYSTABLESPACESTATS):&lt;br /&gt;&lt;br /&gt;SELECT A.NAME,&lt;br /&gt;       A.DBNAME,&lt;br /&gt;       CAST(REORGLASTTIME AS DATE) AS REORGDATE,&lt;br /&gt;       CAST(FLOOR(TOTALROWS) AS INTEGER) AS TOTALROWS,&lt;br /&gt;       REORGINSERTS,&lt;br /&gt;       CAST((DEC(REORGUNCLUSTINS,11,2) / DEC(REORGINSERTS,11,2)) * 100&lt;br /&gt;         AS INTEGER) AS PCT_UNCL_INS,&lt;br /&gt;       REORGDELETES,&lt;br /&gt;       B.PCTFREE,&lt;br /&gt;       B.FREEPAGE&lt;br /&gt;  FROM SYSIBM.TABLESPACESTATS A, SYSIBM.SYSTABLEPART B&lt;br /&gt;    WHERE A.NAME = B.TSNAME&lt;br /&gt;      AND A.DBNAME = B.DBNAME                                      &lt;br /&gt;      AND TOTALROWS &gt; 10000&lt;br /&gt;      AND REORGUNCLUSTINS &gt; 1000&lt;br /&gt;      AND (DEC(REORGUNCLUSTINS,11,2) / DEC(REORGINSERTS,11,2)) * 100 &gt; 50&lt;br /&gt;      AND A.PARTITION = 0&lt;br /&gt;  ORDER BY 6 DESC &lt;br /&gt;  WITH UR;&lt;br /&gt;&lt;br /&gt;Real-time stats are going mainstream, folks. Be a part of that. 
Work smart.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-8979457237373003472?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/8979457237373003472/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/06/using-db2-for-zos-real-time-statistics.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/8979457237373003472'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/8979457237373003472'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/06/using-db2-for-zos-real-time-statistics.html' title='Using DB2 for z/OS Real-Time Statistics for Smarter Database Management'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-4756591472249030573</id><published>2010-06-09T10:08:00.000-07:00</published><updated>2010-06-09T10:39:55.384-07:00</updated><title type='text'>Nuggets from DB2 by the Bay, Part 4</title><content type='html'>&lt;span style="font-family:arial;"&gt;The last of my posts with items of information from the 2010 International DB2 Users Group North American Conference, held last month in Tampa (as in Tampa Bay), Florida.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Good DB2 9 for 
z/OS migration information from David Simpson.&lt;/span&gt; David, a senior DB2 instructor with Themis Training, described some things of which people migrating to DB2 9 should be aware. Among these are the following:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;The pureXML functionality delivered in DB2 9 is quite comprehensive and opens up a lot of possibilities. &lt;/span&gt;One of David's colleagues at Themis figured out how to create views that make data stored in XML columns of DB2 tables look like standard relational data.&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Do you have a handle on your simple tablespace situation?&lt;/span&gt; David reminded session attendees that simple tablespaces cannot be created in a DB2 9 environment. This being the case, it would be a good idea to move data from the simple tablespaces that you have to other tablespace types (segmented tablespaces, most likely). Sure, you can still read from, and update, a simple tablespace in a DB2 9 system, but the inability to create such a tablespace could leave you in a tough spot if you were to try to recover a simple tablespace that had been accidentally dropped (David suggested that people create a few empty simple tablespaces before migrating to DB2 9, so you'll have some available just in case you need a new one). 
You might think that you don't have any simple tablespaces in your DB2 Version 8 system, but you could be wrong there -- David pointed out that simple tablespaces are the default up through DB2 Version 8 (so, an implicitly-created tablespace in a pre-DB2 9 environment will be a simple tablespace).&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;New RRF developments.&lt;/span&gt; That's RRF as in reordered row format, a change (introduced with DB2 9) in the way that columns are physically ordered in a table. In the old set-up (now referred to as BRF, or basic row format), varying-length columns in a table (such as those with a VARCHAR data type) are physically stored, relative to other columns, in the order in which they appear in the CREATE TABLE statement. With RRF in effect, varying-length columns are grouped at the end of a table's rows, and that group of varying-length columns is preceded by a set of offset indicators -- one for each varying-length column -- that enable DB2 to very efficiently go right to the start of a given varying-length column. David told attendees that RRF does NOT affect what programs "see", as the logical order of a table's columns does not change with a change to RRF. 
RRF is a good thing with respect to varying-length-data access performance, but it may cause some issues when tablespaces are compressed (RRF rows sometimes don't compress quite as much as equivalent BRF rows), when data changes related to tables in compressed tablespaces are propagated via "log-scraping" replication tools (you just need to make sure that your replication tool can deal with the new compression dictionary that is created when a tablespace goes from BRF to RRF), and when tablespaces are operated on by the DB2 DSN1COPY utility (this utility doesn't use the SQL interface, so it is sensitive to changes in the physical order of columns even when this has no effect on the columns' logical order in a table).&lt;br /&gt;&lt;br /&gt;Early on with DB2 9, the change from BRF to RRF was automatic with the first REORG in a DB2 9 environment of a tablespace created in a pre-9 DB2 system. Various DB2 users asked for more control over the row-format change, and IBM responded with APARs like &lt;a href="http://www-01.ibm.com/support/docview.wss?uid=swg1PK85881"&gt;PK85881&lt;/a&gt; and &lt;a href="http://www-01.ibm.com/support/docview.wss?uid=swg1PK87348"&gt;PK87348&lt;/a&gt;. You definitely want to get to RRF at some point. With the fixes provided by these APARs, you can decide if you want BRF-to-RRF conversion to occur automatically with some utility operations (REORG and LOAD REPLACE), or if you want to explicitly request format conversion on a tablespace-by-tablespace basis. 
You can also determine whether or not you want tablespaces created in a DB2 9 environment to have BRF or RRF rows initially.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt; &lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Time to move on from Visual Explain.&lt;/span&gt; David mentioned that VE is not supported in the DB2 9 environment -- it doesn't work with new DB2 9 data types (such as XML), and it can produce "indeterminate results" if a DB2 9 access plan is not possible in a DB2 Version 8 system. If you want a visual depiction of the access plan for a query accessing a DB2 9 database, you can use the free and downloadable &lt;a href="http://www-01.ibm.com/support/docview.wss?rs=64&amp;amp;uid=swg27017059"&gt;IBM Optimization Service Center for DB2 for z/OS&lt;/a&gt;, or &lt;a href="http://www-01.ibm.com/software/data/optim/data-studio/features.html?S_CMP=rnav"&gt;IBM Data Studio&lt;/a&gt;.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Bye-bye, BUILD2.&lt;/span&gt; David explained that partition-level online REORG in a DB2 9 system does not have a BUILD2 phase (in prior releases of DB2, this is the REORG utility phase during which row IDs in non-partitioned indexes are updated to reflect the new position of rows in a reorganized table partition). That's good, because data in a partition is essentially unavailable during the BUILD2 phase, and BUILD2 can run for quite some time if the partition holds a large number of rows. There's a catch, though: BUILD2 is eliminated because DB2 9 partition-level online REORG reorganizes non-partitioned indexes in their entirety, using shadow data sets. That means more disk space and more CPU time for partition-level REORG in a DB2 9 system. It also means that you can't run multiple online REORG jobs for different partitions of the same partitioned tablespace in parallel. 
You can, however, get parallelism within a single partition-level online REORG job if you're reorganizing a range of partitions (e.g., partitions 5 through 10). Note that in a DB2 10 environment, you can get this kind of intra-job parallelism for an online REORG even if the multiple partitions being reorganized are not contiguous (e.g., partitions 3, 7, 10, and 15).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;DB2 for z/OS and application programming.&lt;/span&gt; Dave Churn, a database architect at DST Systems in Kansas City, delivered a session on application development in a DB2 context. Dave commented on a number of application-oriented DB2 features and functions, including these:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Fetching and inserting chunks of rows.&lt;/span&gt; DST has made some use of the multi-row FETCH and INSERT capabilities introduced with DB2 for z/OS Version 8. Dave said that performance benefits had been seen for programs that FETCH rows in blocks of 5-10 rows each, and for programs that INSERT rows in blocks of 20 rows each. The other side of that coin is increased programming complexity (Dave noted that with multi-row FETCH, you're "buffering in your program"). 
In DST's case, multi-row FETCH is not being used to a great extent, because the increased time required for programmers to write code to deal with multi-row FETCH (versus using traditional single-row FETCH functionality) is generally seen as outweighing the potential performance gain (and that gain will often not be very significant in an overall sense -- as Dave said, "How often is FETCH processing your primary performance pain point?").&lt;br /&gt;&lt;br /&gt;Use of multi-row INSERT, on the other hand, has been found to be more advantageous in the DST environment, particularly with respect to the company's very high-volume, time-critical, and INSERT-heavy overnight batch workload. As with multi-row FETCH, there is an increase in programming complexity associated with the use of multi-row INSERT (among other things, to-be-inserted values have to be placed in host variable arrays declared by the inserting program), but the performance and throughput benefits have often made the additional coding effort worthwhile. Interestingly, others in the audience indicated that they'd seen the same pattern in their shops: multi-row INSERT was found to be of greater use than multi-row FETCH. Dave mentioned that at DST, programs using multi-row INSERT were generally doing so with the NOT ATOMIC CONTINUE ON SQLEXCEPTION option, which causes DB2 to NOT undo successful inserts of rows in a block if an error is encountered in attempting to insert one or more rows in the same block. The programs use the GET DIAGNOSTICS statement to identify any rows in a block that were not successfully inserted. 
These rows are written to a file for later analysis and action.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;The new BINARY data type in DB2 9 can be great for some client-server applications.&lt;/span&gt; When DB2 for z/OS is used for the storage of data that is inserted by, and retrieved by, programs running on Linux/UNIX/Windows application servers, the BINARY data type can be a good choice: if the data will not be accessed by programs running on the mainframe, why do character conversion? Use of the BINARY data type ensures that character conversion will not even be attempted when the data is sent to or read from the DB2 database.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;DB2 9 FETCH WITH CONTINUE is useful for really big LOBs.&lt;/span&gt; In some cases, a LOB value might be larger than what a COBOL program can handle (which is about 128 MB). The FETCH WITH CONTINUE functionality introduced with DB2 9 enables a COBOL program to retrieve a very large LOB in parts.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;MERGE musings.&lt;/span&gt; The MERGE statement, new with DB2 9 and sometimes referred to as "UPSERT", is very handy when a set of input records is to be either inserted into a DB2 table or used to update rows already in the table, depending on whether or not an input record matches an existing table row. 
Dave mentioned that the matching condition (specified in the MERGE statement) will ideally be based on a unique key, so as to limit the scope of the UPDATE that occurs when a match is found. DST likes MERGE because it improves application efficiency (it reduces the number of program calls to DB2 versus the previously required INSERT-ELSE-UPDATE construct) and programmer productivity (same reason -- fewer SQL statements to code). Dave said that DST has used MERGE with both the ATOMIC and NOT ATOMIC CONTINUE ON SQLEXCEPTION options (when the latter is used for MERGE with a multi-record input block, GET DIAGNOSTICS is used to determine what, if any, input records were not successfully processed -- just as is done for multi-row INSERT).&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;SELECT FROM UPDATE/INSERT/DELETE/MERGE is great for efficiently obtaining DB2-generated or DB2-modified values.&lt;/span&gt; DST has used the SELECT FROM &lt;span style="font-style: italic;"&gt;data-changing-statement&lt;/span&gt; syntax (introduced for INSERT with DB2 Version 8, and expanded to other data-changing statements with DB2 9) to obtain values generated by BEFORE triggers on DB2 tables (as an aside, Dave mentioned that DST has used triggers to, among other things, dynamically change a program's commit frequency). 
DST has also found it useful to execute SELECT FROM MERGE statements with the INCLUDE option (enabling return of values not stored in a target table) to determine whether rows in a MERGE input block were inserted or used to update the target table.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;When will you use the new DB2 9 XML data type?&lt;/span&gt; You'll use it, Dave said, when "your clients want to exchange information with you in the form of XML documents." In other words, you're likely to use it when your company's clients make XML data exchange a requirement for doing business. DST is using DB2 9 pureXML now. You might want to get ready to use it, just in case you'll need to. Being prepared could make exploitation of the technology an easier process (and it is pretty amazing what DB2 can do with XML documents, in terms of indexability, schema validation, and search and retrieval using XQuery expressions embedded in SQL statements).&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;That's a wrap for this multi-part post. I hope that part 4 has provided you with some useful information, and I invite you to check out parts &lt;a href="http://catterallconsulting.blogspot.com/2010/05/nuggets-from-db2-by-bay-part-1.html"&gt;1&lt;/a&gt;, &lt;a href="http://catterallconsulting.blogspot.com/2010/05/nuggets-from-db2-by-bay-part-2.html"&gt;2&lt;/a&gt;, and &lt;a href="http://catterallconsulting.blogspot.com/2010/06/nuggets-from-db2-by-bay-part-3.html"&gt;3&lt;/a&gt;, if you haven't already done so. The &lt;a href="http://www.idug.org/db2-north-american-conference/idug-2011-north-america.html"&gt;IDUG 2011 North American Conference&lt;/a&gt; will be held in Anaheim, California next May. I'm planning on being there, and I hope that many of you will be there as well. 
It's always a great source of DB2 "news you can use."&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-4756591472249030573?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/4756591472249030573/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/06/nuggets-from-db2-by-bay-part-4.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/4756591472249030573'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/4756591472249030573'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/06/nuggets-from-db2-by-bay-part-4.html' title='Nuggets from DB2 by the Bay, Part 4'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-2429884496563981162</id><published>2010-06-01T21:49:00.000-07:00</published><updated>2010-06-01T22:09:43.120-07:00</updated><title type='text'>Nuggets from DB2 by the Bay, Part 3</title><content type='html'>&lt;span style="font-family: arial;"&gt;Still more items of information from the 2010 International DB2 Users Group North American Conference held last month in Tampa, Florida.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;A new chapter in the history of the DB2 optimizer. 
&lt;/span&gt;Terry Purcell, uber-optimizer-guy on IBM's DB2 for z/OS development team, delivered an excellent session on new query optimizer features in the DB2 10 environment. These goodies include:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;A new option allows you to get dynamic statement cache matching for SQL statements that have different literal values but are otherwise identical.&lt;/span&gt; Prior to DB2 10, matching was possible only for statements that were identical on a byte-for-byte basis: such statements would either contain parameter markers or identical literal values. The CPU efficiency attained through statement matching with different literals won't be quite what you get when statements containing parameter markers are matched, but it should be pretty close.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;How about PERSISTENT caching of prepared dynamic SQL statements?&lt;/span&gt; Dynamic statement caching is great, but when a prepared statement gets flushed out of the cache, it's gone, right? Not any more, folks. DB2 10 (in New Function Mode) will provide an access path repository in the catalog that will enable you to stabilize -- in a long-term way -- access paths for dynamic SQL statements (a "game-changer," Terry called it). When this option is in effect, DB2 will look first to match an incoming dynamic statement with a statement in the repository, then (if not found in the repository) in the dynamic statement cache. If neither of these matching attempts is successful, DB2 will dynamically prepare the statement. Want to change a path in the repository? You'll be able to do so by rebinding at the query level. 
By the way, the repository will also enable a more-robust implementation of DB2's access path hint functionality: it will be possible to put a hint into the repository, so you'll no longer have to provide a query number value in order to use a hint.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Mass rebind? No problem.&lt;/span&gt; A new REBIND option, APREUSE(YES), will instruct DB2 to generate a new control structure for a package (to take advantage of a service fix, for example) while retaining the existing access path, if possible. If the package's old access path can't be reused for some reason, a new one will be generated. And, speaking of new and different access paths, another DB2 10-delivered REBIND option, APCOMPARE(ERROR), can be used to tell DB2 to issue an error message if a rebind operation changes an access path (you can optionally have DB2 issue a warning instead of an error). Going forward, when you want to do a mass rebind of packages as part of a version-to-version DB2 migration, you may well want to do your rebinds with APREUSE(YES) and APCOMPARE(ERROR).&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;More user-friendly access plan stability.&lt;/span&gt; Lots of people like the access plan stability capability that was delivered with DB2 9 for z/OS via the new PLANMGMT option of the REBIND command. Nice as that is, it could be a hassle trying to get information about a version of a package's access plan other than the one currently in use. 
DB2 10 will address that problem with a new catalog table, SYSPACKCOPY, that will provide metadata for previous and original copies of access plans.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Playing it safe when choosing access paths.&lt;/span&gt; DB2 has always gone for lowest cost when choosing an access path for a query. Sometimes, that can be a problem for a statement with one or more host variables in its predicates, as the path identified as lowest-cost might result in really bad performance for certain variable values. The DB2 10 optimizer, older and wiser, will consider risk (i.e., the chance of getting poor performance for certain statement variable values) as well as cost in determining the optimal path for a SQL statement.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Staying with the RID option.&lt;/span&gt; Some folks cringe when DB2 starts down the path of using a RID list in executing a query (perhaps for multi-index access), and then switches to a tablespace scan because a RID limit was reached. DB2 10 can overflow a big RID list to a workfile and keep on trucking. Will you need more workfile space as a result? Perhaps, but note that the spillover effect is mitigated by a new larger default size for the RID pool in a DB2 10 environment.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;In-list advances.&lt;/span&gt; DB2 10 extends its predicate transitive closure capability (the ability to determine a relationship between A and C based on A-B and B-C relationships) to in-list predicates. 
DB2 10 can also use matching index access for multiple in-list predicates in a query (prior to DB2 10, if a query had several in-list predicates, only one of these could be used for a matching index scan). And one more thing: DB2 10 can take several OR-connected predicates that match on one index and convert them to a single in-list predicate to generate a result set (that's more efficient than using "index ORing" for the predicates, as is done in a pre-10 DB2 system).&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Query parallelism enhancements.&lt;/span&gt; With DB2 10, you can get parallelism for multi-row FETCH operations (though not for an ambiguous cursor). DB2 10 also enables parallel query tasks to share workfiles. And, in DB2 10, something called "dynamic record range partitioning" can be used to cause data in table A to be split into partitions that "line up" with the partitions of table B, the result being improved parallel table-join processing. This does introduce a data sort, but the expectation is that the technique will be used when table A is on the small side, so the sort shouldn't be a big deal.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;A RUNSTATS efficiency boost. &lt;/span&gt;It used to be that sampling provided some help in reducing the CPU cost of a RUNSTATS utility job. With DB2 10, sampling provides a LOT of help in this department, because the sampling percentage now applies to the percentage of pages examined (it used to refer to the percentage of data rows examined in gathering statistics for non-indexed columns -- you could sample 25% of the rows, but end up accessing ALL of the tablespace's pages). 
What's more, there's an optional specification that you can use to tell DB2 to figure out the sampling percentage to use for an execution of RUNSTATS.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Where to go for more DB2 10 information.&lt;/span&gt; IBM's Roger Miller has provided information about DB2 10 presentations available on IBM's Web site. They are in a folder accessible via this url: &lt;a href="ftp://public.dhe.ibm.com/software/data/db2/zos/presentations/v10-new-function/"&gt;ftp://public.dhe.ibm.com/software/data/db2/zos/presentations/v10-new-function/&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;In this folder, you'll find these presentations that were delivered at the IDUG conference in Tampa:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Session A01: DBA improvements, by Roger Miller&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Session B02: What's new from the optimizer, by Terry Purcell&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Session A03: DB2 10 Performance Preview, by Akiko Hoshikawa&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Session A06: DB2 and System z Synergy, by Chris Crone&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Session A08: DB2 10 Availability Enhancements, by Haakon Roberts&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;You'll also find these presentations from the recent IBM Information on Demand European Conference:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Session 2908: DB2 10 for z/OS security features help satisfy your auditors, by Jim Pickel&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Session 2894: What’s coming from the optimizer in DB2 10 for z/OS, by Terry Purcell&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span 
style="font-family: arial;"&gt;Session 3010: New pureXML Features in DB2 for z/OS: Breaking Relational Limits, by Guogen (Gene) Zhang&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;I'll be back here in a few days with still more from my notes taken at the IDUG Conference in Tampa. Ciao for now.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-2429884496563981162?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/2429884496563981162/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/06/nuggets-from-db2-by-bay-part-3.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2429884496563981162'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2429884496563981162'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/06/nuggets-from-db2-by-bay-part-3.html' title='Nuggets from DB2 by the Bay, Part 3'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-372362531536139588</id><published>2010-05-24T21:52:00.000-07:00</published><updated>2010-05-24T22:08:38.162-07:00</updated><title type='text'>Nuggets from DB2 by the Bay, Part 2</title><content type='html'>&lt;span style="font-family: 
arial;"&gt;More items of information from the 2010 International DB2 Users Group North American Conference, held earlier this month in Tampa (in fact by the bay -- the convention center is on the waterfront).&lt;br /&gt;&lt;br /&gt;IBM's Roger Miller delivered a session on DB2 for z/OS Version 10 (in beta release for the past couple of months) with typical enthusiasm (opening line: "This is a BIG version"). Some of his points:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Much attention was paid to making life easier for DBAs.&lt;/span&gt; Among the labor-saving features of DB2 10:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Automated collection of catalog stats, so you don't have to mess with the RUNSTATS utility if you don't want to.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Worry-free scale-up, with DB2 thread-related virtual storage now above the 2 GB "bar" in the DB2 database services address space. The number of concurrently active threads can go WAY up in a DB2 10 environment.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Access path stability, a favorite of many DB2 9 users, is enhanced. 
That makes for worry-free rebinding of packages (and you'll want to rebind to get the aforementioned below-the-bar virtual storage constraint relief, and to get the benefit of optimizer improvements).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Reduced catalog contention will allow for more concurrency with regard to CREATE/ALTER/DROP activity.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;The ability to build a tablespace compression dictionary on the fly removes what had been a REORG utility execution requirement.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Resiliency, efficiency, and growth:&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;DB2 9 for z/OS gave us some nice utility performance enhancements (referring particularly to reduced CPU consumption). DB2 10 delivers significant CPU efficiency gains for user application workloads (batch and OLTP).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;More-granular DB2 authorities enable people to do their jobs while improving the safeguarding of data assets. An example is SECADM, which does not include data access privileges.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;"Release 2" of DB2's pureXML support improves performance for applications that access XML data stored in a DB2 database.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;An "ALTER-then-REORG" path makes it easier to convert existing tablespaces to the universal tablespace type introduced with DB2 9.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Counting down.&lt;/span&gt; Roger's 10 favorite DB2 10 features:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;10. 
(tie) Hash access to data records (do with one GETPAGE what formerly might have required five or so GETPAGEs), and index "include" columns (define a unique index, then include one or more additional columns to improve the performance of some queries while retaining the original uniqueness-enforcement characteristic of the index).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;9. Improved XML data management performance and usability.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;8. Improved SQL portability.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;7. Support for temporal (i.e., "versioned") data (something that previously had to be implemented by way of application code).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;6. The new, more-granular security roles.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;5. More online schema changes.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;4. Better catalog concurrency.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;3. 5X-10X more concurrent users due to the removal of memory constraints.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;2. CPU cost reductions for DB2-accessing application programs.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;1. 
Productivity improvements.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;You get a lot of benefits in DB2 10 conversion mode:&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;More CPU-efficient SQL.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;64-bit addressing support for more of the EDM pool and for DB2 runtime structures (thread-related virtual storage usage). You'll need to rebind to get this benefit.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Improved efficiency of single-row retrieval operations outside of singleton SELECTs, thanks to OPEN/FETCH/CLOSE chaining.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Distributed thread reuse for high-performance database access threads (aka DBATs).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Improved elapsed times for insert operations, thanks to parallelized index updates (for tables on which multiple indexes have been defined).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Support for 1 MB pages.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Access path enhancements, including the ability to get index matching for multiple in-list predicates in a query.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;More query parallelism (good for zIIP engine utilization).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;More avoidance of view materialization (good for efficiency).&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;More stuff:&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Dynamic statement cache hits for statements 
that are identical except for the values of literals (this requires the use of a new attribute setting).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;CPU efficiency gains of up to 20% for native SQL procedures (you regenerate the runtime structure via drop and recreate).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Hash access to data rows.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Index include columns.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;In-line LOBs (storing of smaller LOB values in base table rows). Roger called these smaller LOBs "SLOBs." LOBs stored in-line in a compressed tablespace will be compressed. In-line storage of LOBs will require a universal tablespace that's in reordered row format (RRF). Said Roger: "RRF is the future."&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Universal tablespaces can be defined with the MEMBER CLUSTER attribute (good for certain high-intensity insert operations, especially in data sharing environments).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;"ALTER-then-REORG" to get to a universal tablespace, to change page size, to change DSSIZE (size of a tablespace partition), and to change SEGSIZE. 
With respect to "ALTER-then-REORG," you'll have the ability to reverse an "oops" ALTER (if you haven't effected the physical change via REORG) with an ALTER TABLESPACE DROP PENDING CHANGES.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Online REORG for all catalog and directory tablespaces.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Scalability improvements:&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Reduced latch contention.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;A new option that lets data readers avoid having to wait on data inserters.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Much more utility concurrency.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;64-bit common storage to avoid ECSA constraints.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Package binds, data definition language processes, and dynamic SQL can run concurrently.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;The skeleton package table in the directory will use LOBs. 
With CLOBs and BLOBs in the DB2 directory, the DSN1CHKR utility won't be needed because there won't be any more links to maintain.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;SMF records produced by DB2 traces can be compressed: major space savings (maybe 4:1) with a low cost in terms of overhead.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;"Monster" buffer pools can be used with less overhead.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;You'll be able to dynamically add active log data sets.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;You'll be able to grant DBADM authority to an ID for all databases, versus having to do this on a database-by-database basis.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;More catalog and directory changes:&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;The catalog and directory will utilize partition-by-growth universal tablespaces (64 GB DSSIZE).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;There will be more tablespaces (about 60 more).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Row-level-locking will be used.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;The objects will be DB2-managed and SMS-controlled.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;It really is a BIG version -- and there's still more to it (I've just provided what I captured in my note-taking during Roger's session). More nuggets to come in my part 3 post about DB2-by-the-bay. 
Stay tuned.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-372362531536139588?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/372362531536139588/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/05/nuggets-from-db2-by-bay-part-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/372362531536139588'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/372362531536139588'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/05/nuggets-from-db2-by-bay-part-2.html' title='Nuggets from DB2 by the Bay, Part 2'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-369496412012032324</id><published>2010-05-17T05:54:00.000-07:00</published><updated>2010-05-17T06:35:23.887-07:00</updated><title type='text'>Nuggets from DB2 by the Bay, Part 1</title><content type='html'>&lt;span style="font-family: arial;"&gt;I had to smile when I saw the thread that Ed Long of Pegasystems started the other day on the &lt;a href="http://www.idug.org/cgi-bin/wa?A0=DB2-L"&gt;DB2-L discussion list&lt;/a&gt;. The subject line? "IDUG radio silence." 
Here it was, IDUG NA week, with the North American Conference of the International DB2 Users Group in full swing in Tampa, Florida, and the usual blogging and tweeting of conference attendees was strangely absent. What's with that? Well, I'll break the silence (thanks for the inspiration, Ed), and I'll start by offering my theory as to why the level of conference-related electronic communication was low: we were BUSY.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;Busy&lt;/span&gt;. That's the word that comes first to mind when I think of this year's event, and I mean it in a good way. At last year's conference in Denver, the mood was kind of on the down side. Attendance was off, due largely to severe cutbacks in organizations' training and travel budgets - a widespread response to one bear of an economic downturn. Those of us who were able to make it to last May's get-together swapped stories with common themes: How tough is it on you? How many people has your company cut? How down is your business? A lot of us were in batten-down-the-hatches mode, and it was hard to get the ol' positive attitude going.&lt;br /&gt;&lt;br /&gt;What a difference a year makes. The vibe at the Tampa Convention Center was a total turnaround from 2009. Attendance appeared to be up significantly, people were smiling, conversation was upbeat and animated, and there was this overall sense of folks being on the move: heading to this session or that one, flagging someone down to get a question answered, lining up future business, juggling conference activities with work-related priorities -- stuff that happens, I guess, at every conference, but it seemed to me that the energy level was up sharply versus last May. To the usual "How's it going" question asked of acquaintances not seen since last year, an oft-heard response was: "Busy!" 
To be sure, some folks (and I can relate) are crazy busy, trying to work in some eating and sleeping when the opportunity arises, but no one seemed to be complaining. It felt OK to be burning the candle at both ends after the long dry spell endured last year. Optimism is not in short supply, and I hope these positive trends will be sustained in the months and years to come.&lt;br /&gt;&lt;br /&gt;In this and some other entries to come (not sure how many -- probably another couple or so) I'll share with you some nuggets of information from the conference that I hope you'll find to be interesting and useful. I'll start with the Tuesday morning keynote session.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The data tsunami: big challenges, but big opportunities, too.&lt;/span&gt; The keynote speaker was Martin Wildberger, IBM's VP of Data Management Software Development. He started out talking about the enormous growth in the amount of data that organizations have to manage -- this on top of an already-enormous base. He showed a video with comments by some of the leading technologists in his group, and one of those comments really stuck with me (words to this effect): "You might think that the blizzard of data coming into an organization would blind you, but in fact, the more data you have, the clearer you see." Sure, gaining insight from all that data doesn't just happen -- you need the right technology and processes to &lt;span style="font-style: italic;"&gt;make&lt;/span&gt; it happen -- but the idea that an organization can use its voluminous data assets to see things that were heretofore hidden -- things that could drive more revenue or reduce costs -- is compelling. As DB2 people, we work at the foundational level of the information software "stack." There's lots of cool analytics stuff going on at higher levels of that stack, but the cool query and reporting and cubing and mining tools just sit there if the database is unavailable. 
And, data has to get to decision-makers fast. And, non-traditional data (images, documents, XML) has to be effectively managed right along with the traditional numbers and character strings. Much will be demanded of us, and that's good (it'll keep us &lt;span style="font-style: italic;"&gt;busy&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;Martin mentioned that IBM's overall information management investment priorities are aimed at helping organizations to:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Lower costs&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Improve performance&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Reuse skills&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Reduce risk&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Reduce time-to-value&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Innovate&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;He talked up IBM's partnership with Intel, IBM's drive to make it easier for companies to switch to DB2 from other database management systems (especially Oracle), and the "game-changing" impact of DB2 &lt;a href="http://catterallconsulting.blogspot.com/2009/10/wow-db2-data-sharing-comes-to-aixpower.html"&gt;pureScale&lt;/a&gt; technology, which takes high availability in the distributed systems world to a whole new level. Martin also highlighted the Smart Analytics Systems, including the 9600 series, a relatively new offering on the System z platform (this is a completely integrated hardware/software/services package for analytics and BI -- basically, the "appliance" approach -- that has been available previously only on the IBM Power and System x server lines). 
There was also good news on the cloud front: DB2 is getting a whole lot of use in Amazon's cloud.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;DB2 10 for z/OS: a lot to like.&lt;/span&gt; John Campbell, an IBM Distinguished Engineer with the DB2 for z/OS development organization, took the stage for a while to provide some DB2 10 for z/OS highlights (this version of DB2 on the mainframe platform is now in Beta release):&lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;CPU efficiency gains.&lt;/span&gt; &lt;/span&gt;&lt;span style="font-family: arial;"&gt;For programs written in SQL Procedural Language, or SQL PL (used to develop "native" SQL procedures and -- new with DB2 10 -- SQL user-defined functions), CPU consumption could be reduced by up to 20% versus DB2 9.&lt;/span&gt;&lt;span style="font-family: arial;"&gt; Programs with embedded SQL could see in-DB2 CPU cost (CPU cost of SQL statement execution) reduced by up to 10% versus DB2 9, just by being rebound in a DB2 10 system.&lt;/span&gt;&lt;span style="font-family: arial;"&gt; High-volume, concurrent insert processes could see in-DB2 CPU cost reductions of up to 40% in a DB2 10 system versus DB2 9.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;64-bit addressing for DB2 runtime structures.&lt;/span&gt; John's "favorite DB2 10 feature." With DB2 thread storage going above the 2 GB virtual storage "bar" in a DB2 10 system (after a rebind in DB2 10 Conversion Mode), people will have options that they didn't before (greater use of the RELEASE(DEALLOCATE) bind option, for one thing). DB2 subsystem failures are rare, but when they do happen it's often because of a virtual storage constraint problem. 
DB2 10 squarely addresses that issue.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Temporal data.&lt;/span&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt; &lt;/span&gt;This refers to the ability to associate "business time" and "system time" values with data records. John pointed out that the concept isn't new. What's new is that the temporal data capabilities are in the DB2 engine, versus having to be implemented in application code.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Getting to universal.&lt;/span&gt; John pointed out that DB2 10 would provide an "ALTER-then-REORG" path to get from segmented and partitioned tablespaces to universal tablespaces.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Access plan stability. &lt;/span&gt;This is a capability in DB2 10 that can be used to "lock down" access paths for static AND dynamic SQL.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Enhanced dynamic statement caching.&lt;/span&gt; In a DB2 10 environment, a dynamic query with literals in the predicates can get a match in the prepared statement cache with a statement that is identical except for the literal values (getting a match previously required the literal values to match, too).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;span style="font-family: arial;"&gt; &lt;span style="font-weight: bold;"&gt;DB2 for LUW performance.&lt;/span&gt; John was followed on stage by Berni Schiefer of the DB2 for Linux/UNIX/Windows (LUW) development team. 
Berni shared some of the latest from the DB2 for LUW performance front:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Performance PER CORE is not an official TPC-C metric, but it matters, because the core is the licensing unit for LUW software. It's on a per-core basis that DB2 for LUW performance really shines versus Sun/Oracle and HP systems.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;SAP benchmarks show better performance versus competing platforms, with FEWER cores.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;TPC-C benchmark numbers show that DB2 on the IBM POWER7 platform excels in terms of both performance (total processing power) AND price/performance.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;DB2 is number one in terms of Windows system performance, but the performance story is even better on the POWER platform.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Berni pointed out that DB2 is the ONLY DBMS that provides native support for the DECFLOAT data type (based on the IEEE 754r standard for decimal floating point numbers). The POWER platform provides a hardware boost for DECFLOAT operations.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;DB2 for LUW does an excellent job of exploiting flash drives.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Back to Martin for a keynote wrap-up.&lt;/span&gt; Martin Wildberger came back on stage to deliver a few closing comments, returning to the topic of DB2 pureScale. pureScale is a distributed systems (AIX/POWER platform) implementation of the shared data architecture used for DB2 for z/OS data sharing on a parallel sysplex mainframe cluster. That's a technology that has delivered the availability and scalability goods for 15 years. 
So now DB2 for AIX delivers top-of-class scale-up AND scale-out capabilities.&lt;br /&gt;&lt;br /&gt;Martin closed by drawing attention to the IBM/IDUG &lt;a href="http://www.idug.org/do-you-db2/do-you-db2-front-page.html"&gt;"Do You DB2?"&lt;/a&gt; contest. Write about your experience in using DB2, and you could win a big flat-screen TV. If you're based in North America, check it out (this initial contest does have that geographic restriction).&lt;br /&gt;&lt;br /&gt;More nuggets from IDUG in Tampa to come in other posts. Gotta go now. I'm BUSY.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-369496412012032324?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/369496412012032324/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/05/nuggets-from-db2-by-bay-part-1.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/369496412012032324'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/369496412012032324'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/05/nuggets-from-db2-by-bay-part-1.html' title='Nuggets from DB2 by the Bay, Part 1'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-1445902440676281203</id><published>2010-04-28T18:31:00.001-07:00</published><updated>2010-04-28T19:21:30.872-07:00</updated><title type='text'>This blog has moved</title><content type='html'>&lt;br /&gt;       This blog is now located at http://catterallconsulting.blogspot.com/.&lt;br /&gt;       You will be automatically redirected in 30 seconds, or you may click &lt;a href='http://catterallconsulting.blogspot.com/'&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;       For feed subscribers, please update your feed subscriptions to&lt;br /&gt;       http://catterallconsulting.blogspot.com/feeds/posts/default.&lt;br /&gt;  &lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-1445902440676281203?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://catterallconsulting.blogspot.com/' title='This blog has moved'/><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/1445902440676281203/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/this-blog-has-moved.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1445902440676281203'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1445902440676281203'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/this-blog-has-moved.html' title='This blog has moved'/><author><name>Robert 
Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-1980670579226449235</id><published>2010-04-28T08:33:00.000-07:00</published><updated>2011-09-20T09:12:54.498-07:00</updated><title type='text'>Using DB2 Stored Procedures, Part 3: Schema Visibility</title><content type='html'>&lt;span style="font-family:arial;"&gt;This is the third of a three-part entry on various factors you could consider in determining how (or whether) DB2 stored procedure technology might be employed in your organization's application environment. In &lt;a href="http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-1.html"&gt;part one&lt;/a&gt; I described some advantages associated with client-side and server-side SQL and pointed out that, if you like your SQL on the database server side of things (and I do), DB2 stored procedures can help you go that route. In &lt;a href="http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-2.html"&gt;part two&lt;/a&gt;, I examined stored procedure usage from the static SQL angle, noting that stored procedures provide a way for client-side programs to easily -- and dynamically, if needs be -- invoke static SQL statements. 
In this final entry in the series, I'll look at stored procedures as a means of limiting database schema visibility in an application development sense.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;"Schema" clarified.&lt;/span&gt; In a strictly DB2 context, the term "schema" essentially refers to a categorizing mechanism -- a way to logically set apart a set of database objects through the use of a common high-level qualifier in the objects' names. So, for example, I could have a REGION4 schema that might include tables, views, and stored procedures -- all with a high-level qualifier of REGION4 (table REGION4.SALES, stored procedure REGION4.NEW_ORDER, etc.). Schemas are handy because they allow me to code SQL statements with unqualified object names and to use these statements with a particular set of objects by supplying the schema name at package bind time (for static SQL) or via the CURRENT SCHEMA special register (for dynamic SQL).&lt;br /&gt;&lt;br /&gt;With that said, this is NOT the way I'm using "schema" in this blog entry. Instead, I'm using the term in the more general sense, as a reference to the design of a database: its tables, columns, relationships, etc. There was a time, not so long ago, when people developing application programs that would retrieve or change DB2-managed data had to know a good bit about the design of the target database. In recent years, application architecture has shifted from monolithic to layered and service-oriented, and today's applications often have loosely-coupled tiers that broadly concern user interface (UI), business logic, and data access functionality. In such an application environment, you should think about how "high up," within an application's functional layers, database schema visibility should extend. 
To put it another way: at what levels within the functional stack should programmers be required to have knowledge of the underlying database design?&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Who should know what?&lt;/span&gt; This isn't a question of turf I'm talking about -- it has nothing to do with database people saying, "Hey, developers! That's MY knowledge, and I'm not going to share it! Nyah, nyah!" Some of the more vocal advocates of database schema abstraction are application architects. I recall overhearing a top-notch programmer complaining to a colleague that visibility of the database schema extended all the way to the UI layer of an application in development. It wasn't that he couldn't deal with this -- he had excellent SQL skills, and he knew a lot about database design himself. The crux of this guy's argument was that he shouldn't HAVE to be concerned with details of the database design when coding at a higher level of the application.&lt;br /&gt;&lt;br /&gt;That programmer is not a whiner. He has a good point. For many application architects, the database schema is "plumbing" about which most developers shouldn't be concerned. If coders can retrieve, manipulate, and persist data without having to know a lot about the design of the underlying relational database, they can focus more of their attention on providing functionality needed by the business and by application users, thereby boosting their productivity. DB2 stored procedures can facilitate database schema abstraction: they provide data access services needed by (for example) business-layer application programmers, and they limit the requirement for database design knowledge to the people who develop the stored procedures. [I should point out here that some people consider stored procedures themselves to be part of the schema of a database, and that, as such, they should also be abstracted from the perspective of a consumer of stored procedure-provided services. 
These folks would argue that stored procedures should be invoked by higher-level programs not through SQL CALLs, but by way of something like Web services calls. A DB2 stored procedure can indeed be exposed as a Web service, and going with this approach would involve weighing the benefit of more complete schema abstraction against the additional path length introduced by the Web services call on top of the SQL CALL for stored procedure invocation.]&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;A benefit for DBAs, too.&lt;/span&gt; When business-logic developers code to the schema of a database, the flexibility that DBAs have to effect performance-enhancing database design changes can be severely limited. I was once involved in a discussion among some DB2 DBAs about a database design change that had the potential to significantly improve the CPU cost-efficiency of a key application process. The proposed modification was ultimately shelved because one of the DB2 tables that would be redesigned was accessed by something like 1000 different application programs, and changing all that code was not a feasible proposition. When stored procedures provide access to the database, schema changes tend to be more do-able -- yes, code in affected stored procedures has to be altered, but the scope of this effort will often be less than that which would be faced if business-logic programs directly referenced database tables. 
If a given stored procedure services 10 business-logic programs, changing the one stored procedure to accommodate a database schema change looks better to me than modifying the 10 business-logic programs that would have to be updated were the stored procedure not serving their data access needs.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;And don't forget security.&lt;/span&gt; Database schema information -- table names, column names, entity relationships -- can be of value to someone who wants to gain access to your organization's data in an unauthorized manner. The more limited the dissemination of database design details, the less likely it is that this information will be used for malicious purposes. If your DB2 database is accessed by way of stored procedures, only stored procedure developers need detailed knowledge of the database schema. It's a good risk mitigation move.&lt;br /&gt;&lt;br /&gt;So, there's my reason number three for using DB2 stored procedures: they enable database schema abstraction, which 1) allows business-logic and UI developers to focus less on application "plumbing" and more on application functionality, 2) provides DBAs with more flexibility in terms of implementing performance-improving database design changes, and 3) limits the spread of detailed database design information that could potentially be used by someone with ill intent. 
I hope that you'll find a reason to put DB2 stored procedures to work at your organization.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-1980670579226449235?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/1980670579226449235/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-3.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1980670579226449235'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1980670579226449235'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-3.html' title='Using DB2 Stored Procedures, Part 3: Schema Visibility'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-2093121406295697728</id><published>2010-04-19T19:39:00.000-07:00</published><updated>2011-09-20T09:10:18.364-07:00</updated><title type='text'>Using DB2 Stored Procedures, Part 2: The Static SQL Angle</title><content type='html'>&lt;span style="font-family:arial;"&gt;This is the second of a three-part entry on various factors you could consider in determining how (or whether) DB2 stored procedure technology might be employed in your 
organization's application environment. In &lt;a href="http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-1.html"&gt;part one&lt;/a&gt; I described some advantages associated with client-side and server-side SQL and pointed out that, if you like your SQL on the database server side of things (and I do), DB2 stored procedures provide a very useful means of going that route. In this entry I'll examine stored procedure usage in a static SQL context. In part three, I'll look at how your use of stored procedures might be influenced by your desire for more, or less, database schema visibility from an application development perspective.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The pros of static SQL. &lt;/span&gt;Static SQL is, as far as I know, an exclusively DB2 concept, so if you're kind of new to DB2 you might want to check out &lt;a href="http://catterallconsulting.blogspot.com/2010/01/some-basic-information-about-sql-in-db2.html"&gt;the blog entry that I posted a few weeks ago&lt;/a&gt;, on the basics of SQL (including static SQL) in DB2-accessing programs. Those of you who are familiar with static SQL probably know that it delivers two important advantages versus dynamic SQL:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Better performance&lt;/span&gt; -- In particular, I'm talking about the database server CPU consumption aspect of application performance. The CPU efficiency advantage delivered by static SQL is, of course, based on the fact that preparation of static SQL statements (referring to preparation for execution by DB2, which includes things such as data access path selection by the DB2 optimizer) is accomplished before the statements are ever issued for execution by the associated application program. 
This is important because for some simple, quick-running SQL statements, the CPU cost of statement preparation can be several times the cost of statement execution. Can dynamic statement caching reduce the database-server-CPU-cost gap between static and dynamic SQL? Yes, but even if you have repeated execution of parameterized dynamic SQL statements (so as to have a high "hit ratio" in the prepared statement cache), generally the best you can hope for is database server CPU consumption that approaches -- but doesn't match -- the CPU cost of an equivalent static SQL workload (the effectiveness of dynamic statement caching gets a boost with DB2 10 for z/OS, which provides for better prepared statement cache matching for dynamic SQL statements that include literal values -- versus parameter markers -- in query predicates).&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;More-robust security&lt;/span&gt; -- For a dynamic SQL statement to execute successfully, the authorization ID of the application process issuing the statement has to have read (for queries) and/or data-change privileges on the tables (and/or views) named in the statement. Such is not the case for a static SQL statement. All that is needed, authorization-wise, for successful execution of a static SQL statement is the execute privilege on the statement-issuing program's DB2 package.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;Now, I'll concede that there are times when dynamic SQL is a very good choice for an application program. A developer at one of my clients had to write a program that would retrieve information from a DB2 database according to a combination of search arguments that a user could enter on a screen. There were quite a few search options, and coding a static DECLARE CURSOR statement for every possible combination would have been a major piece of work. 
The developer opted instead to have his program build the SELECT statement dynamically, based on the choices entered by the user. This was a good choice, as it enabled him to deliver the needed functionality in a much more timely manner than would have been the case had he opted for the static SQL approach. Still, with exceptions such as this one duly noted, my preference is to go with static SQL in most cases, especially in OLTP and batch application environments (dynamic SQL often dominates in a data warehouse system, and that's OK).&lt;br /&gt;&lt;br /&gt;Can static SQL statements be issued from client-side application programs? The answer to that question depends on the programming language used. If a client-side program is written in assembler, C, C++, COBOL, Fortran, PL/I or REXX, it can issue static SQL statements (for assembler and PL/I, the client platform has to be a mainframe). If the language is Java, static SQL can be used by way of SQLJ. Another Java option for static SQL is IBM's Optim pureQuery Runtime product, which can also enable the use of static SQL when client-side programs are written in VB.NET or C#. If the language used on the client side is Perl, Python, PHP, or Ruby, the only possibility is dynamic SQL.&lt;br /&gt;&lt;br /&gt;This is where DB2 stored procedures come in. Suppose you want to use static SQL for your application, but you want to use, say, Ruby on the client side? Or, suppose you're using Java on the client side, but your Java developers don't want to use SQLJ (and you don't have Optim pureQuery Runtime)? Well, how about putting the static SQL in DB2 stored procedures, and having your client-side programs invoke these by way of dynamic CALLs? You get the performance and security benefits of static SQL, and your client-side developers get to use their language (and data-access interface) of choice. 
Everybody's happy!&lt;br /&gt;&lt;br /&gt;That's reason number two for using DB2 stored procedures: they provide a means whereby static SQL can be easily -- and dynamically, if needs be -- invoked by client-side programs. Stop by the blog again soon for my reason number three.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-2093121406295697728?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/2093121406295697728/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-2.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2093121406295697728'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2093121406295697728'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-2.html' title='Using DB2 Stored Procedures, Part 2: The Static SQL Angle'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-8141173279718113035</id><published>2010-04-16T20:02:00.000-07:00</published><updated>2010-04-18T21:50:52.423-07:00</updated><title type='text'>Looking Forward to DB2 by the Bay</title><content type='html'>&lt;span style="font-family:arial;"&gt;That 
would be Tampa Bay, and I'm talking about the International DB2 Users Group North America Conference that will be held May 10 - 14 at the Tampa Convention Center (you can get all the details at &lt;a href="http://www.idug.org/db2-north-american-conference/idug-2010-north-america.html"&gt;IDUG's Web site&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;I've participated in every IDUG North America conference since 1997, and I've always found these events to be very well worth my time and dollars. I'll be delivering a presentation on &lt;a href="http://www.idug.org/ocs/index.php/NA10/NA10/paper/view/784"&gt;mainframe DB2 data warehousing&lt;/a&gt; at this year's conference, and I'll also participate, along with Paul Zikopoulos of IBM's DB2 for Linux/UNIX/Windows (LUW) development organization, in a &lt;a href="http://www.idug.org/ocs/index.php/NA10/NA10/paper/view/803"&gt;"Face 2 Face"&lt;/a&gt; interactive-discussion session on DB2 Data Warehousing (such sessions were known in the past as SIGs, or Special-Interest Groups).&lt;br /&gt;&lt;br /&gt;Why do I attend the IDUG North American Conference every year? There are lots of reasons. For one thing, I really enjoy hearing the latest about DB2 technology from folks in the IBM DB2 development organization -- people like Curt Cotner, IBM Fellow and DB2 Chief Technology Officer; Terry Purcell, Mr. 
DB2 for z/OS Optimizer; Jeff Josten, from whom I've learned so much about DB2 data sharing; John Campbell, who brings a wealth of lessons learned working with early implementers of new DB2 for z/OS releases; Guy Lohman, a Big Thinker (and doer) from IBM's Almaden Research Center; Matt Huras, Chief Architect of DB2 for LUW; Chris Eaton, who always delivers a ton of DB2 for LUW "news you can use"; and Leon Katsnelson, DB2 for LUW jock and cloud computing savant (you can see the whole &lt;a href="http://www.idug.org/ocs/index.php/NA10/NA10/schedConf/schedule#schedule"&gt;conference schedule&lt;/a&gt; on the IDUG Web site).&lt;br /&gt;&lt;br /&gt;In addition to learning lots from IBM's DB2 top guns, I get a boatload of great information from fellow DB2 consultants who present at the conference: Bonnie Baker, Dave Beulke, Sheryl Larsen, Susan Lawson, Dan Luksetich, and Fred Sobotka, just to name a few. Also among the speakers are professional DB2 instructors like Themis's David Simpson, and technical experts from leading vendors of DB2 tools, such as Phil Grainger from Cogito, Steen Rasmussen from CA, and Rick Weaver from BMC.&lt;br /&gt;&lt;br /&gt;And then you have the user presentations. These are what really make IDUG a special conference. You can't beat the in-the-trenches experiences shared by Dave Churn of DST Systems, Rob Crane of FedEx, John Mallonee of Highmark, Bernie O'Connor of Anixter, Bryan Paulsen of John Deere, Billy Sundarrajan of Fifth Third Bank, and others who work where the DB2 rubber meets the road.&lt;br /&gt;&lt;br /&gt;Of course, I also learn plenty thanks to encounters in the "coffee track" -- a reference to the refreshment and meal breaks during which I'm likely to get in on conversations amongst people who are dealing with challenges and issues that are of particular interest to me. These networking opportunities alone are almost worth the price of admission. 
The learning continues in the exhibitors hall, where I can catch up on the latest DB2-related offerings from a wide variety of vendors (and pick up a few t-shirts, pens, and notebooks, to boot).&lt;br /&gt;&lt;br /&gt;As if all that weren't enough, the IDUG North American Conference is a great place to take a DB2 certification exam (free for attendees) and get some hands-on at any of several lab sessions.&lt;br /&gt;&lt;br /&gt;The location's a winner, too. IDUG negotiated a special rate at the Marriott Waterside Hotel, right next to the convention center, and some of those discounted rooms are, I think, &lt;a href="http://www.marriott.com/hotels/travel/tpamc-tampa-marriott-waterside-hotel-and-marina/?toDate=5/16/10&amp;amp;groupCode=iduidua&amp;amp;fromDate=5/7/10&amp;amp;app=resvlink"&gt;still available&lt;/a&gt;. Tampa's a great town. Personally, I like to get up in the morning and take a run along Bayshore Boulevard, just outside of downtown. Here you'll find the world's longest continuous sidewalk (six miles), so there are no worries about cars -- just cruise along with the waters of Tampa Bay on your left (if outbound) and the beautiful historic homes of the Hyde Park neighborhood on your right. There's plenty of nightlife in Ybor City, a short streetcar ride away. Great Cuban food, fresh seafood, palm trees -- all this, and a great conference, too? Who wouldn't want to be there? I hope some of you will be able to attend. 
Find me and say hi if you do.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-8141173279718113035?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/8141173279718113035/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/looking-forward-to-db2-by-bay.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/8141173279718113035'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/8141173279718113035'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/looking-forward-to-db2-by-bay.html' title='Looking Forward to DB2 by the Bay'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-3881878506979243896</id><published>2010-04-11T19:10:00.000-07:00</published><updated>2010-04-11T19:27:10.305-07:00</updated><title type='text'>Using DB2 Stored Procedures, Part 1: Where Do You Want Your SQL?</title><content type='html'>&lt;span style="font-family: arial;"&gt;Over the past few weeks, I've had a number of interesting discussions with people from various organizations on the subject of DB2 stored procedures. 
These conversations have generally involved questions pertaining to the use of stored procedures in a DB2 environment. The question, "How should we use DB2 stored procedures?" will be answered differently by different organizations, based on their particular requirements and priorities. That said, there are some factors that ought to be considered in any review of DB2 stored procedure utilization. I'll cover these factors in a three-part entry beginning with this post, which will focus on the issue of client-side versus server-side SQL ("server" here referring to a database server, versus an application server). In part two I'll take a look at static versus dynamic SQL, and in part three I'll examine the issue of database schema visibility.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Some advantages of client-side SQL:&lt;/span&gt; I've heard various opinions regarding client-side versus server-side SQL (and from the DB2 data server perspective, "client-side" will usually refer to programs, running in an application server, that access a DB2 database). Proponents of having SQL data manipulation language, or DML, statements (i.e., SELECT, INSERT, UPDATE, and DELETE) issued by client-side programs cite, among other things, these benefits:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Data-consuming and data-retrieval programs can be implemented in a single, client-side deployment&lt;/span&gt; -- This as opposed to a client-side deployment of data-consuming programs and a server-side deployment of associated data retrieval programs. 
A "one-side" deployment can indeed require less in the way of coordination, with maybe one programming team involved, versus two, and one platform team -- of application server administrators -- with primary responsibility for system management (though it's hoped that DB2 DBAs would be in the loop, as well).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Data-consuming and data-retrieval programs are likely to be written in the same language.&lt;/span&gt; It's true that client-side and server-side programs are often coded using different languages, especially when the application and database servers are running on different platforms (it's common, for example, to have server-side SQL embedded in COBOL programs when the DB2 server is a mainframe, while data-consuming programs might be written in Java and might run on a Linux-based application server). When different programming languages are used on client and data-server platforms, communication between the respective development groups can be a little more challenging than it otherwise would be (particularly when an object-oriented language is used on the one side and a procedural language on the other).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Server-side SQL pluses:&lt;/span&gt; People who like their SQL on the server side favor this approach for several reasons, including the following:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;It makes it easy to use static instead of dynamic SQL.&lt;/span&gt; I'll cover static SQL and its connection with stored procedures in part two of this three-part entry. 
For now, I'll point out that, compared to dynamic SQL, static SQL generally delivers better performance (especially in terms of database server CPU time) and provides for more robust security (an authorization ID does not have to be granted access privileges on database tables in order to execute static SQL statements). Can static SQL be issued from client-side programs? Of course, but it's not always a straightforward matter, as I'll point out in my part-two post.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;It can increase the degree to which data-access code is reused.&lt;/span&gt; When data-access code is packaged in server-side programs, it's generally pretty easy to invoke that code from data-consuming programs written in a variety of languages and running on a variety of platforms (including the platform on which the DB2 data server is running). Is this technically possible when the data-access code is running on an application server? Yes, but -- as is the case regarding the use of static versus dynamic SQL -- I believe that data-access code reuse is a more straightforward proposition when that code runs on the data server.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;It makes it easier to utilize developers with different programming language skills for the same application project.&lt;/span&gt; Admittedly, this is only an advantage if your organization HAS groups of developers with expertise in different programming languages and WANTS to use them for the same project. If that is indeed your situation, server-side SQL could be the ticket for you. 
It's quite common for data-consuming programs written in languages such as Java and C# to invoke data-access programs written in COBOL or SQL (the latter being a reference to SQL stored procedures).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;It makes it easier to leverage the talents of skilled SQL coders.&lt;/span&gt; SQL becomes more and more rich with each release of DB2 (one indicator being the number of pages in the DB2 SQL Reference). That's a good thing, but it also means that SQL mastery becomes a taller order as time goes by. I've seen in recent years the emergence of professionals who consider themselves to be SQL programmers (this as opposed to, say, Java programmers who know SQL). I feel that server-side SQL facilitates the leveraging of individuals who have top-notch SQL skills.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;It improves CPU efficiency for transactions that issue multiple SQL DML statements.&lt;/span&gt; When SQL DML statements are issued from client-side programs, there are network send and receive operations associated with each such statement. These network trips increase application overhead when transactions issue multiple SQL DML statements -- overhead that is reduced when a client-side program invokes a server-side program that issues the multiple SQL DML statements locally to the DB2 server.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;The stored procedure angle:&lt;/span&gt; If you like your SQL on the DB2 server side of an application (I do), stored procedures provide a great way to go this route. 
They can be written in a number of languages (including, as mentioned, SQL), and they come with a nice familiarity factor, in that the stored procedure pattern is well-known to many client-side application developers, including those who have worked with DBMSs other than DB2. Note that a DB2 stored procedure program can declare and open a cursor in such a way that the result set rows can be fetched by the calling client program -- a useful feature for set-level data retrieval operations. [Mainframers should be aware that native SQL procedures, introduced with DB2 9 for z/OS, can run substantially on zIIP engines when called by DRDA clients through the DB2 distributed data facility -- a prime opportunity to put cost-effective zIIP MIPS to good use.] &lt;br /&gt;&lt;br /&gt;So, reason number one for using DB2 stored procedures: they are an excellent choice for the packaging of SQL statements in database server-side programs. My reasons two and three for using DB2 stored procedures will be explained in parts two and three of this three-part blog entry.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-3881878506979243896?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/3881878506979243896/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-1.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3881878506979243896'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3881878506979243896'/><link rel='alternate' type='text/html' 
href='http://catterallconsulting.blogspot.com/2010/04/using-db2-stored-procedures-part-1.html' title='Using DB2 Stored Procedures, Part 1: Where Do You Want Your SQL?'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-1831289072413071054</id><published>2010-03-25T14:34:00.000-07:00</published><updated>2011-09-20T09:06:10.023-07:00</updated><title type='text'>A Closer Look at DB2 9 for z/OS Index Compression</title><content type='html'>&lt;span style="font-family:arial;"&gt;In May of last year I posted &lt;a href="http://catterallconsulting.blogspot.com/2009/05/much-ado-about-db2-indexes-part-1.html"&gt;a blog entry&lt;/a&gt; that included some information about the index compression capability introduced with DB2 9 for z/OS. It's a good time, I think, to add to that information, and I'll do that by way of this post.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;How does DB2 do it?&lt;/span&gt; In that entry from last year, I noted that DB2 9 index compression is not dictionary-based, as is DB2 data compression (with dictionary-based compression, commonly occurring strings of data values are replaced with shorter strings, and this replacement is reversed when the data is accessed). For a tablespace defined with COMPRESS YES, DB2 will place as many compressed data rows as it can into a 4K page in memory (or an 8K or 16K or 32K page, depending on the buffer pool to which the tablespace is assigned), and the page size in memory is the same as the page size on disk. 
Index compression reduces space requirements on disk but not in memory: the size of a leaf page in a compressed index will be smaller on disk than in memory (only leaf pages are compressed, but the vast majority of most indexes' pages are leaf pages). Index compression is based on getting the contents of an 8K or 16K or 32K index leaf page in memory into a 4K page on disk, without using a dictionary (an index has to be assigned to an 8K or 16K or 32K buffer pool in order to be compressed). To do this, DB2 uses a combination of three compression mechanisms:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Prefix compression&lt;/span&gt;: Suppose you had a 3-column key on state, city, and telephone number. You might then have a LOT of duplicates of some combinations of column 1 and column 2 values (e.g., state = 'Texas' and city = 'Houston'). If compression is used for this index, DB2 will not repeatedly store those duplicate “prefix” values in leaf pages on disk; instead, DB2 will store a given key prefix once in a compressed leaf page on disk, and store along with that prefix the part of the whole-key key value that is different from one entry to the next (e.g., phone number = '713-111-2222', phone number = '713-222-3333', etc.). Note that while this example (for the sake of simplicity) presents a prefix that breaks along key-column lines, this is not a restriction. 
In other words, a prefix, in the context of prefix compression, can include just a portion of a key column value (for example, 'ROBERT' could be a prefix for the last names 'ROBERTS' and 'ROBERTSON').&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;RID list compression&lt;/span&gt;: If a given index key value has many duplicates, several of these duplicate values could be in rows that are located in the same page of a table. If the index is not compressed, the full RID (row ID) of each of these rows will be stored following the key value in a leaf page on disk, even though only one byte of that four- or five-byte RID (the byte that indicates the row's position in the data page) will be different from one value to the next in the RID chain (the page number, occupying 4 bytes for a partitioned tablespace with a DSSIZE of 4 GB or larger, and 3 bytes otherwise, will stay the same). If that index were to be compressed, DB2 would save space on disk by storing the multi-byte page number once in the RID chain, followed by the single-byte row location indicators, until the page number (or the key value) changes. This compression technique is particularly effective for indexes that have relatively low cardinality and are either clustering or have a high degree of correlation with the table's clustering key.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;In-memory-only key map&lt;/span&gt;: An uncompressed index leaf page contains a key map, which itself contains a 2-byte entry for each distinct key value stored in the page. If the index is compressed, this map will not be stored on disk (it will be reconstructed, at relatively low cost, when the leaf page is read into memory). 
This compression technique nicely complements the RID list compression mechanism, as it is most effective for high-cardinality indexes (especially those with short keys, as the more distinct key values a page holds, the more space the key map occupies).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;These compression techniques often deliver impressive results, with plenty of DB2 9 for z/OS users reporting disk savings of 50-70% after enabling compression for an index. Still, they have their limits, and when DB2 determines that a leaf page in memory already holds as much as can be compressed onto a 4K page on disk, it will stop placing entries in that page, even if that means letting a good bit of space go unused in the in-memory page. This is why you want to run the DSN1COMP utility for an index prior to compressing it. DSN1COMP will provide estimates of disk space savings on the one hand, and in-memory page space wastage on the other hand, that you could expect to see based on your choice of an 8K, 16K, or 32K page size for the to-be-compressed index. The right index page size will be the one that maximizes disk space savings while minimizing in-memory page space wastage.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Index compression overhead: it's about I/Os, not access.&lt;/span&gt; The differences in the way that data and indexes are compressed in a DB2 9 for z/OS environment lead to differences regarding the associated CPU overhead. First of all, data compression is hardware-assisted (it takes advantage of a microcode assist built into the System z server line) while index compression is not. 
Second, in the case of data compression, the overhead cost is paid when data is accessed in a buffer pool in memory, as rows are not decompressed until they are retrieved by DB2 on behalf of an application process (similarly, new or changed rows are compressed and placed in pages in memory as part of insert and update operations). For a compressed index, the overhead cost is incurred at I/O time, since pages are decompressed when read into memory and compressed when written to disk. So, once a leaf page of a compressed index is in memory, repeated accesses of that page will not involve additional overhead due to compression, whereas data compression overhead is incurred every time a row is retrieved from, or placed into, a page in memory. With respect to the I/O-related cost of compression, the situation is reversed: there is no additional overhead associated with reading a compressed data page into memory from disk, or writing such a page to disk, while for a compressed index the CPU cost of reading a leaf page from disk, or writing a changed leaf page to disk, will be higher than it would be for a non-compressed index. One take-away from this is that large buffer pools are a good match for compressed indexes, as fewer disk I/Os means lower compression overhead.&lt;br /&gt;&lt;br /&gt;This "pay at I/O time" aspect of the CPU cost of index compression has implications for &lt;span style="font-style: italic;"&gt;where&lt;/span&gt; that cost shows up. If the I/O is of the prefetch read variety, or a database write, the leaf page compression cost will be charged to the DB2 database services address space (aka DBM1). If it's a synchronous read I/O, index compression overhead will affect the class 2 CPU time of the application process for which the on-demand read is being performed. 
Thus, for an application that accesses index leaf pages that are read in from disk via prefetch reads (as might be the case for a batch job or a data warehouse query), the cost of index compression may appear to be close to zero because it's being paid by the DB2 database services address space.&lt;br /&gt;&lt;br /&gt;So, what kind of overhead should you expect to see, in terms of the in-DB2 CPU cost of applications that access compressed indexes? Because of the multiple variables that come into play, mileage will vary, but my expectation is that you'd see in-DB2 CPU consumption that would be higher by a single-digit percentage versus the non-compressed case. Remember: keep the I/Os down to keep that cost down.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-1831289072413071054?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/1831289072413071054/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/03/closer-look-at-db2-9-for-zos-index.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1831289072413071054'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1831289072413071054'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/03/closer-look-at-db2-9-for-zos-index.html' title='A Closer Look at DB2 9 for z/OS Index Compression'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-8382250854784646917</id><published>2010-03-17T20:03:00.000-07:00</published><updated>2010-03-17T20:27:14.821-07:00</updated><title type='text'>My Favorite DB2 9 for z/OS BI Feature</title><content type='html'>&lt;span style="font-family:arial;"&gt;A lot of the consulting work that I do relates to the use of DB2 for z/OS as a data server for business intelligence (BI) applications. DB2 9 for z/OS delivered a number of new features and functions that are particularly attractive in a data warehousing context, including index compression, index-on-expression, instead-of triggers, the INTERSECT and EXCEPT set operators, and the OLAP functions RANK, DENSE_RANK, and ROW_NUMBER. That's all great stuff, but my favorite BI-friendly DB2 9 feature -- and the subject of this blog entry -- is global query optimization.&lt;br /&gt;&lt;br /&gt;I'll admit here that I didn't fully appreciate the significance of global query optimization when I first heard about it. There was something about virtual tables, and correlating and de-correlating subqueries, and moving parts of a query's result set generation process around relative to the location of query blocks within an overall query -- interesting, but kind of abstract as far as I was concerned. Time and experience have brought me around to where I am today: a big-time global query optimization advocate, ready to tell mainframe DB2 users everywhere that this is one outstanding piece of database technology. 
I'll give you some reasons for my enthusiasm momentarily, but before doing that I'd like to explain how global query optimization works.&lt;br /&gt;&lt;br /&gt;Adarsh Pannu, a member of IBM's DB2 for z/OS development team, effectively summed up the essence of global query optimization when he said in a recent presentation that it's about "improved subquery processing." That really is it, in a nutshell, and here's the first of the BI angles pertaining to this story: queries that contain multiple subquery predicates are common in data warehouse environments. These subquery predicates might be in the form of non-correlated in-list subqueries, like this one:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;WHERE T1.C1 IN (SELECT C2 FROM T2 WHERE…)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Alternatively, a subquery predicate might be a match-checking correlated subquery like this one:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;WHERE EXISTS (SELECT 1 FROM T2 WHERE T2.C2 = T1.C1…)&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Prior to Version 9, in evaluating a query containing one or more subqueries DB2 would optimize each SELECT in isolation, without regard to the effect that the access path chosen for a subquery might have on the performance of the query overall. Furthermore, DB2 would handle subquery processing based on the position of the subquery within the overall query -- in other words, if a subquery were a good ways "down" in an overall query, it wouldn't be evaluated "up front" at query execution time. Using this approach, DB2 might well choose the optimal access path for each individual SELECT in a query, but the access path for the query overall could end up being quite sub-optimal in terms of performance.&lt;br /&gt;&lt;br /&gt;DB2 9 takes a different approach when it comes to optimizing a query containing one or more subquery predicates. 
For one thing, it will evaluate a subquery predicate in the context of the overall query in which the subquery appears. Additionally, DB2 9 can do some very interesting transformative work in optimizing the overall query. It might, for example, change a correlated subquery predicate to a non-correlated subquery, materialize that result set in a work file, and -- treating this materialized result set as a virtual table -- “move” it to a different part of the overall query and join it to another table referenced in the query. Needless to say, such transformations -- correlated subquery to non-correlated subquery (or vice versa), subquery to join -- are always accomplished in such a way as to preserve the result set of the query as initially coded.&lt;br /&gt;&lt;br /&gt;An example can be very useful in showing how all this comes together in the processing of a subquery-containing query in a DB2 9 environment. Suppose you have a query like the one below, in which a correlated subquery predicate has been highlighted in green:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;SELECT * FROM TABLE_1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 153, 0); font-weight: bold;font-family:courier new;" &gt;WHERE EXISTS&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(0, 153, 0); font-weight: bold;font-family:courier new;" &gt;(SELECT 1 FROM TABLE_2&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(0, 153, 0); font-weight: bold;font-family:courier new;" &gt;WHERE TABLE_1.COL1 = TABLE_2.COL1&lt;/span&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 153, 0); font-weight: bold;font-family:courier new;" &gt;AND TABLE_2.COL2 = 1234&lt;br /&gt;&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;AND…)&lt;/span&gt; &lt;/span&gt;&lt;br /&gt;&lt;br /&gt;The EXPLAIN output for the query, showing the working of global query optimization, might look like this (and here we see just a few of the 
relevant columns from a PLAN_TABLE):&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.catterallconsulting.com/uploaded_images/Plan_table-702617.bmp"&gt;&lt;img style="display: block; margin: 0px auto 10px; text-align: center; cursor: pointer; width: 400px; height: 161px;" src="http://www.catterallconsulting.com/uploaded_images/Plan_table-702576.bmp" alt="" border="0" /&gt;&lt;/a&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.catterallconsulting.com/uploaded_images/Plan_table-756251.bmp"&gt;&lt;br /&gt;&lt;/a&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;Here's what the EXPLAIN output is telling you: the optimizer changed the match-checking correlated subquery highlighted in green to a non-correlated subquery (evidenced by “NCOSUB” in the QBTYPE column of the PLAN_TABLE). This result set (which you’d get if you removed the predicate in the correlated subquery that contains the correlation reference to a column in TABLE_1) is sorted to remove duplicates (this to help ensure that the overall query’s result set will not be changed due to the query transformation) and to get the result set rows ordered so as to boost the performance of a subsequent join step (see the “Y” values under SORTC_UNIQ (shortened to SCU) and SORTC_ORDERBY (SCO) of the PLAN_TABLE). Then, the materialized and sorted subquery result set, identified as “virtual table” DSNWFQB(02), with the “02” referring to the query block from which the result set was generated (the “de-correlated” subquery), is moved to the “top” of the query and nested-loop joined to TABLE_1 (see the “1” under the METHOD (shortened to M) column of the PLAN_TABLE). 
It all boils down to this: the correlated subquery predicate is used to match rows in TABLE_1 with rows in TABLE_2, and in this case the optimizer has determined that the required row-matching can be accomplished more efficiently with a join than with the correlated subquery. It's a big-picture -- as opposed to a piece-part -- approach to query optimization.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;So, why am I big on global query optimization? Reason 1 is the potentially huge improvement in performance that can be realized for a subquery-containing query optimized by DB2 Version 9 versus DB2 Version 8 (I'm talking about an improvement of an order of magnitude or more in elapsed and CPU time in some cases). Reason 2 is the fact that global query optimization can greatly improve query performance without requiring that a query be rewritten. That's important because BI queries are often generated by tools, and that generally means that query rewrite is not an option. It's possible that some of the complex queries in a data warehouse environment might be generated by application code written by an organization’s developers (or by people contracted to write such code), and if such is the situation you could say that query rewrite (via modification of the query-generating code) is technically possible. Even in that case, though, rewriting a query could be very difficult. In my experience, custom-coded “query-building engines” may construct a query by successively applying various business rules, and this often takes the form of adding successive subquery predicates to the overall query (these might be nested in-list non-correlated subqueries). To “just rewrite” a query built in this way is no small thing, because coding the query with more in the way of joins and less in the way of subqueries would require more of an “all at once” application of business rules versus the more-straightforward successive-application approach. 
Thankfully, global query optimization will often make query rewrite a non-issue.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;I'll conclude with this thought: if you have a data warehouse built on DB2 for z/OS Version 8, global query optimization is one of the BEST reasons to migrate your system to DB2 9. If you're already using DB2 9, but not in a BI-related way, global query optimization is a great reason to seriously consider using DB2 9 for data warehousing.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-8382250854784646917?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/8382250854784646917/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/03/my-favorite-db2-9-for-zos-bi-feature.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/8382250854784646917'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/8382250854784646917'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/03/my-favorite-db2-9-for-zos-bi-feature.html' title='My Favorite DB2 9 for z/OS BI Feature'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-4641944880124922266</id><published>2010-03-05T07:27:00.000-08:00</published><updated>2011-09-20T09:03:55.636-07:00</updated><title type='text'>Another Note on DB2 for z/OS Buffer Pool Page-Fixing</title><content type='html'>&lt;span style="font-family:arial;"&gt;In the summer of 2008, I posted &lt;a href="http://catterallconsulting.blogspot.com/2008/07/note-on-db2-for-zos-page-fixed-buffer.html"&gt;a blog entry on page-fixing DB2 buffer pools&lt;/a&gt;, a feature introduced with DB2 for z/OS Version 8. A recent discussion I had with a client about buffer pool page-fixing brought to light two aspects of this performance tuning option that, I believe, are overlooked by some DB2 users. In this post I'll describe how you can make a quick initial assessment as to whether or not the memory resource of a mainframe system is sufficient to support buffer pool page-fixing, and I'll follow that with a look at the "bonus" performance impact that can be realized by buffer pool page-fixing in a DB2 data sharing environment.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Gauging the server memory situation.&lt;/span&gt; As pointed out in the aforementioned 2008 blog entry on the topic, page-fixing a buffer pool can reduce CPU consumption by eliminating the requests that DB2 would otherwise have to make of z/OS to fix in memory -- and to subsequently release -- a buffer for every read of a page from, or write of a page to, the disk subsystem. These page fix/page release operations are individually inexpensive, but the cumulative CPU cost can be significant when the I/Os associated with a pool number in the hundreds (or thousands) per second. The prospect of removing that portion of a DB2 workload's CPU utilization may have you thinking, "Why not?" 
Well, there's a reason why PGFIX(NO) is the default setting for a DB2 buffer pool, and it has to do with utilization of a mainframe server's (or z/OS LPAR's) memory resource.&lt;br /&gt;&lt;br /&gt;With PGFIX(NO), the real storage page frames occupied by DB2 buffers are candidates for being stolen by z/OS, should the need arise. If something has to be read into memory from disk, and there is no available page frame to accommodate that read-in, z/OS will make one available by moving its contents to a page data set on auxiliary storage (if that relocated page is subsequently referenced by a process, it will be brought back into server memory from auxiliary storage -- this is known as demand paging). z/OS steals page frames according to a least-recently-used algorithm: the longer a page frame goes without being referenced, the closer it moves to the front of the steal queue. If a DB2 buffer goes a long time without being referenced, it could be paged out to auxiliary storage.&lt;br /&gt;&lt;br /&gt;So, page-fixing a buffer pool in memory would preclude z/OS from considering the associated real storage page frames as candidates for stealing. The important question, then, is this: would some of those pages be stolen by z/OS if they weren't fixed in memory from the get-go? If so, then page-fixing that pool's buffers might not be such a great idea: in taking away some page frames that z/OS might otherwise steal, buffer pool page-fixing could cause page-steal activity to increase for other subsystems and application processes in the z/OS LPAR. Not good.&lt;br /&gt;&lt;br /&gt;Fortunately, there's a pretty easy way to get a feel for this: using either your DB2 monitor (an online display or a statistics report) or the output of the DB2 command -DISPLAY BUFFERPOOL DETAIL, look for fields labeled "PAGE-INS REQUIRED FOR READ" and "PAGE-INS REQUIRED FOR WRITE" (or something similar to that). 
What these fields mean: a page-in is required for a read if DB2 wants to read a page from disk into a particular buffer, and that buffer has been paged out to auxiliary storage (i.e., the page frame occupied by the buffer was stolen by z/OS). Similarly, a page-in is required for a write if DB2 needs to write the contents of a buffer to disk and the buffer is in auxiliary storage.&lt;br /&gt;&lt;br /&gt;If, for a pool, the PAGE-INS REQUIRED FOR READ and PAGE-INS REQUIRED FOR WRITE fields both contain zeros, it is likely that the pool, from a memory perspective, is "V=R" anyway (that is to say, the amount of real storage occupied by the pool is probably very close to, if not the same as, its size in terms of virtual storage). In that case, going with PGFIX(YES) should deliver CPU savings without increasing pressure on the server memory resource, since the page frames being stolen are probably not those that are occupied by that pool's buffers. If you want an added measure of assurance on this score, issue a -DISPLAY BUFFERPOOL DETAIL(*) command. The (*) following the DETAIL keyword tells DB2 that you want statistics for the pool since the time it was last allocated. That might have been days, or even weeks, ago (the command output will tell you this), and if you see that the "PAGE-INS REQ" fields in the read and write parts of the command output contain zeros for that long period of time, it's a REALLY good bet that the pool's occupation of real storage won't increase appreciably if you go with PGFIX(YES). For even MORE assurance that the memory resource of the z/OS LPAR in which DB2 is running is not under a lot of pressure, check the "PAGE-INS REQUIRED" numbers for the lower-activity pools (those with fewer GETPAGE requests than others). If even these show zeros, you should be in really good shape, memory-wise.&lt;br /&gt;&lt;br /&gt;With all this said, keep a couple of things in mind. 
First, even though your "PAGE-INS REQUIRED" numbers may give you a high degree of confidence that going to PGFIX(YES) for a buffer pool would be a good idea, make sure to coordinate this action with your z/OS systems programmer. That person has responsibility for seeing that z/OS system resources (such as server memory) are effectively managed and utilized, and you need to make sure that the two of you are on the same page (no pun intended) regarding buffer pool page-fixing. If you've done your homework, and you let the z/OS systems programmer do his (or her) homework (such as looking at z/OS monitor-generated system paging statistics), getting to agreement should not be a problem. Second, be selective in your use of the PGFIX(YES) buffer pool option. The greater the amount of I/O activity for a pool, the greater the benefit of PGFIX(YES). I'd recommend considering page-fixing for pools for which the rate of disk I/O activity is at least in the high double digits (writes plus reads) per second (and be sure to include prefetch reads when calculating the rate of disk I/O operations for a buffer pool). By staying with PGFIX(NO) for your lower-activity pools, you ensure that DB2 will make some buffer pool-associated page frames available to z/OS for page-out, should something cause the LPAR's memory resource to come under significant pressure.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;And for you data sharing users...&lt;/span&gt; Just a couple of weeks ago, someone told me that he was under the impression that page-fixing buffer pools would have a negative performance impact in a DB2 data sharing environment. NOT SO. Assuming (as mentioned above) that your server memory resource is sufficient to accommodate page-fixing for one or more of your buffer pools, the resulting CPU efficiency benefit should be MORE pronounced in a data sharing group than in a standalone DB2 system. How so? 
Simple: the buffer pool page fix/page release activity that occurs for DB2 reads from, and writes to, the disk subsystem with PGFIX(NO) in effect also occurs for writes of pages to, and reads of pages from, coupling facility group buffer pools. Like disk I/Os, page read and write actions involving a group buffer pool can number in the thousands per second. PGFIX(YES) eliminates the overhead of page fix/page release requests for disk I/Os AND for group buffer pool page reads and writes. So, if you're running DB2 in a data sharing configuration, you have another incentive to check out the page-fix option for your high-use buffer pools.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-4641944880124922266?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/4641944880124922266/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/03/another-note-on-db2-for-zos-buffer-pool.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/4641944880124922266'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/4641944880124922266'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/03/another-note-on-db2-for-zos-buffer-pool.html' title='Another Note on DB2 for z/OS Buffer Pool Page-Fixing'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-6465618915726689964</id><published>2010-02-22T08:04:00.000-08:00</published><updated>2010-02-22T08:31:19.395-08:00</updated><title type='text'>A Couple of Notes on DB2 Group Buffer Pools</title><content type='html'>&lt;span style="font-family:arial;"&gt;I have recently done some work related to DB2 for z/OS data sharing, and that has me wanting to share with you a couple of items of information concerning group buffer pools (coupling facility structures used to cache changed pages of tablespaces and indexes that are targets of inter-DB2 read/write interest). First I'll provide some thoughts on group buffer pool sizing. After that, I'll get into the connection between local buffer pool page-fixing and group buffer pool read and write activity. [Lingo alert: GBP is short for group buffer pool, and "GBP-dependent" basically means that there is inter-DB2 read/write interest in a page set (i.e., a tablespace or an index or a partition).]&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;How do you know if bigger is better?&lt;/span&gt; A lot of folks know that a group buffer pool should be at least large enough to prevent directory entry reclaims (reclaims are basically "steals" of in-use GBP directory entries to accommodate registration of newly, locally cached pages of GBP-dependent page sets, and you want to avoid them because they result in invalidation of "clean" pages cached in local buffer pools). 
The key to avoiding directory entry reclaims is to have enough directory entries in a GBP to register all the different pages that could be cached in the GBP and in the associated local buffer pools at any one time (you also want to make sure that there are no GBP write failures due to lack of storage, but there won't be if the GBPs are large enough to prevent directory entry reclaims). For a GBP associated with a 4K buffer pool, and with the default 5:1 ratio of directory entries to data entries, sizing to prevent directory entry reclaims is pretty simple: you add up the size of the local pools and divide that figure by three to get your group buffer pool size; so, if there are two members in a data sharing group, and if BP1 has 6000 buffers on each member, directory entry reclaims will not occur if the size of GBP1 is at least 16,000 KB (the size of BP1 on each of the two DB2 members is 6000 X 4 KB = 24,000 KB, so the GBP1 size should be at least (2 X 24,000 KB) / 3, which is 16,000 KB). Let's say that your GBPs are all large enough to prevent directory entry reclaims (you can check on this via the output of the DB2 command -DISPLAY GROUPBUFFERPOOL GDETAIL). If you have enough memory in your coupling facility LPARs to make them larger still, should you? If you do enlarge them, how do you know if you've done any good?&lt;br /&gt;&lt;br /&gt;Start by checking on the success rate for GBP reads caused by buffer invalidations (when a local buffer of DB2 member X holds a table or index page that is changed by a process running on DB2 member Y, the buffer in member X's local pool will be marked invalid and a subsequent request for that page will cause member X to request the current version of the page, first from the GBP and then, in case of a "not found" result, from the disk subsystem). Information about these GBP reads can be found in a DB2 monitor report or online display of GBP information, or in the output of a -DISPLAY GROUPBUFFERPOOL MDETAIL command. 
In a DB2 monitor report the fields of interest may be labeled as follows (field names can vary slightly from one monitor product to another -- note that "XI" is short for "cross-invalidation," which refers to buffer invalidation operations):&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;GROUP BP1&lt;span style="color: rgb(255, 255, 255);"&gt;..........................&lt;/span&gt;QUANTITY&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;---------------------------&lt;span style="color: rgb(255, 255, 255);"&gt;........&lt;/span&gt;--------&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;SYN.READS(XI)-DATA RETURNED&lt;span style="color: rgb(255, 255, 255);"&gt;............&lt;/span&gt;8000&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;SYN.READS(XI)-NO DATA RETURN&lt;span style="color: rgb(255, 255, 255);"&gt;...........&lt;/span&gt;2000&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;In -DISPLAY GROUPBUFFERPOOL MDETAIL output, you'd be looking for this:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:courier new;"&gt;DSNB773I - MEMBER DETAIL STATISTICS&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="color: rgb(255, 255, 255);"&gt;.............&lt;/span&gt;SYNCHRONOUS READS&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="color: rgb(255, 255, 255);"&gt;...............&lt;/span&gt;DUE TO BUFFER INVALIDATION&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="color: rgb(255, 255, 255);"&gt;.................&lt;/span&gt;DATA RETURNED&lt;span style="color: rgb(255, 255, 255);"&gt;..................&lt;/span&gt;= 8000&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:courier new;"&gt;&lt;span style="color: rgb(255, 255, 255);"&gt;.................&lt;/span&gt;DATA NOT RETURNED&lt;span style="color: rgb(255, 255, 255);"&gt;..............&lt;/span&gt;= 2000&lt;br /&gt;&lt;/span&gt;&lt;br 
/&gt;The success rate, or "hit rate," for these GBP reads would be:&lt;br /&gt;&lt;br /&gt;&lt;span style="color: rgb(0, 153, 0);"&gt;(reads with data returned) / ((reads with data returned) + (reads with data not returned))&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Using the numbers from the example output above, the success rate for GBP reads due to buffer invalidation would be 8000 / (8000 + 2000) = 80%.&lt;br /&gt;&lt;br /&gt;Here's why this ratio is useful: buffer invalidations occur when a GBP directory entry pointing to a buffer is reclaimed (not good, as previously mentioned), or when a page cached locally in one DB2 member's buffer pool is changed by a process running on another member of the data sharing group (these invalidations are good, in that they are required for the preservation of data coherency in a data sharing environment). If you don't have any buffer invalidations resulting from directory entry reclaims, invalidations are occurring because of page update activity. Because updated pages of GBP-dependent page sets are written to the associated GBP as part of commit processing, a DB2 member looking for an updated page in a GBP should have a reasonably good shot at finding it there, if the GBP is large enough to provide a decent page residency time.&lt;br /&gt;&lt;br /&gt;So, if you make a GBP bigger and you see that the hit ratio for GBP reads due to buffer invalidation has gone up for the member DB2 subsystems, you've probably helped yourself out, performance-wise, because GBP checks for current versions of updated pages are more often resulting in "page found" situations. 
Getting a page from disk is fast, but getting it from the GBP is 2 orders of magnitude faster (3 orders of magnitude if you have to get the page from spinning disk versus disk controller cache).&lt;br /&gt;&lt;br /&gt;By the way, the hit ratio for GBP reads due to "page not in buffer pool" (labeled as such in -DISPLAY GROUPBUFFERPOOL MDETAIL output, and as something like SYN.READS(NF) in a DB2 monitor report or display) is not so useful in terms of gauging the effect of a GBP size increase. These numbers reflect GBP reads that occur when a DB2 member is looking in the GBP for a page that it needs and doesn't have in a local buffer pool. This has to be done prior to requesting the page from disk if the target page set is GBP-dependent, but a GBP "hit" for such a read is, generally speaking, not very likely.&lt;br /&gt;&lt;br /&gt;One more thing: if you make a GBP bigger and you are duplexing your GBPs (and I hope that you are), be sure to enlarge the secondary GBP along with the primary GBP. If you aren't duplexing your GBPs (and why is that?), make sure that all your structures can still fit in one CF LPAR (in a two-CF configuration) after the target GBP has been made larger.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Buffer pool page-fixing: good for more than disk I/Os. &lt;/span&gt;Buffer pool page-fixing, introduced with DB2 for z/OS V8, is one of my favorite recent DB2 enhancements (I blogged about it in an &lt;a href="http://www.catterallconsulting.com/2008/07/note-on-db2-for-zos-page-fixed-buffer.html"&gt;entry posted in 2008&lt;/a&gt;). People tend to think of the performance benefit of buffer pool page-fixing as it relates to disk I/O activity. That benefit is definitely there, but so is the benefit -- and this is what lots of people don't think about -- associated with GBP read and write activity. 
See, every time DB2 writes a page to a GBP or reads a page from a GBP, the local buffer involved in the operation must be fixed in server memory (aka central storage). If the buffer is in a pool for which PGFIX(YES) has been specified, that's already been done; otherwise, DB2 will have to tell z/OS to fix the buffer in memory during the GBP read or write operation and then release the buffer afterwards. A single "fix" or "un-fix" request is inexpensive, CPU-wise, but there can be hundreds of page reads and writes per second for a GBP, and the cumulative cost of all that buffer fixing and un-fixing can end up being rather significant. So, if you are running DB2 in data sharing mode and you aren't yet taking advantage of buffer pool page-fixing, now you have another reason to give it serious consideration.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-6465618915726689964?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/6465618915726689964/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/02/couple-of-notes-on-db2-group-buffer.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6465618915726689964'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6465618915726689964'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/02/couple-of-notes-on-db2-group-buffer.html' title='A Couple of Notes on DB2 Group Buffer Pools'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-7985844551903064029</id><published>2010-02-09T13:36:00.000-08:00</published><updated>2010-02-09T13:50:59.816-08:00</updated><title type='text'>Good News on the Mainframe DB2 Data Warehousing Front</title><content type='html'>&lt;span style="font-family: arial;"&gt;Last week, I attended a 1-day IBM System z "Technology Summit" education event in Atlanta. It was a multi-track program, and the DB2 for z/OS track ("Track 2") was excellent, in terms of both content and quality of presentation delivery (and it was FREE -- check out the remaining North American cities and dates for this event at http://www-01.ibm.com/software/os/systemz/summit/). The first presentation of the day, delivered by Jim Reed of IBM's Information Management software organization, focused on mainframe market trends and IBM's DB2 for z/OS product strategy. Jim's talk contained several nuggets of information that underscore the solid present and bright future of DB2 for z/OS as a platform for business intelligence applications. In this post, I'll share this information with you, along with some of my own observations on the topic.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;BI important? How about most important?&lt;/span&gt; Jim started out with a reference to a recent IBM Global CIO survey which asked participants to identify their top priority. You know what's hot in IT circles these days: virtualization, mobility apps, regulatory compliance. So, what came out on top with regard to CIO priorities? Analytics and business intelligence. That's not very surprising, as far as I'm concerned. 
Having spent years on optimizing efficiency, squeezing costs out of every facet of their operations, organizations are increasingly focused on optimizing performance. Are they offering the right mix of products and services to their customers? Are they selling to the right people? Are they delivering value in a way that separates them, in the eyes of their customers, from their competitors? Data warehouse systems are key drivers of success here, enabling companies to generate actionable intelligence from their data assets (and the breadth of these data assets keeps expanding, including now not just traditional point-of-sale and other business transactions, but e-mails, customer care interactions -- even company- and product-related comments posted on external-to-the-enterprise Web sites).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Big iron has big mo.&lt;/span&gt; At the same time that BI is heating up as an area of corporate endeavor, the mainframe -- long seen as a workhorse for run-the-business OLTP and batch workloads -- is growing in popularity as a platform for BI applications. Jim spoke of several factors that are putting wind in System z's sails with regard to data warehousing. He cited a Gartner report that spotlighted key BI issues with which companies are grappling now. This list of front-burner concerns included:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;High availability&lt;/span&gt;. Data warehouses are more likely these days to get a "mission critical" designation. Many (including one I worked on just last month) are customer-facing systems, and a lot of those are the subject of service level agreements.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;Mixed workload performance&lt;/span&gt;. This was identified as the number one performance issue for data warehouses. 
Mixed BI workloads, in which fast-running, OLTP-like queries vie with complex, data-intensive analytic processes for system resources, are becoming common as so-called "operational BI" gains prominence.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;Then, of course, there's the matter of data protection, on which so much else depends. Jim mentioned that 33% of people recently surveyed indicated that they would QUIT doing business with a company if that company experienced a data security or privacy breach and was seen as being responsible for the incident.&lt;br /&gt;&lt;br /&gt;So, to address these key issues, you'd probably want to build your data warehouse on a hardware/software platform known for high availability, sophisticated workload management capabilities, and strong, multi-layered data protection and access control. Hmmm. Sounds like mainframe DB2 to me. Keep in mind, too, that the well-known availability and workload management strengths of System z and z/OS and DB2 are made even stronger when DB2 is deployed in data sharing mode on a parallel sysplex mainframe cluster configuration.&lt;br /&gt;&lt;br /&gt;Oh, and let's not forget that the legendary reliability of mainframe systems is not just a matter of advanced hardware and software technology (good as that stuff is) -- it also reflects the deep skills and robust processes (around change management, performance monitoring and tuning, capacity planning, business continuity, etc.) that typify the teams of professionals that support organizations' mainframe computing environments. 
As BI applications continue to move from "nice to have" to "must have" in the eyes of corporate leaders, it stands to reason that IT executives would want to house these essential systems on the server platform that exemplifies "rock solid," and to assign their care to the people in whom they have the utmost trust and confidence -- mainframe people.&lt;br /&gt;&lt;br /&gt;One more trend driving BI workloads to System z is the increased frequency with which data warehouse databases are being updated. Not long ago, the "query by day, update at night" model predominated. Now, many BI application users demand that updates of source data values be reflected more quickly in the data warehouse database -- sometimes in a near-real-time manner. A lot of the source data that supplies data warehouses comes from mainframe databases and files, and locating the data warehouse close to that source data can facilitate round-the-clock updating.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Let's make a deal.&lt;/span&gt; The technical arguments for building a data warehouse on a mainframe platform are many and strong, but what about the financial angle? IBM has been pretty busy in this area of late. I already knew of &lt;a href="http://www-01.ibm.com/software/data/db2/zos/edition-vue.html"&gt;DB2 Value Unit Edition&lt;/a&gt; pricing, which makes DB2 for z/OS available for a one-time charge for net new workloads of certain types, including data warehousing. I'll admit to not having known about IBM's System z Solution Edition series (announced in August of last year) before Jim talked about them during his presentation. 
Included in this set of offerings is the &lt;a href="http://www-03.ibm.com/systems/z/solutions/editions/dw/index.html"&gt;System z Solution Edition for Data Warehousing&lt;/a&gt;, a package of hardware, software (including DB2), and services that can help an organization to implement a mainframe-based data warehouse system in a cost-competitive way.&lt;br /&gt;&lt;br /&gt;If your organization is serious about data warehousing, get serious about your data warehouse platform. Mainframes deliver the availability, mixed workload performance management, security, and -- yes -- total cost of ownership that can improve your chances of achieving BI success. Analyze that.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-7985844551903064029?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/7985844551903064029/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/02/good-news-on-mainframe-db2-data.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/7985844551903064029'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/7985844551903064029'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/02/good-news-on-mainframe-db2-data.html' title='Good News on the Mainframe DB2 Data Warehousing Front'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-7900190741617128917</id><published>2010-01-27T19:17:00.000-08:00</published><updated>2010-01-27T19:35:40.821-08:00</updated><title type='text'>Some Basic Information About SQL in DB2-Accessing Programs</title><content type='html'>&lt;span style="font-family: arial;"&gt;DB2 has been around for a long time (more than 25 years), and a lot of people who work with DB2 have been doing so for a long time (myself included). Many of the younger folks I meet who are professionally engaged in DB2-related activity are developers. Some of those who came to DB2 after working with other relational database management systems might have been initially confused on hearing their DB2-knowledgeable colleagues talk about SQL as being "embedded" or "static" or "dynamic." Waters may have been further muddied when familiar words such as "package," "collection," and "plan" took on unfamiliar meanings in discussions about DB2-based applications. Throw in a few terms like "DBRM" and "consistency token," and you can really have a DB2 newbie scratching his or her head. Of late, I've seen a fair amount of misunderstanding in relation to programming for DB2 data access. My hope is that this post will provide some clarity. Although I am writing from a DB2 for z/OS perspective, the concepts are essentially the same in a DB2 for Linux/UNIX/Windows environment (some of the terminology is a little different).&lt;br /&gt;&lt;br /&gt;First up for explanation: embedded SQL. Basically, this refers to SQL statements, included in the body of a program, that are converted into a structure, called a package, that runs in the DB2 database services address space when the program executes. 
The package is generated through a mechanism, known as the bind process, that operates on a file called a database request module, or DBRM. The DBRM, which contains a program's embedded SQL statements in a bind-ready form, is one of two outputs produced when the program is run through the DB2 precompiler. The other of these outputs is a modified version of the source program, in which the embedded SQL statements have been commented out and to which calls to DB2 have been added -- one call per SQL statement. Each of these DB2 calls contains a statement number, the name of the program's DB2 package, and a timestamp-based identifier called a consistency token. The statement numbers and the consistency token are also included in the aforementioned DBRM, and these serve to tie the program in its compiled and linked form to the package into which the DBRM is bound: at program execution time, a DB2 call indicates the package to use (the match is on package name and consistency token value), and identifies the section of the package that corresponds to the SQL statement to be executed.&lt;br /&gt;&lt;br /&gt;The above paragraph is kind of a mouthful. Here's the key concept to keep in mind: the package associated with a program containing embedded SQL is generated before the program ever runs. To put it another way, DB2 gets to see the embedded SQL statements and prepare them for execution (by doing things like access path selection) before they are issued by the application program.&lt;br /&gt;&lt;br /&gt;A few more items of information related to packages:&lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Packages are persistent -- they are stored in a system table in the DB2 directory database and loaded into memory when needed. 
Once cached in the DB2 for z/OS environmental descriptor manager pool (aka the EDM pool) a package is likely to stay memory-resident for some time, but if it eventually gets flushed out of the pool (as might happen if it goes for some time without being referenced), it will again be read in from the DB2 directory when needed.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Packages are organized into groups called collections.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;For &lt;/span&gt;&lt;span style="font-family: arial;"&gt;application processes that are local to a DB2 for z/OS subsystem (i.e., that run in the same z/OS system as the target DB2 data server), packages are executed through plans. So, a batch job running in a JES initiator address space -- or a CICS transaction, or an IMS transaction -- will provide to DB2 the name of a plan, which in turn points to one or more collections that contain the package or packages associated with the embedded SQL statements that the application process will issue. Applications that are remote to the DB2 subsystem and communicate with DB2 through the Distributed Data Facility using the DRDA protocol (Distributed Relational Database Architecture) make use of packages, but they do not refer to DB2 plans.&lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;What about dynamic versus static SQL? Plenty of people who know a lot about DB2 will tell you that "static SQL" means the same thing as "embedded SQL." In my mind, the two terms are almost equivalent (some would say that this "almost" of mine is a matter of splitting hairs). It's true that static SQL is seen by DB2 and prepared for execution before the associated program is executed. That's what I said about embedded SQL, isn't it? Yes, but there is something called embedded dynamic SQL. 
That would be an SQL statement that is placed in a host variable that is subsequently referenced by a PREPARE statement. The SQL statement string in the host variable is dynamic (that is to say, it will be prepared by DB2 for execution when it is issued by the program), but -- and this is the splitting-hairs part -- PREPARE itself is not a dynamic SQL statement.&lt;br /&gt;&lt;br /&gt;Dynamic SQL statements (again, those being statements that are prepared when issued by a program, versus being prepared beforehand through the previously described bind process) can of course be presented to DB2 without the use of PREPARE -- they can, for instance, take the form of ODBC (Open Database Connectivity) or JDBC (Java Database Connectivity) calls. They can also be issued interactively through tools such as SPUFI (part of the TSO/ISPF interface to DB2 for z/OS) and the command line processor (a component of DB2 for Linux/UNIX/Windows and of the DB2 Client).&lt;br /&gt;&lt;br /&gt;Some DB2 people hear "dynamic SQL" and think "ad-hoc SQL." In fact, these terms are NOT interchangeable. Ad-hoc SQL is free-form and unpredictable -- it could be generated by someone using a query tool in a data warehouse environment. Ad-hoc SQL will be dynamic (prepared for execution by DB2 when issued), but dynamic SQL certainly doesn't have to be ad-hoc. There are tons of examples of applications -- user-written and vendor-supplied -- that send SQL statements to DB2 in a way that will result in dynamic statement preparation. That doesn't mean that users have any control over the form of statements so issued (users might only be able to provide values that will be substituted for parameter markers in a dynamic SQL statement). "Structured dynamic" is the phrase I use when referring to this type of SQL statement. Just remember: static SQL CANNOT change from execution to execution (aside from changes in the values of host variables). 
Dynamic SQL CAN change from execution to execution, but it doesn't HAVE to.&lt;br /&gt;&lt;br /&gt;I'll close by pointing out that dynamic SQL statements are not, in fact, always prepared by DB2 at the time of their execution. Sometimes, they are prepared before their execution. I'm referring here to DB2's dynamic statement caching capability (active by default in DB2 V8 and V9 systems). When a dynamic SQL statement is prepared for execution, DB2 will keep a copy of the prepared form of the statement in memory. When the same statement is issued again (possibly with different parameter values if the statement was coded with parameter markers), DB2 can use the cached structure associated with the previously prepared instance of the statement, thereby saving the CPU cycles that would otherwise be consumed in re-preparing the statement from scratch. Dynamic statement caching is one of the key factors behind the growing popularity and prevalence of dynamic SQL in mainframe DB2 environments.&lt;br /&gt;&lt;br /&gt;I hope that this overview of SQL in DB2-accessing programs will be useful to application developers and others who work with DB2. When all is said and done, the value of a database management system to an organization depends in large part on the value delivered by applications that interact with the DBMS. 
Code on!&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-7900190741617128917?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/7900190741617128917/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/01/some-basic-information-about-sql-in-db2.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/7900190741617128917'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/7900190741617128917'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/01/some-basic-information-about-sql-in-db2.html' title='Some Basic Information About SQL in DB2-Accessing Programs'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-7652609304961340940</id><published>2010-01-13T08:17:00.000-08:00</published><updated>2011-09-20T09:00:45.759-07:00</updated><title type='text'>DB2 for z/OS Data Sharing: Then and Now (Part 2)</title><content type='html'>&lt;span style="font-family:arial;"&gt;&lt;a href="http://catterallconsulting.blogspot.com/2010/01/db2-for-zos-data-sharing-then-and-now.html"&gt;In part 1&lt;/a&gt; of this two-part entry, posted last week, I wrote about some of the more interesting changes that I've seen in DB2 for z/OS data sharing 
technology over the years (about 15) since it was introduced through DB2 Version 4. More specifically, in that entry I highlighted the tremendous improvement in performance with regard to the servicing of coupling facility requests, and described some of the system software enhancements that have made data sharing a more CPU-efficient solution for organizations looking to maximize the availability and scalability of a mainframe DB2 data-serving system. In this post, I'll cover changes in the way that people configure data sharing groups, and provide a contemporary view of a once-popular -- and perhaps now unnecessary -- application tuning action.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Putting it all together.&lt;/span&gt; Of course, before you run a DB2 data sharing group, you have to set one up. Hardware-wise, the biggest change since the early years of data sharing has been the growing use of internal coupling facilities, versus the standalone boxes that were once the only option available. The primary advantage of internal coupling facilities (ICFs), which operate as logical partitions within a System z server, with dedicated processor and memory resources, is economic: they cost less than external, standalone coupling facilities. They also offer something of a performance benefit, as communication between a z/OS system on a mainframe and an ICF on the same mainframe is a memory-to-memory operation, with no requirement for the traversing of a physical coupling facility link.&lt;br /&gt;&lt;br /&gt;When internal coupling facilities first came along in the late 1990s, organizations that acquired them tended to use only one in a given parallel sysplex (the mainframe cluster on which a DB2 data sharing group runs) -- the other coupling facility in the sysplex (you always want at least two, so as to avoid having a single point of failure) would be of the external variety. 
This was so because people wanted to avoid the effect of the so-called "double failure" scenario, in which a mainframe housing both an ICF and a z/OS system participating in the sysplex associated with the ICF goes down. Group buffer pool duplexing, delivered with DB2 Version 6 (with a subsequent retrofit to Version 5), allayed the double-failure concerns of those who would put the group buffer pools (GBPs) in an ICF on a server with a sysplex-associated z/OS system: if that server were to fail, taking the ICF down with it, the surviving DB2 subsystems (running on another server or servers) would simply use what had been the secondary group buffer pools in the other coupling facility as primary GBPs, and the application workload would continue to be processed by those subsystems (in the meantime, any DB2 group members on the failed server would be automatically restarted on other servers in the sysplex). Ah, but what of the lock structure and the shared communications area (SCA), the other coupling facility structures used by the members of a DB2 data sharing group? Companies generally wanted these in an external, standalone CF (or in an ICF on a server that did not also house a z/OS system participating in the sysplex associated with the ICF). Why? Because a double-failure scenario involving these structures would lead to a group-wide failure -- this because successful rebuild of the lock structure or SCA (which would prevent a group failure) requires information from ALL the members of a DB2 data sharing group. 
If a server has an ICF containing the lock structure and SCA (these are usually placed within the same coupling facility) and also houses a member of the associated DB2 data sharing group, and that server fails, the lock structure and SCA will not be rebuilt (because a DB2 member failed, too), and without those structures, the data sharing group will come down.&lt;br /&gt;&lt;br /&gt;Nowadays, parallel sysplexes configured with ICFs and &lt;span style="font-weight: bold;"&gt;no&lt;/span&gt; external CFs are increasingly common. For one thing, the double-failure scenario is less scary to a lot of folks than it used to be, because a) actually losing a System z server is exceedingly unlikely (it's rare enough for a z/OS or a DB2 or an ICF to crash, and rarer still for a mainframe itself to fail), and b) even if a double-failure involving the lock structure and SCA were to occur, causing the DB2 data sharing group to go down, the subsequent group restart process that would restore availability would likely complete within a few minutes (with duplexed group buffer pools, there would be no data sets in group buffer pool recover pending status, and restart goes much more quickly when there's no GRECP). Secondly, organizations that want insurance against even a very rare outage situation that would probably not exceed a small number of minutes in duration can eliminate the possibility of an outage-causing double-failure by implementing system duplexing of the lock structure and the SCA. 
System duplexing of the lock structure and SCA increases the overhead of running DB2 in data sharing mode (more so than group buffer pool duplexing), and that's why some organizations use it and some don't -- it's a cost versus benefit decision.&lt;br /&gt;&lt;br /&gt;Another relatively recent development with regard to data sharing set-up is the availability of 64-bit addressing in Coupling Facility Control Code, beginning with the CFLEVEL 12 release (Coupling Facility Control Code is the operating system of a coupling facility, and I believe that CFLEVEL 16 is the current release). So, how big can an individual coupling facility structure be? About 100 GB (actually, 99,999,999 KB -- the current maximum value that can be specified when defining a structure in the specification of a Coupling Facility Resource Management, or CFRM, policy). For the lock structure and the SCA, this structure size ceiling (obviously well below what's possible with 64-bit addressing) is totally a non-issue, as these structures are usually not larger than 128 MB each. Could someone, someday, want a group buffer pool to be larger than 100 GB? Maybe, but I think that we're a long way from that point, and I expect that the 100 GB limit on the size of a coupling facility structure will be increased well before then.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;DEALLOCATE or COMMIT?&lt;/span&gt; DB2 data sharing is, for the most part, invisible to application programs, and organizations implementing data sharing groups often find that use of the technology necessitates little, if anything, in the way of application code changes. That said, people have done various things over the years to optimize the CPU efficiency of DB2-accessing programs running in a data sharing environment. 
For a long time, one of the more popular tuning actions was to bind programs executed via persistent threads (i.e., threads that persist across commits, such as those associated with batch jobs and with CICS-DB2 protected entry threads) with the RELEASE(DEALLOCATE) option. This was done to reduce tablespace lock activity: RELEASE(DEALLOCATE) causes tablespace locks (not page or row locks) acquired by an application process to be retained until thread deallocation, as opposed to being released (and reacquired, if necessary) at commit points. The reduced tablespace lock activity would in turn reduce the type of data sharing global lock contention called XES contention (about which I wrote in last week's &lt;a href="http://catterallconsulting.blogspot.com/2010/01/db2-for-zos-data-sharing-then-and-now.html"&gt;part 1&lt;/a&gt; of this two-part entry). There were some costs associated with the use of RELEASE(DEALLOCATE) for packages executed via persistent threads (the size of the DB2 EDM pool often had to be increased, and package rebind procedures sometimes had to be changed or rescheduled to reflect the fact that some packages would remain in an "in use" state for much longer periods of time than before), but these were typically seen as being outweighed by the gain in CPU efficiency related to the aforementioned reduction in the level of XES contention.&lt;br /&gt;&lt;br /&gt;All well and good, but then (with DB2 Version 8) along came data sharing Locking Protocol 2 (also described in last week's &lt;a href="http://catterallconsulting.blogspot.com/2010/01/db2-for-zos-data-sharing-then-and-now.html"&gt;part 1&lt;/a&gt; post). Locking Protocol 2 drove XES contention way down, essentially eliminating XES contention reduction as a rationale for pairing RELEASE(DEALLOCATE) with persistent threads. With this global locking protocol in effect, the RELEASE(DEALLOCATE) versus RELEASE(COMMIT) package bind decision is essentially unrelated to your use of data sharing. 
There are still benefits associated with the use of RELEASE(DEALLOCATE) for persistent-thread packages (e.g., a slight improvement in CPU efficiency due to reduced tablespace lock and EDM pool resource release and reacquisition, more-effective dynamic prefetch), but take away the data sharing efficiency gain of old that is now provided by Locking Protocol 2, and you might decide to be more selective in your use of RELEASE(DEALLOCATE). If you implement a DB2 data sharing group, you may just leave your package bind RELEASE specifications as they were in your standalone DB2 environment.&lt;br /&gt;&lt;br /&gt;What new advances in data sharing technology are on the way? We'll soon see: IBM is expected to formally announce DB2 Version "X," the successor to Version 9, later this year. I'll be looking for data sharing enhancements among the "What's New?" items delivered with Version X, and I'll likely report in this space on what I find. Stay tuned.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-7652609304961340940?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/7652609304961340940/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/01/db2-for-zos-data-sharing-then-and-now_13.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/7652609304961340940'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/7652609304961340940'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2010/01/db2-for-zos-data-sharing-then-and-now_13.html' title='DB2 for z/OS Data Sharing: Then and 
Now (Part 2)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-2103654764935942382</id><published>2010-01-08T10:29:00.000-08:00</published><updated>2010-01-08T10:39:14.733-08:00</updated><title type='text'>DB2 for z/OS Data Sharing: Then and Now (Part 1)</title><content type='html'>&lt;span style="font-family: arial;"&gt;A couple of months ago, I did some DB2 data sharing planning work for a financial services organization. That engagement gave me an opportunity to reflect on how far the technology has come since it was introduced in the mid-1990s via DB2 for z/OS Version 4 (I was a part of IBM's DB2 National Technical Support team at the time, and I worked with several organizations that were among the very first to implement DB2 data sharing groups on parallel sysplex mainframe clusters). This being the start of a new year, it seems a fitting time to look at where DB2 data sharing is now as compared to where it was about fifteen years ago. I'll do that by way of a 2-part post, focusing here on speed gains and smarter software, and in part 2 (sometime next week) on configuration changes and application tuning considerations.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Speed, and more speed.&lt;/span&gt; One of the most critical factors with respect to the performance of a data sharing group is the speed with which a request to a coupling facility can be processed. In a parallel sysplex, the coupling facilities provide the shared memory resource in which DB2 structures such as the global lock structure and the group buffer pools are located. 
Requests to these structures (e.g., the writing of a changed page to a group buffer pool, or the propagation of a global lock request for a data page or a row) have to be processed exceedingly quickly, because 1) the volume of requests can be very high (thousands per second) and 2) most DB2-related coupling facility requests are synchronous, meaning that the mainframe engine that drives such a request will wait, basically doing nothing, until the coupling facility response is received (this is so for performance reasons: the request-driving mainframe processor is like a runner in a relay race, waiting with outstretched hand to take the baton from a teammate and immediately sprint with it towards the finish line). This processor wait time associated with synchronous coupling facility requests, technically referred to as "dwell time," has to be minimized because it is a key determinant of data sharing overhead (that being the difference in the CPU cost of executing an SQL statement in a data sharing environment versus the cost of executing the same statement in a standalone DB2 subsystem).&lt;br /&gt;&lt;br /&gt;In the late 1990s, people who looked after DB2 data sharing systems were pretty happy if they saw average service times for synchronous requests to the group buffer pools and lock structure that were under 250 and 150 microseconds, respectively. Nowadays, sites report that their average service times for synchronous group buffer pool and lock structure requests are less than 20 microseconds. This huge improvement is due in large part to two factors. First, coupling facility engines are MUCH faster than they were in the old days. If you know mainframes, you know about this even if you are not familiar with coupling facilities, because coupling facility microprocessors are identical, hardware-wise, to general purpose System z engines -- they just run Coupling Facility Control Code instead of z/OS. 
Today's z10 microprocessors pack 10 times the compute power delivered by top-of-the-line mainframe engines a decade ago. The second big performance booster with regard to coupling facility synchronous request service times is the major increase in coupling facility link capacity versus what was available in the 1999-2000 time frame. Back then, many of the links in use had an effective data rate of 250 MB per second. Current links can move information at 2 GB (2000 MB) per second.&lt;br /&gt;&lt;br /&gt;This big improvement in performance related to the servicing of synchronous coupling facility requests helped to improve throughput in data sharing systems. Did it also reduce the CPU cost of data sharing? Yes, but it's only part of that story. DB2 data sharing is a more CPU-efficient technology now than it was in the 1990s: overhead in an environment characterized by lots of what I call inter-DB2 write/write interest (referring to the updating of individual database objects -- tables and indexes -- by multiple application processes running concurrently on different members of a DB2 data sharing group) was once generally in the 10-20% range. Now the range is more like 8-15%. That improvement wasn't helped along all that much by faster coupling facility engines. Sure, they lowered service times for synchronous coupling facility requests, but the resulting reduction in the aforementioned mainframe processor "dwell time" was offset by the fact that much-faster mainframe engines forgo handling a lot more instructions during a given period of "dwelling" versus their slower predecessors (in other words, as mainframe processors get faster, you have to drive down synchronous request service times just to hold the line on data sharing overhead). 
Faster coupling facility links helped to reduce overhead, but I think that improvements in DB2 data sharing CPU efficiency have at least as much to do with system software changes as with speedier servicing of coupling facility requests. A tip of the hat, then, to the DB2, z/OS, and Coupling Facility Control Code development teams.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Working smarter, not harder.&lt;/span&gt; Over the years, IBM has delivered a number of software enhancements that lowered the CPU cost of DB2 data sharing. Some of these code changes boosted efficiency by reducing the number of coupling facility requests that would be generated in the processing of a given workload. A DB2 Version 5 subsystem, for example (and recall that data sharing was introduced with DB2 Version 4), was able to detect more quickly that a data set that it was updating and which had been open on multiple members of a data sharing group was now physically closed on the other members. As a result, the subsystem could take the data set out of the group buffer pool dependent state sooner, thereby eliminating associated coupling facility group buffer pool requests. DB2 Version 6 introduced the MEMBER CLUSTER option of CREATE TABLESPACE, enabling organizations to reduce coupling facility accesses related to space map page updating for tablespaces with multi-member insert "hot spots" (these can occur when multiple application processes running on different DB2 members are driving concurrent inserts to the same area within a tablespace). 
DB2 Version 8 took advantage of a Coupling Facility Control Code enhancement, the WARM command (short for Write And Register Multiple -- a means of batching coupling facility requests), to reduce the number of group buffer pool writes necessitated by a page split in a group buffer pool dependent index from five to one.&lt;br /&gt;&lt;br /&gt;These code improvements were all welcome, but my all-time favorite efficiency-boosting enhancement is the Locking Protocol 2 feature delivered with DB2 Version 8. Locking Protocol 2 lowered data sharing overhead by virtually eliminating a type of global lock contention known as XES contention. Prior to Version 8, if an application process running on a member of a data sharing group requested an IX or IS lock on a tablespace (indicating an intent to update or read from the tablespace in a non-exclusive manner), that request would be propagated by XES (Cross-System Extended Services, a component of the z/OS operating system) to the coupling facility lock structure as an X or S lock request (exclusive update or exclusive read) on the target object, because those are the only logical lock states known to XES. Thus, if process Q running on DB2 Version 7 member DB2A requests an IX lock on tablespace XYZ, that request will get propagated to the lock structure as an X lock on the tablespace. If process R running on DB2 Version 7 member DB2B subsequently requests an IS lock on the same tablespace, that request will be propagated to the lock structure as an S lock on the resource. If process Q on DB2A still holds its lock on tablespace XYZ, the lock structure will detect the incompatibility of the X and S locks on the tablespace, and DB2B will get a contention-detected response from the coupling facility. The z/OS image under which DB2B is running will contact the z/OS associated with DB2A in an effort to resolve the apparent contention situation. 
DB2A's z/OS then drives an IRLM exit (IRLM being the lock management subsystem used by DB2), supplying the target resource identifier (for tablespace XYZ) and the actual logical lock requests involved in the XES-perceived conflict (IX and IS). IRLM will indicate to the z/OS system that these logical lock states are in fact compatible, and DB2B's z/OS will be informed that application process R can in fact get its requested lock on tablespace XYZ. The extra processing required to determine that the conflict perceived by XES is in fact not a conflict adds to the cost of data sharing.&lt;br /&gt;&lt;br /&gt;With Locking Protocol 2, the aforementioned scenario would play out as follows: the request by application process Q on DB2 Version 8 (or 9) member DB2A for an IX lock on tablespace XYZ gets propagated to the lock structure as a request for an S lock on the object. The subsequent request by application process R on DB2 Version 8 (or 9) member DB2B for an IS lock on the same tablespace is also propagated as an S lock request, and because S locks on a resource are compatible with each other, no contention is indicated in the response to DB2B's system from the coupling facility, and the lock request is granted. Because IX and IS tablespace lock requests are very common in a DB2 environment, and because IX-IX and IX-IS lock situations involving application access to a given tablespace from different data sharing members are not perceived by XES as being in-conflict when Locking Protocol 2 is in effect, almost all of the XES contention that would be seen in a pre-Version 8 DB2 data sharing group goes away, and data sharing CPU overhead goes down. 
I say almost all, because the S-IS tablespace lock situation (exclusive read on the one side, and non-exclusive read on the other), which would not result in XES contention with Locking Protocol 1, does cause XES contention with Locking Protocol 2 (the reason being that Locking Protocol 2 causes an S tablespace lock request to be propagated to the lock structure as an X request -- necessary to ensure that an X-IX tablespace lock situation will be correctly perceived by XES as being in-conflict). This is generally not a big deal, because S tablespace lock requests tend to be quite unusual in most DB2 systems.&lt;br /&gt;&lt;br /&gt;So, DB2 for z/OS data sharing technology, which was truly ground-breaking when introduced, has not remained static in the years since -- it's gotten better, and it will get better still in years to come. That's good news for data sharing users. If your organization doesn't have a DB2 data sharing group, make a new year's resolution to look into it. In any case, stop by the blog next week for my part 2 entry on the "then" and "now" of data sharing.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-2103654764935942382?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/2103654764935942382/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2010/01/db2-for-zos-data-sharing-then-and-now.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2103654764935942382'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2103654764935942382'/><link rel='alternate' type='text/html' 
href='http://catterallconsulting.blogspot.com/2010/01/db2-for-zos-data-sharing-then-and-now.html' title='DB2 for z/OS Data Sharing: Then and Now (Part 1)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-3476539165192280880</id><published>2009-12-29T07:13:00.000-08:00</published><updated>2009-12-29T08:32:14.057-08:00</updated><title type='text'>Clearing the Air Re: Indexes on DB2 for z/OS Partitioned Tables</title><content type='html'>&lt;span style="font-family:arial;"&gt;With the end of the year in sight, it's a good time to tie up loose ends, as we say here in the USA. Thus it is that I've decided to focus, in this last post to my blog in 2009, on indexes as they pertain to DB2 for z/OS partitioned tables. That subject qualifies as a "loose end," because several &lt;span&gt;years&lt;/span&gt; after the introduction of table-controlled partitioning with DB2 for z/OS V8, some folks are still not certain as to what can and cannot be done with indexes defined on partitioned tables. I'll try, in this entry, to clear things up.&lt;br /&gt;&lt;br /&gt;When a table is partitioned by way of an index specification (the only way to partition a table prior to DB2 V8), index options for the table are pretty straightforward. The index that describes the partitioning scheme (i.e., the one with the PART &lt;span style="font-style: italic;"&gt;integer&lt;/span&gt; VALUES (&lt;span style="font-style: italic;"&gt;constant&lt;/span&gt;) clause in the CREATE INDEX statement) is called the partitioning index. 
&lt;span&gt;No other index&lt;/span&gt;&lt;span style="font-weight: bold;"&gt; &lt;/span&gt;on that table is called a partitioning index, and no other index on the table is physically partitioned. &lt;span&gt;Any&lt;/span&gt; index on a unique key can be defined as UNIQUE. &lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;span&gt;Only&lt;/span&gt;&lt;span style="font-weight: bold;"&gt; &lt;/span&gt;the partitioning index can be the table's clustering index.&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;Starting with DB2 V8, table partitioning can be controlled by way of a table's definition (through the PARTITION BY and PARTITION &lt;span style="font-style: italic;"&gt;integer&lt;/span&gt; ENDING AT (&lt;span style="font-style: italic;"&gt;constant&lt;/span&gt;) clauses of CREATE TABLE). Table-controlled partitioning (enhanced in DB2 V9 via the partition-by-range universal tablespace) is way better than index-controlled partitioning, but this important DB2 advancement did change -- considerably -- the landscape as far as index options are concerned. First and foremost: for a table-controlled partitioned table, &lt;span style="font-weight: bold;"&gt;any&lt;/span&gt; index on a key that starts with the table's partition-by column or columns is called a partitioning index &lt;/span&gt;&lt;span style="font-family:arial;"&gt;(and "starts with" means that the columns of a multi-column partitioning key appear in the order specified in the CREATE TABLE statement)&lt;/span&gt;&lt;span style="font-family:arial;"&gt;; so, if a table's partitioning key is COL_X, COL_Y then an index on COL_X, COL_Y, COL_A is a partitioning index, and so is an index on COL_X, COL_Y, COL_B (but an index on COL_Y, COL_X, COL_D would not be a partitioning index, because the order of the partition-by columns does not match the order specified in the table's definition). 
Among the implications of this rule: a table-controlled partitioned table may have several partitioning indexes, or it may not have &lt;span style="font-weight: bold;"&gt;any&lt;/span&gt; partitioning indexes. Furthermore, a partitioning index may or may not be physically partitioned. Continuing with the example of the table partitioned on COL_X, COL_Y, an index on COL_X, COL_Y, COL_D that does not have the PARTITIONED clause in its definition is partitioning (because its key begins with the table's partitioning key) but not partitioned (because it was not defined with the PARTITIONED clause). Possible, of course, doesn't necessarily mean advisable -- I don't see why you would have a partitioning index that is not also partitioned.&lt;br /&gt;&lt;br /&gt;Next: for a table-controlled partitioned table, a secondary index (i.e., one that is not a partitioning index) can itself be partitioned -- you just have to specify PARTITIONED in the definition of the index (this will cause the index to be physically partitioned along the lines of the underlying table, so that partition 1 of the secondary index will contain the keys for rows in partition 1 of the table). A secondary index that is partitioned is called a data-partitioned secondary index, or DPSI (an index that is not partitioned is called a non-partitioned index, or NPI).&lt;br /&gt;&lt;br /&gt;Third: there is a restriction on DPSIs with regard to uniqueness (a restriction that was loosened somewhat with DB2 V9). In a DB2 for z/OS V8 environment, &lt;span style="font-weight: bold;"&gt;no&lt;/span&gt; DPSI can be defined as unique (and remember: a DPSI is a partitioned index that is not a partitioning index -- a partitioning index can be unique). If you try to create a secondary index with both UNIQUE and PARTITIONED in the index definition, you'll get a -628 SQL code (and an accompanying error message indicating that "clauses are mutually exclusive"). 
In a DB2 V9 environment (and this was recently pointed out by DB2 consultant Peter Backlund in a thread on the &lt;a href="http://www.idug.org/cgi-bin/wa?A0=DB2-L"&gt;DB2-L&lt;/a&gt; discussion forum), a DPSI &lt;span style="font-weight: bold;"&gt;can&lt;/span&gt; be defined as UNIQUE &lt;span style="font-weight: bold;"&gt;if&lt;/span&gt; the index key contains all of the table's partition-by columns. Once again, consider the table partitioned on COL_X, COL_Y. A DPSI defined on COL_A, COL_Y, COL_B, COL_X could have the UNIQUE attribute because the index key contains all of the table's partition-by columns (and note that the partition-by columns do not have to be in any particular order within the DPSI's key -- they just have to be present within the key). Can there be multiple unique DPSIs defined on a DB2 V9 table-controlled partitioned table? Yes -- again, what's required is that the underlying table's partition-by columns be included in the key of a DPSI that is to be defined as UNIQUE.&lt;br /&gt;&lt;br /&gt;Finally: with regard to the CLUSTER attribute, you have flexibility with a table-controlled partitioned table that you don't have with an index-controlled partitioned table. For an index-controlled partitioned table, the partitioning index &lt;span style="font-weight: bold;"&gt;will be&lt;/span&gt; the table's clustering index. For a table-controlled partitioned table, &lt;span style="font-weight: bold;"&gt;any&lt;/span&gt; one index can be the table's clustering index (and of course, a table can only have one index with the CLUSTER attribute). The clustering index could be a partitioning index or a secondary index (whether a DPSI or an NPI). 
The ability to cluster a table with one key and partition it by another key is, in my opinion, one of the key advantages of table-controlled partitioning over index-controlled partitioning (other pluses include the ability to add partitions to a table-controlled partitioned table, and the ability to rotate partitions in a "first to last" manner).&lt;br /&gt;&lt;br /&gt;Is all that clear? I hope so. Table-controlled partitioning is a VERY good thing -- well worth the effort of getting your arms around the new rules regarding indexes on table-controlled partitioned tables.&lt;br /&gt;&lt;br /&gt;Throughout my 27 years in IT, I've enjoyed the constancy of opportunities to learn new things. I look forward to more of the same in 2010. Have fun ringing in the new year!&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-3476539165192280880?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/3476539165192280880/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/12/clearing-air-re-indexes-on-db2-for-zos.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3476539165192280880'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3476539165192280880'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/12/clearing-air-re-indexes-on-db2-for-zos.html' title='Clearing the Air Re: Indexes on DB2 for z/OS Partitioned Tables'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-6555650123127769130</id><published>2009-12-10T06:21:00.000-08:00</published><updated>2011-09-20T08:57:37.688-07:00</updated><title type='text'>DB2 for z/OS and the Disk Subsystem</title><content type='html'>&lt;span style="font-family:arial;"&gt;Earlier this week, we in the &lt;a href="http://www.praxiumgroup.com/adug.html"&gt;Atlanta DB2 Users Group&lt;/a&gt; were treated to a day with Roger Miller, the ebullient IBMer who, more than anyone else, has been the "face" of the DB2 for z/OS development organization since the product debuted more than 25 years ago (Roger spends a great deal of time in front of DB2 users -- at customer sites, at the briefing center at IBM's Silicon Valley Lab, and at user group meetings). Roger provided us with six hours of DB2 information, laced with a little bit of Shakespeare (appropriate, as Roger's presentation style certainly brings to mind the line, "All the world's a stage," from Shakespeare's &lt;span style="font-style: italic;"&gt;As You Like It&lt;/span&gt;). Some of the material that I found to be particularly interesting had to do with advances in disk I/O technology over the years, and the effect of those enhancements on DB2-related I/O performance. 
In this post, I'll share some of that information with you, and I'll include a few of my own observations on DB2 and the disk subsystem, based on my work with DB2 for z/OS-using organizations.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Faster and faster.&lt;/span&gt; Roger talked about the major gains seen in mainframe processor performance over the past 10 years (the "clock rate" of the engines in IBM's top-of-the-line z10 mainframe server is 4.4 GHz, versus 550 MHz for the G6 line in 1999), and he went on to point out that improvements in disk I/O performance have been just as dramatic. As I noted &lt;a href="http://catterallconsulting.blogspot.com/2008/10/so-what-makes-for-good-db2-io.html"&gt;in a blog entry on DB2 I/O performance&lt;/a&gt; that I posted last year, a target of 20-30 milliseconds of wait time per synchronous read (i.e., for an on-demand read of a single 4K DB2 page into memory from the disk subsystem) was pretty much the norm well into the 1990s (this figure is usually obtained from a DB2 monitor accounting report or an online display of DB2 accounting data). Nowadays, 2 milliseconds of wait time per synchronous read is fairly common. 
There are a number of factors behind this order-of-magnitude improvement in disk I/O performance, none more important than the large increase in disk controller cache memory sizes since the mid-90s (32 MB of cache once seemed like a lot in the old days -- now you can get more than 300 GB of cache on a control unit), and the development of sophisticated algorithms to optimize the effective use of the cache resource (enterprise-class disk controllers run these algorithms on multiple high-performance microprocessors).&lt;br /&gt;&lt;br /&gt;The cache impact on disk I/O performance was underscored by some numbers that Roger shared with our group: while the time required to retrieve a DB2 page from spinning disk will generally be between 4 and 8 milliseconds (a big improvement versus 1990s-era disk subsystems), a disk controller cache hit results in a wait time of 230 to 290 microseconds for a synchronous read (the low end of this range is for a system with the z10 High-Performance FICON I/O architecture, also known as zHPF).&lt;br /&gt;&lt;br /&gt;For DB2 prefetch reads, the gains are even more impressive. A decade or so ago, reading 64 4K pages into a DB2 buffer pool in 90 milliseconds was thought to be good performance. These days, it's possible to get 64 pages into memory in 1.5 milliseconds -- a 60X improvement over the old standard.&lt;br /&gt;&lt;br /&gt;Looking beyond cache, a fairly recent arrival on the enterprise storage scene is solid-state disk technology (SSD). Actually, SSD itself isn't all that new -- devices of this type have been in use for about 30 years. What's new is the cost-competitiveness of enterprise-class SSD systems -- still several times more expensive than a traditional spinning-disk system, on a cost-per-gigabyte basis, but close enough now to warrant serious consideration for certain performance-critical database objects that tend to be accessed in a random fashion (as opposed to a sequential processing pattern). 
Random access is the SSD sweet spot because the technology eliminates the seek time that elongates random I/O service times when pages are read from spinning disk; thus, the wait time for a DB2 synchronous read from an SSD device will likely be about 740-840 microseconds, versus the aforementioned 4-8 milliseconds for a read from spinning disk.&lt;br /&gt;&lt;br /&gt;64-bit addressing means that DB2 buffer pools can be way bigger than before, and that's good for performance because no I/O operation -- even one that results in a cache hit -- will come close to the speed with which a page in memory can be accessed. That said, a large-scale DB2 application accessing a multi-terabyte database is likely to drive a lot of I/O activity, even if you have a really big buffer pool configuration. Advanced disk storage and I/O subsystem technology make those page reads much less of a drag on application performance than they otherwise would be.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Disk space utilization: don't overdo it. &lt;/span&gt;High-performance storage systems cost plenty of money, and people who purchase the devices don't want to have a lot of unused disk capacity on the floor. Some, however, push the target space-utilization threshold too far. See, a storage system isn't just about application performance -- it's also about application availability. Fill your disk volumes too full, and you're asking for DB2 data set space allocation failures (and associated application outages). How full is too full? I can tell you that in my own experience, DB2 data set space allocation failures tend to be a problem when an organization goes for a disk space utilization rate of 90% or more (and keep in mind that we're not just talking about data set extensions related to data-load operations -- we're also talking about utility-related disk space usage for things like sort work files and online REORG shadow data sets). 
At the other end of the spectrum, I've seen a situation in which an organization set a 60% threshold for disk volume space utilization. This company's DBAs liked that policy a lot (it really cut down on middle-of-the-night rousting of whoever was on-call), but I can't say that I'd advocate such a target -- it seems too low from a cost-efficiency perspective.&lt;br /&gt;&lt;br /&gt;Where do I stand? I'm pretty comfortable with a disk space utilization target in the 70-80% range. I might lean towards the lower end of that range in a DB2 Version 9 environment, as partition-level online REORG jobs will reorganize non-partitioning indexes in their entirety, thereby necessitating more shadow data set space (a potentially compensating factor would be DB2 9 index compression, which can significantly reduce disk space requirements for index data sets). Something else to keep in mind: in addition to avoiding overly-high utilization of disk space, another good practice with regard to minimizing DB2 data set space allocation failures is to let DB2 determine the amount of space that will be requested when a data set has to be extended. This capability, sometimes called sliding-scale space allocation, was introduced with DB2 for z/OS Version 8. &lt;a href="http://catterallconsulting.blogspot.com/2009/09/db2-managed-disk-space-allocation.html"&gt;I blogged about it a few weeks ago&lt;/a&gt;, and I highly recommend its usage. People who have taken advantage of this functionality have expressed great satisfaction with the results: much less DBA time spent in managing DB2 data set allocation processing, and a sharp drop in the occurrence of DB2 space allocation-related failures. 
Check it out.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-6555650123127769130?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/6555650123127769130/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/12/db2-for-zos-and-disk-subsystem.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6555650123127769130'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6555650123127769130'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/12/db2-for-zos-and-disk-subsystem.html' title='DB2 for z/OS and the Disk Subsystem'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-6135422575328598175</id><published>2009-11-30T20:13:00.000-08:00</published><updated>2011-09-20T08:54:14.289-07:00</updated><title type='text'>DB2: DPSI Do, or DPSI Don't?</title><content type='html'>&lt;span style="font-family:arial;"&gt;One of the more interesting features delivered in recent years for DB2 for z/OS is the data-partitioned secondary index, or DPSI (pronounced "DIP-see"). 
Introduced with DB2 for z/OS Version 8, DPSIs are defined on table-controlled partitioned tablespaces (i.e., tablespaces for which the partitioning scheme is controlled via a table-level -- as opposed to an index-level -- specification), and are a) non-partitioning (meaning that the leading column or columns of the index key are not the same as the table's partitioning key), and b) physically partitioned along the lines of the table's partitions (meaning that partition 1 of a DPSI on partitioned table XYZ will contain entries for all rows in partition 1 of the underlying table -- and only for those rows).&lt;br /&gt;&lt;br /&gt;[For more information about partitioned and partitioning indexes on table-controlled partitioned tablespaces, &lt;a href="http://catterallconsulting.blogspot.com/2009/09/of-db2-for-zos-indexes-partitioneduh.html"&gt;see my blog entry on the topic&lt;/a&gt;, posted a few weeks ago.]&lt;br /&gt;&lt;br /&gt;Here's what makes DPSIs really interesting to me:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;They are a great idea in some situations.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;They are a bad idea in other situations.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;You may change the way you think about them in a DB2 9 context, versus a DB2 V8 environment.&lt;/span&gt;&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;DPSI do:&lt;/span&gt; a great example of very beneficial DPSI use showed up recently in the form of a question posted to the DB2-L discussion list (a great forum for technical questions related to DB2 for z/OS and DB2 for Linux, UNIX, and Windows -- visit &lt;a href="http://www.idug.org/cgi-bin/wa?A0=DB2-L"&gt;the DB2-L page&lt;/a&gt; on the Web to join the list). 
The DB2 DBA who sent in the question was working on the design of a data purge process for a particular table in a database managed by DB2 for z/OS V8. The tablespace holding the data was partitioned (in the table-controlled way), with rows spread across seven partitions -- one for each day of the week. The sole purge criterion was date-based, and the DBA was thinking of using a partition-level LOAD utility, with REPLACE specified and an empty input data set, to clear out a partition's worth of data (a good thought, as purging via SQL DELETE can be pretty CPU-intensive if there are a lot of rows to be removed from the table at one time). He indicated that several non-partitioning indexes (or NPIs) were defined on the table, and he asked for input from "listers" (i.e., members of the DB2-L community) as to the impact that these NPIs might have on his proposed partition-level purge plan.&lt;br /&gt;&lt;br /&gt;Isaac Yassin, a consultant and noted DB2 expert, was quick to respond to the question with a warning that the NPIs would have a very negative impact on the performance of the partition-level LOAD REPLACE operation. Isaac's answer was right on the money. It's true that the concept of a logical partition in an NPI (i.e., the index entries associated with rows located in a partition of the underlying table) makes a partition-level LOAD REPLACE compatible, from a concurrency perspective, with application access to other partitions in the target tablespace (a LOAD REPLACE job operating on partition X of table Y will get a so-called drain lock on logical partition X of each NPI defined on table Y). Still, technically feasible doesn't necessarily mean advisable. 
For a partitioned tablespace with no NPIs, a LOAD REPLACE with an empty input data set is indeed a very CPU- and time-efficient means of removing data from a partition, because it works at a data set level: the tablespace partition's data set (and the data set of the corresponding physical partition of each partitioned index) is either deleted and redefined (if it is a DB2-managed data set) or reset to empty (if it is a user-managed data set, or if it is DB2-managed and REUSE was specified on the utility control statement), and -- thanks to the empty input data set -- you have a purged partition at very little cost in terms of CPU consumption.&lt;br /&gt;&lt;br /&gt;Put NPIs in that picture, and the CPU and elapsed time story changes for the worse. Because the NPIs are not physically partitioned, the index-related action of the partition-level LOAD REPLACE job (the deletion of index entries associated with rows in the target table partition) is a page-level operation. That means lots of GETPAGEs (CPU time!), potentially lots of I/Os (elapsed time!), and lots of index leaf page latch requests (potential contention issue with respect to application processes concurrently accessing other partitions, especially in a data sharing environment). Nick Cianci, an IBMer in Australia, suggested changing the NPIs to DPSIs, and that's what the DBA who started the DB2-L thread ended up doing (as I confirmed later -- he's a friend of mine). 
With the secondary indexes physically partitioned along the lines of the table partitions, the partition-level LOAD REPLACE with the empty input data set will be what the DBA wanted: a very fast, very efficient means of removing a partition's worth of data from the table without impacting application access to other partitions.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;DPSI don't:&lt;/span&gt; there's a flip side to the partition-level utility benefits that can be achieved through the use of data partitioned (versus non-partitioned) secondary indexes, and it has to do with query performance. When one or more of a query's predicates match the columns of the key of a non-partitioned index, DB2 can quickly zero in on qualifying rows. If that same index is data-partitioned, and if none of the query's predicates reference the partitioning key of the underlying table, DB2 might not be able to determine that a partition of the table contains any qualifying rows without checking the corresponding partition of the DPSI (in other words, DB2 might not be able to utilize page range screening to limit the number of partitions that have to be searched for qualifying rows). If a table has just a few partitions, this may not be such a big deal. But if the table has hundreds or thousands of partitions, it could be a very big deal (imagine a situation in which DB2 has to search 1000 partitions of a DPSI in order to determine that the rows qualified by the query are located in just a few of the table's partitions -- maybe just one). In that case, degraded query performance might be too high of a price to pay for enhanced partition-level utility operations, and NPIs might be a better choice than DPSIs.&lt;br /&gt;&lt;br /&gt;That's exactly what happened at a large regional retailer in the USA. 
A DBA from that organization was in a DB2 for z/OS database administration class that I taught recently, and he told me that his company, on migrating a while back to DB2 V8, went with DPSIs for one of their partitioned tablespaces in order to avoid the BUILD2 phase of partition-level online REORG. [With NPIs, an online REORG of a partition of a tablespace necessitates BUILD2, during which the row IDs of entries in the logical partition (corresponding to the target tablespace partition) of each NPI are corrected to reflect the new physical location of each row in the partition. For a very large partition containing tens of millions of rows, BUILD2 can take quite some time to complete, and during that time the logical partitions of the NPIs are unavailable to applications. This has the effect of making the corresponding tablespace partition unavailable for insert and delete activity. Updates of NPI index key values are also not possible during BUILD2.]&lt;br /&gt;&lt;br /&gt;The DPSIs did indeed eliminate the need for BUILD2 during partition-level online REORGs of the tablespace (this because the physical DPSI partitions are reorganized in the same manner as the target tablespace partition), but they also caused response times for a number of queries that access the table to increase significantly -- this because the queries did not contain predicates that referenced the table's partitioning key, so DB2 could not narrow the search for qualifying rows to just one or a few of a DPSI's partitions. The query performance problems were such that the retailer decided to go back to NPIs on the partitioned table. To avoid the BUILD2-related data availability issue described above, the company REORGs the tablespace in its entirety (versus REORGing a single partition or a subset of partitions). 
[By the way, DPSI-related query performance problems were not an issue for the DBA who initiated the DB2-L thread referenced in the "DPSI do" part of this blog entry, because 1) the table in question is accessed almost exclusively for inserts, with only the occasional data-retrieval operation; and 2) the few queries that do target the table contain predicates that reference the table's partitioning key, so DB2 can use page-range screening to avoid searching DPSI partitions in which entries for qualifying rows cannot be present.]&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;DPSIs and DB2 9:&lt;/span&gt; the question as to whether or not you should use DPSIs is an interesting one, and it's made more so by a change introduced with DB2 9 for z/OS: the BUILD2 phase of partition-level online REORG has been eliminated (BUILD2 is no longer needed because the DB2 9 REORG TABLESPACE utility will reorganize NPIs in their entirety as part of a partition-level online REORG operation). Since some DB2 V8-using organizations went with DPSIs largely to avoid the data unavailability situation associated with BUILD2, they might opt to redefine DPSIs as NPIs after migrating to DB2 9, if DPSI usage has led to some degradation in query performance (as described above in the "DPSI don't" part of this entry). Of course BUILD2 avoidance (in a DB2 V8 system) is just one justification for DPSI usage. Even with DB2 9, using DPSIs (versus NPIs) is important if you need a partition-level LOAD REPLACE job to operate efficiently (see "DPSI do" above).&lt;br /&gt;&lt;br /&gt;Whether you have a DB2 V8 or a DB2 V9 environment, DPSIs may or may not be a good choice for your company (and it may be that DPSIs could be used advantageously for some of your partitioned tables, while NPIs would be more appropriate for other tables). 
Understand the technology, understand your organization's requirements (and again, these could vary by table), and make the choice that's right in light of your situation.&lt;br /&gt;&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/6135422575328598175/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/11/db2-dpsi-do-or-dpsi-dont.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6135422575328598175'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6135422575328598175'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/11/db2-dpsi-do-or-dpsi-dont.html' title='DB2: DPSI Do, or DPSI Don&apos;t?'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-6068949536764372656</id><published>2009-11-17T07:17:00.000-08:00</published><updated>2009-11-18T07:00:21.275-08:00</updated><title type='text'>Mainframe DB2 Data Serving: Vision Becomes Reality</title><content type='html'>&lt;span style="font-family:arial;"&gt;One of the things I really like about attending DB2 conferences is the face-to-face time I get with people who otherwise would be on the other 
side of e-mail exchanges. I get a whole lot more out of in-person communication versus the electronic variety. Case in point: at IBM's recent Information on Demand event in Las Vegas, I ran into a friend who is a DB2 for z/OS database engineering leader at a large financial services firm. He talked up a new mainframe DB2 data serving reference architecture recently implemented for one of his company's mission-critical applications, and did so with an enthusiasm that could not have been fully conveyed through a text message. I got pretty fired up listening to the story this DBA had to tell, in part because of the infectious excitement with which it was recounted, but also because the system described so closely matches a vision of a DB2 for z/OS data-serving architecture that I've had in mind -- and have advocated -- for years. To see that vision validated in the form of a real-world system that is delivering high performance and high availability in a demanding production environment really made my day. I am convinced that what the aforementioned financial services firm (hereinafter referred to as Company XYZ) is doing represents the future of mainframe DB2 as a world-class enterprise data-serving platform. 
Read on if you want to know more.&lt;br /&gt;&lt;br /&gt;Three characteristics of the reference DB2 for z/OS data architecture (so called because it is seen as the go-forward model by the folks at Company XYZ) really stand out in my mind and make it an example to be emulated:&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;It is built on a DB2 data sharing / parallel sysplex foundation, for maximum availability and scalability (not only that -- these folks have done data sharing Really Right, as I'll explain).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;It leverages Big Memory (aka 64-bit addressing) for enhanced performance.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The software stack on the mainframe servers is pretty short -- these are database machines, plain and simple.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-family:arial;"&gt;A little elaboration now on these three key aspects of Company XYZ's DB2 for z/OS reference architecture:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The robust foundation: a DB2 data sharing group on a parallel sysplex mainframe cluster.&lt;span style="font-weight: bold;"&gt; &lt;/span&gt;&lt;/span&gt;It's well known that a standalone System z server running z/OS and DB2 can be counted on to provide very high levels of availability and scalability for a data-serving workload. These core strengths of the mainframe platform are further magnified when concurrent read/write access to the database is shared by multiple DB2 members of a data sharing group, running in the multiple z/OS LPARs (logical partitions) and multiple System z servers of a parallel sysplex. 
You're not going to beat the uptime delivered by that configuration: planned outages for maintenance purposes are virtually eliminated (service levels of DB2, z/OS, and other software components can be updated, and server hardware can be upgraded, with no -- I mean zero -- interruption of application access to the database), and the impact of an unplanned failure of a DB2 member or a z/OS LPAR or a server is greatly diminished (only data pages and/or rows that were in the process of being changed by programs running on a failed DB2 subsystem are temporarily unavailable following the failure, and the retained locks on those pages and rows will usually be freed up within a couple of minutes via automatic restart of the failed member). And scalability? Up to 32 DB2 subsystems (which could be running on 32 different mainframe servers) can be configured in one data sharing group.&lt;br /&gt;&lt;br /&gt;Now, you can set up a DB2 data sharing group the right way, or the Really Right way. Company XYZ did it Really Right. Here's what I mean:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;More z/OS LPARs and DB2 members than mainframes in the sysplex.&lt;/span&gt; I like having more than one z/OS LPAR (and DB2 subsystem) per mainframe in a parallel sysplex, because 1) you can route work away from one of the LPARs for DB2 or z/OS maintenance purposes and still have access to that server's processing capacity, and 2) more DB2 members means fewer retained locks and quicker restart in the event of a DB2 subsystem failure.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Dynamic VIPA network addressing&lt;/span&gt;. 
Availability and operational flexibility are optimized when remote DRDA clients use a dynamic VIPA (virtual IP address) to connect to the DB2 data sharing group (as long as at least one member of the data sharing group is up, a connection request specifying the group's VIPA can be successfully processed). A sysplex software component called the Sysplex Distributor handles load balancing across DB2 members for initial connection requests from remote systems (these will often be application servers), while load balancing for subsequent requests is managed at the DB2 member level.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Internal coupling facilities&lt;/span&gt;. ICFs (basically, coupling facility control code running in an LPAR on a mainframe server) are less expensive than external coupling facilities, not only with respect to acquisition cost, but also in terms of environmental expenses (floor space, power, cooling). [It's true that if the mainframe containing the ICF holding the lock structure and the shared communications area (SCA) should fail, and if on that mainframe there is also a member of the DB2 data sharing group, the result will be a group-wide outage unless the lock structure and SCA are duplexed in the second ICF. Company XYZ went with system-managed duplexing of the lock structure and SCA (DB2 manages group buffer pool duplexing in a very low-overhead way). 
Some other organizations using ICFs exclusively (i.e., no external coupling facilities) decide not to pay the overhead of system-managed lock structure and SCA duplexing, on the ground that a) a mainframe server failure is exceedingly unlikely, b) the group-wide outage would only occur if a &lt;span style="font-style: italic;"&gt;particular&lt;/span&gt; mainframe (the one with the ICF in which the lock structure and SCA are located) were to fail, and c) the group-restart following a group-wide outage should complete within a few minutes. The right way to go regarding the use or non-use of system-managed lock structure and SCA duplexing will vary according to the needs of a given organization.]&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Taking advantage of 64-bit addressing.&lt;/span&gt; Each of the LPARs in the parallel sysplex on which Company XYZ's model DB2 for z/OS data-serving system is built has more than 20 GB of central storage, and each DB2 subsystem (there is one per LPAR) has a buffer pool configuration that exceeds 10 GB in size. In these days of Big Memory (versus the paltry 2 GB to which we were limited not long ago), I don't think of a production-environment DB2 buffer pool configuration as being large unless the aggregate size of all pools in the subsystem is at least 10 GB. The reduced level of disk I/O activity that generally comes with a large buffer pool configuration can have a significant and positive impact on both the elapsed time and CPU efficiency of data access operations.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Lean, mean, data-serving machines.&lt;/span&gt; A production instance of DB2 for Linux, UNIX, and Windows (LUW) usually runs on a machine that is a &lt;span style="font-style: italic;"&gt;dedicated&lt;/span&gt; data server -- data access code executes there, and that's it. Business-logic programs? They run on application servers. 
Presentation-logic programs? They might run on yet another tier of servers. The DB2 for LUW server Just Does Data. When I started my IT career in the early 1980s, a mainframe-based application was almost always entirely mainframe-based, by which I mean that all application functionality -- data access logic, business logic, and presentation logic -- was implemented in programs that ran on a mainframe server. Nowadays, I believe that the unmatched availability, scalability, reliability, and security offered by the System z platform is put to most advantageous use in the servicing of data access requests. In other words, I feel that a DB2 for z/OS system should be thought of, in an architectural sense, as a dedicated data server, just as we tend to think of DB2 for LUW systems (and other database management systems that run on Linux, UNIX, and/or Windows platforms) as dedicated data servers.&lt;br /&gt;&lt;br /&gt;That's how DB2 for z/OS functions in Company XYZ's reference architecture: it just does data. Consequently, the software stack on the data-serving mainframes is relatively short, consisting of z/OS, DB2, RACF (security management), a data replication tool, some system automation and  management tools, some performance monitoring tools, and little else. Transaction management is handled on application servers. Database access requests come in via the DB2 Distributed Data Facility (DDF), and much of the access logic is packaged in stored procedures (the preference at Company XYZ is DB2 9 native SQL procedures, because they perform very well and -- when invoked through calls that come through DDF -- much of their processing can be handled by zIIP engines).&lt;br /&gt;&lt;br /&gt;Does this system, which looks so good on paper, deliver the goods? Absolutely. Volume has been taken north of 1400 transactions per second with excellent response time, and my DBA friend is confident that crossing the 2000-trans-per-second threshold won't be a problem. 
On the availability side, Company XYZ is getting industry-leading uptime. The message: System z is more than just capable of functioning effectively as a dedicated data server -- it works exceptionally well when used in that way. This is a clean, modern architecture that leverages what mainframes do best -- scale, serve, protect, secure -- in a way that addresses a wide range of application design requirements.&lt;br /&gt;&lt;br /&gt;Here's a good coda for you: still at IOD in Las Vegas, and shortly after my conversation with the DBA from Company XYZ, I encountered another friend -- a lead DB2 technical professional at another company. He told me about the new DB2 for z/OS reference architecture that had recently been approved by his organization's IT executive management. The pillars of that architecture are a DB2 data sharing / parallel sysplex mainframe cluster, z/OS systems functioning as dedicated data servers, data requests coming in via the DB2 DDF, and data access logic packaged in DB2 9 native SQL procedures. I told this friend that he and his colleagues are definitely on the right track. 
It's a track that more and more DB2 for z/OS-using companies are traveling, and it could well be the right one for your organization.&lt;br /&gt;&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/6068949536764372656/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/11/mainframe-db2-data-serving-vision.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6068949536764372656'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6068949536764372656'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/11/mainframe-db2-data-serving-vision.html' title='Mainframe DB2 Data Serving: Vision Becomes Reality'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-3558234736129481063</id><published>2009-11-11T07:17:00.000-08:00</published><updated>2011-09-20T08:52:11.522-07:00</updated><title type='text'>DB2 9 for z/OS: Converting from External to Native SQL Procedures</title><content type='html'>&lt;span style="font-family:arial;"&gt;As some of you may know, I'm a big fan of the native SQL procedure functionality introduced with DB2 for z/OS Version 9 (I've 
written a number of blog entries on the subject, &lt;a href="http://catterallconsulting.blogspot.com/2008/11/db2-9-for-zos-stored-procedure-game.html"&gt;starting with one posted last year&lt;/a&gt;). Native SQL procedures offer a number of advantages versus external SQL procedures (formerly known simply as SQL procedures in pre-Version 9 DB2 environments), including (generally) better performance, zIIP engine eligibility when called from a remote client via DRDA, and simplified lifecycle processes (referring to development, deployment, and management). These advantages have plenty of folks looking to convert external SQL procedures to native SQL procedures, and that's fine, but some of these people are under the impression that the conversion process involves nothing more than dropping an external SQL procedure and re-issuing that routine's CREATE PROCEDURE statement, minus the EXTERNAL NAME and FENCED options (if either had been specified in creating the external SQL procedure). This may in fact do the trick for a very simple SQL procedure, but in many cases the external-to-native conversion will be a more involved process. In this post I'll provide some information as to why this is so, along with a link to a well-written "technote" on IBM's Web site that contains further details on the topic.&lt;br /&gt;&lt;br /&gt;First, a little more on this drop-and-recreate-without-EXTERNAL-NAME-or-FENCED business. It is true that, in a DB2 9 New Function Mode system, a SQL procedure (i.e., a stored procedure for which the routine source is contained within the CREATE PROCEDURE statement) will be external if it is created with the EXTERNAL NAME and/or FENCED options specified, and native if created with neither EXTERNAL NAME nor FENCED specified; however, it is not necessarily the case that an external SQL procedure re-created sans EXTERNAL NAME and FENCED will behave as you want it to when executed as a native SQL procedure. Why is this so? 
Well, some of the reasons are kind of obvious when you think about it. Others are less so. On the obvious side, think about options that you'd specify for an external SQL procedure (which ends up becoming a C language program with embedded SQL) at precompile time (e.g., VERSION, DATE, DEC) and at bind time (e.g., QUALIFIER, CURRENTDATA, ISOLATION). For a native SQL procedure, there's nothing to precompile (as there is no associated external-to-DB2 program), and the package is generated as part of CREATE PROCEDURE execution (versus by way of a separate BIND PACKAGE step). That being the case, these options for a native SQL procedure are specified via CREATE PROCEDURE options (some of which have names that are slightly different from the corresponding precompile or bind options, an example being the PACKAGE OWNER option of CREATE PROCEDURE, which corresponds to the OWNER option of the BIND PACKAGE command). Here's another reason to pay attention to these options of CREATE PROCEDURE when converting an external SQL procedure to a native SQL procedure: the default values for CURRENTDATA and ISOLATION changed to NO and CS, respectively, in the DB2 9 environment.&lt;br /&gt;&lt;br /&gt;A less-obvious consideration when it comes to external-to-native conversion of SQL procedures has to do with condition handlers. These are statements in the SQL procedure that are executed in the event of an error or warning situation occurring. External SQL procedures do not allow for nested compound SQL statements (a compound SQL statement is a set of one or more statements, delimited by BEGIN and END, that is treated as a block of code); so, if you had within an external SQL procedure a compound SQL statement, and you wanted within that compound SQL statement a multi-statement condition handler, you couldn't do that by way of a nested compound statement. 
What people would often do instead in that case is code the condition handler in the form of an IF statement containing multiple SQL statements. In converting such an external SQL procedure to a native SQL procedure, the IF-coded condition handler should be changed to a nested compound SQL statement set off by BEGIN and END (in fact, it would be a good native SQL procedure coding practice to bracket even a single-statement condition handler with BEGIN and END). This change would be very much advised not just because nested compound statements are allowed in a native SQL procedure, but also because, in a native SQL procedure, an IF statement intended to ensure execution of multiple statements in an IF-based condition handler (e.g., IF 1=1 THEN...) would itself clear the diagnostics area, thereby preventing (most likely) the condition handler from functioning as desired (in an external SQL procedure, it so happens that a trivial IF statement such as IF 1=1 will not clear the diagnostics area).&lt;br /&gt;&lt;br /&gt;Also in the not-so-obvious category of reasons to change the body of a SQL procedure when converting from external to native: resolution of unqualified parameter, variable, and column names differs depending on whether a SQL procedure is external or native. Technically, there's nothing to prevent you from giving to a parameter or variable in a SQL procedure a name that's the same as one used for a column in a table that the SQL procedure references. If a statement in an external SQL procedure contains a name that could refer to a variable or a parameter or a column, DB2 will, in processing that statement, check to see if a variable of that name has been declared in the SQL procedure. If a matching variable name cannot be found, DB2 will check to see if the name is used for one of the procedure's parameters. 
If neither a matching variable nor a matching parameter name is found, DB2 will assume that the name refers to a column in a table referenced by the procedure. If the same statement is encountered in a native SQL procedure, DB2 will first check to see if the name is that of a column of a table referenced by the procedure. If a matching column name is not found, DB2 will then look for a matching variable name, and after that for a matching parameter name. If no match is found, DB2 will return an error if VALIDATE BIND was specified in the CREATE statement for the native SQL procedure (if VALIDATE RUN was specified, DB2 will assume that the name refers to a column, and will return an error if no such column is found at run time). Given this difference in parameter/variable/column name resolution, it would be a good idea to remove ambiguities with respect to these names in an external SQL procedure prior to converting the routine to a native SQL procedure. This could be done either through a naming convention that would distinguish variable and parameter names from column names (perhaps by prefixing variable and parameter names with v_ and p_, respectively) or by qualifying the names. Variable names are qualified by the label of the compound statement in which they are declared (so if you're going to go this route, put a label before the BEGIN and after the END that frame the compound statement), parameter names are qualified by the procedure name, and column names are qualified by the name of the associated table or view.&lt;br /&gt;&lt;br /&gt;Then there's the matter of the package collection name that will be used when the SQL procedure is executed. For an external SQL procedure, this name can be specified via the COLLID option of the CREATE PROCEDURE statement. [If NO COLLID -- the default -- is specified, the name will be the same as the package collection of the calling program. 
If the calling program does not use a package, the SQL procedure's package will be resolved using the value of CURRENT PACKAGE PATH or CURRENT PACKAGESET, or the plan's PKLIST specification.] When a native SQL procedure is created, the name of the associated package's collection will be the same as the procedure's schema. In terms of external-to-native SQL procedure conversion, here's what that means:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;If external SQL procedures were created with a COLLID value equal to the procedure's schema, it's smooth sailing ahead.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;If external SQL procedures were created with a COLLID value other than the procedure's schema, a relatively minor adjustment is in order. This adjustment could take one of two forms: a) go with the one "root" package for the native SQL procedure in the collection with the name matching the routine's schema, and ensure that this collection will be searched when the procedure is called, or b) &lt;/span&gt;&lt;span style="font-family:arial;"&gt;add a SET CURRENT PACKAGESET statement to the body of the SQL procedure, specifying the collection name used for COLLID in creating the external SQL procedure, and (&lt;/span&gt;&lt;span style="font-family:arial;"&gt;via BIND PACKAGE COPY) place a copy of the "root" package of the native SQL procedure in that collection.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;If external SQL procedures were created with NO COLLID, there could be a good bit of related work in converting those routines to native SQL procedures, especially if a number of "variants" of an external SQL procedure's "root" package were generated and placed in different collections. 
The external SQL procedure package variants will have to be identified, SET CURRENT PACKAGESET will be needed in the native SQL procedure to navigate to the desired collection at run time (perhaps using a value passed by the caller as a parameter), and variants of the native SQL procedure's "root" package (again, that being the one in the collection with the name matching the procedure's schema) will need to be copied into the collections in which the external SQL procedure package variants had been placed.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;The information I've provided in this blog entry is not exhaustive with respect to external-to-native SQL procedure conversion -- my intent was to cover the major issues that should be taken into consideration in planning your conversion process. For more details, check out the excellent "technote" document written by Tom Miller, a senior member of IBM's DB2 for z/OS development team and an authority on SQL procedures (both external and native). To access this document, go to the &lt;a href="http://www-01.ibm.com/software/data/db2/support/db2zos/"&gt;DB2 for z/OS Support page&lt;/a&gt; on IBM's Web site, enter "external native" as your search terms, and click on the search button -- Tom's document should be at the top of the search result list.&lt;br /&gt;&lt;br /&gt;Native SQL procedures are the way of the future, and I encourage you to develop a plan for converting external SQL procedures to native SQL procedures (if you have any of the former). It's very do-able, and for some external SQL procedures the conversion will in fact be very straightforward. 
For others, more conversion effort will be required, but the payoff should make that extra effort worthwhile.&lt;br /&gt;&lt;/span&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/3558234736129481063/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/11/db2-9-for-zos-converting-from-external.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3558234736129481063'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3558234736129481063'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/11/db2-9-for-zos-converting-from-external.html' title='DB2 9 for z/OS: Converting from External to Native SQL Procedures'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-1080412495610804166</id><published>2009-11-03T12:37:00.000-08:00</published><updated>2009-11-03T18:50:25.599-08:00</updated><title type='text'>IBM IOD 2009 - Day 4 (Belated Re-Cap)</title><content type='html'>&lt;span style="font-family: arial;"&gt;Apologies for the delay in getting this entry posted to my blog -- the time since IBM's 2009 Information On Demand conference concluded on October 29 has 
been very busy for me. Now I have a little downtime, so I can share with you what I picked up on day 4 of the conference.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;"Not Your Father's Database System," indeed -&lt;/span&gt; Guy Lohman, of IBM's Almaden (California) Research Center, delivered a very interesting presentation on the Smart Analytics Optimizer, a just-around-the-corner product (meaning, not yet formally announced) about which you'll be hearing a lot in the weeks and months to come. &lt;/span&gt;&lt;span style="font-family: arial;"&gt;Developed jointly by the Almaden Research Center and IBM's Silicon Valley and Boeblingen (Germany) software labs,&lt;/span&gt;&lt;span style="font-family: arial;"&gt; the IBM Smart Analytics Optimizer (ISAO) is a business intelligence query-acceleration system that network-attaches to a mainframe server running DB2 for z/OS. The way it works: using a GUI, a DBA copies a portion of a data warehouse (one or more star schemas -- fact tables and their associated dimension tables) to the ISAO (in effect, you set up a data mart on the ISAO). Thereafter, queries that are submitted to DB2 (the ISAO is transparent from a user perspective) will be routed by DB2 to the ISAO if 1) the queries reference tables that have been copied to the ISAO, and 2) DB2 determines that they will run faster if executed on the ISAO. Here's the interesting part: the longer a query would run if executed in the DB2 system, the greater the degree of acceleration you'll get if it runs on the ISAO.&lt;br /&gt;&lt;br /&gt;When I say "acceleration," I mean big-time speed-up, as in &lt;span style="font-style: italic;"&gt;ten to one hundred times&lt;/span&gt; improvement in query run times (the ISAO "sweet spot" is execution of queries that contain aggregate functions -- such as AVG and SUM -- and a GROUP BY clause). How is this accomplished? 
The ISAO hardware is commodity stuff: multi-core microprocessors with a lot of server memory in a blade center configuration (and several of these blade centers can be tied together in one ISAO system). The query processing software that runs on the ISAO hardware is anything but commodity -- it's a built-from-the-ground-up application that implements a hybrid row-store/column-store in-memory data server. Want DBA ease-of-use? You've got it: there's no need to implement indexes or materialized query tables or any other physical database design extras in order to get great performance for otherwise long-running queries. This is so because the ISAO does a simple thing -- scan data in one or more tables -- in a very advanced, multi-threaded way to deliver &lt;span style="font-style: italic;"&gt;consistently&lt;/span&gt; good response time (typically less than 10 seconds) for most any query sent its way by DB2. [Caveat: As the ISAO does no I/Os (all data that it accesses is always in memory), it runs its CPUs flat-out to get a single query done as quickly as possible before doing the same for the next query; thus, if &lt;/span&gt;&lt;span style="font-family: arial;"&gt;queries are sent to the ISAO by DB2 at a rate that exceeds the rate at which the ISAO can process the queries, response times could increase to some degree -- this is just basic queuing theory.]  &lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;br /&gt;&lt;br /&gt;The ISAO is what's known as disruptive technology. As previously mentioned, you'll soon be hearing a lot more about it (the IOD session I attended was a "technology preview"). 
I'll be watching that space for sure.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;A DB2 for z/OS data warehouse tune-up -&lt;/span&gt; Nin Lei, who works at IBM's System z benchmark center in Poughkeepsie (New York), delivered a presentation on performance management of a data warehouse mixed query workload ("mixed" referring to a combination of short- and long-running queries). A couple of the points made in the course of the session:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;You might want to cap the degree of query parallelization on the system&lt;/span&gt; - There is a DB2 for z/OS ZPARM parameter, PARAMDEG, that can be used to set an upper limit on the degree to which DB2 will split a query for parallelized execution. For some time now, I've advocated going with a PARAMDEG value of 0 (the default), which leaves the max degree of parallelization decision up to DB2. Nin made a good case for setting PARAMDEG to a value equal to twice the number of engines in the z/OS LPAR in which DB2 is running. I may rethink my PARAMDEG = 0 recommendation.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-style: italic;"&gt;The WLM_SET_CLIENT_INFO stored procedure is available on the DB2 for z/OS platform, too&lt;/span&gt; - This stored procedure, previously available only on the DB2 for Linux/UNIX/Windows and DB2 for System i platforms, was added to mainframe DB2 V8 and V9 environments via the fix for APAR &lt;a href="http://www-01.ibm.com/support/docview.wss?uid=swg1PK74330"&gt;PK74330&lt;/a&gt;. WLM_SET_CLIENT_INFO can be used to change the value of the so-called client special registers on a DB2 for z/OS server (CURRENT CLIENT_ACCTNG, CURRENT CLIENT_USERID, CURRENT CLIENT_WRKSTNNAME, and CURRENT CLIENT_APPLNAME). 
This capability provides greater flexibility in resource management and monitoring with respect to a query workload.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;For fans of Big Memory -&lt;/span&gt; Chris Crone, Distinguished Engineer and member of the DB2 for z/OS team at IBM's Silicon Valley Lab, gave a presentation on 64-bit addressing in the mainframe DB2 environment. He said that development of this feature was motivated by a recognition that memory had become the key DB2 for z/OS system resource constraint as System z engines became faster and more numerous (referring to the ability to configure more central processors in a single z/OS image). Big DB2 buffer pools are needed these days because even a really fast I/O operation (involving a disk subsystem cache hit versus a read from spinning disk) can be painfully slow when a single mainframe engine can execute almost 1000 million instructions per second.&lt;br /&gt;&lt;br /&gt;Here are a few of the many interesting items of information provided in Chris's session:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;You can currently get up to 1.5 TB of memory on a System z server. 
Expect memory sizes of 3 TB or more in the near future.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;The largest buffer pool configuration (aggregate size of all active buffer pools in a subsystem) that Chris has seen at a DB2 for z/OS site is 40 GB.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;It is expected that the default RID pool size will be 400 MB in the next release of DB2 for z/OS (the RID pool in the DB2 database services address space is used for RID sort operations related to things such as multi-index access, list prefetch, and hybrid join).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;The maximum size of the EDM pool components (EDM pool, skeleton pool, DBD pool, and statement pool) is expected to be much larger in the next release of DB2 (commonly referred to as DB2 X -- we'll get the actual version number at announcement time).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;In the DB2 X environment, it's expected that 80-90% of the virtual storage needed for DB2 threads will be above the 2 GB "bar" in the DB2 database services address space. As a result, the number of threads that can be concurrently active will go way up with DB2 X (expect an upper limit of 20,000 for a subsystem, versus 2000 today).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;DB2 data sharing groups (which run in a parallel sysplex mainframe cluster) could get really big -- IBM is looking at upping the limit on the number of DB2 subsystems in a group (currently 32).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Solid state storage is going to be a big deal -- the DB2 development team is looking at how to best leverage this technology.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;After Chris's session, it was off to the airport to catch the red-eye back to Atlanta. 
I had a great week at IOD, and I'm looking forward to another great conference next year.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-1080412495610804166?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/1080412495610804166/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/11/ibm-iod-2009-day-4-belated-re-cap.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1080412495610804166'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1080412495610804166'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/11/ibm-iod-2009-day-4-belated-re-cap.html' title='IBM IOD 2009 - Day 4 (Belated Re-Cap)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-2574333167638318492</id><published>2009-10-28T20:34:00.000-07:00</published><updated>2011-09-20T08:48:38.550-07:00</updated><title type='text'>IBM IOD 2009 - Day 3</title><content type='html'>&lt;span style="font-family:arial;"&gt;Following is some good stuff that I picked up during the course of day 3 of IBM's 2009 Information on Demand conference:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;IBM Information Management software executives 
had some interesting things to say -&lt;/span&gt; IBM got some of us bloggers together with some software execs for a Q&amp;amp;A session. A few highlights:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Interest in DB2 pureScale, the recently announced shared-data cluster for the DB2/AIX/Power platform, is strong. Demo sessions at the conference this week were full-up.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;It used to be that organizations asked IBM about &lt;span style="font-style: italic;"&gt;products. &lt;/span&gt;These days, companies are increasingly likely to ask about &lt;span style="font-style: italic;"&gt;capabilities&lt;/span&gt;. IBM is responding by packaging software (and sometimes hardware) products into integrated offerings designed to fulfill these capability requirements.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;New products at the upper end of IBM's information transformation software stack are driving requirements at the foundational level of the stack (where you'll find the database engines such as DB2), and even into IBM's hardware platforms (such as the Power Systems server line).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Regarding software-as-a-service (SaaS) and cloud computing, IBM sees a "broadening of capabilities" with respect to software delivery and pricing models.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The IBM folks in the room were pretty keyed up about the Company's new &lt;a href="http://www.prnewswire.com/news-releases/ibm-brings-business-analytics-and-cloud-services-to-smarter-archiving-66011682.html"&gt;Smart Archive&lt;/a&gt; offerings, which can - among other things - drive cost savings by using discovery and analytics capabilities to determine which information (structured and unstructured data) an organization has to retain and 
archive.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Jeff Jonas, one of IBM's top scientists, talked about the huge increase in the amount of data streaming into many companies' systems (much of it from sensors that emit signals of various kinds). People may assume that their organization cannot manage this surge of information, but Jeff noted that the more data you get into your system, the faster things can go ("It's like a jigsaw puzzle: the more pieces you put together, the more quickly you can correctly place other pieces").&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Jeff also spoke of "enterprise amnesia": a firm has so much information with which to deal that it loses track of some of it. Consequently, a large retailer will sometimes hire a person who had previously been fired &lt;span style="font-style: italic;"&gt;for stealing from that same company&lt;/span&gt;.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Let's hear it for audience participation -&lt;/span&gt; I enjoyed delivering my presentation on DB2 for z/OS data warehouse performance. As usual, I got some great questions and comments from session attendees. After I mentioned that I'm usually comfortable with having more indexes on tables in a data warehouse versus an OLTP data-serving environment (I wrote of this in a &lt;a href="http://catterallconsulting.blogspot.com/2008/09/db2-for-zos-data-warehousing-query.html"&gt;blog entry&lt;/a&gt; posted last year), I was asked if that statement applied to data warehouses that are updated in near-real time relative to source data changes (something that more organizations are doing these days). 
My response: in a continuously-updated data warehouse (versus a data warehouse updated via an overnight extract/transform/load process), I'd probably be more conservative when it comes to indexing tables.&lt;br /&gt;&lt;br /&gt;After I'd covered DB2 query parallelism, a session attendee suggested that in a CPU-constrained mainframe DB2 data warehouse system, adding one or more zIIP engines and turning on query parallelism (something that probably wouldn't be activated in a system with little in the way of CPU head room) could provide a double benefit: more cycles to enable beneficial utilization of DB2's query parallelism capability, and a workload (parallelized queries) that could drive utilization of the cost-effective zIIPs. Spot on - couldn't have said it better myself (I wrote about query parallelism and zIIP engines in a comment that I added to a &lt;a href="http://catterallconsulting.blogspot.com/2008/08/db2-for-zos-performance-management-data.html"&gt;blog entry that I posted last year&lt;/a&gt;).&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Bernie Spang is a man on a mission -&lt;/span&gt; IBM's Director of Strategy and Marketing for InfoSphere and Information Management software wants companies to have trusted information. Too often, people confuse "trusted" with "secure."  "Secure" is important, but "trusted," in this context, refers to data that is reliable, complete, and correct - the kind of data on which you could confidently base important decisions. Bernie is out to make IBM's InfoSphere portfolio the go-to solution for organizations wanting to get to a trusted-information environment. There's a lot there: data architecting, discovery, master data management, and data governance are just a few of the capabilities that can be delivered by way of various InfoSphere offerings. 
It's all about getting a handle on the state of your data assets, rationalizing inconsistencies and discrepancies, and providing an interface that leads to agreed-upon "true" values (and this has plenty to do with integrating formerly siloed data stores). If you want to get your data house in order, there's a way to get that done.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Chris Eaton wants mainframe DB2 people to be at ease with DB2 for LUW lingo -&lt;/span&gt; Chris, one of the technical leaders in the DB2 for Linux, UNIX, and Windows development organization at IBM's Toronto Lab, knows that there are some DB2 for LUW concepts and terminologies that are a little confusing to mainframe DB2 folks, and he wants to clear things up. SQL data manipulation language statements are virtually identical across DB2 platforms, but there are some differences in the DBA and systems programming views of things on the mainframe and LUW platforms (largely a reflection of significantly different operating system and file system architectures and interfaces). In a session on DB2 for LUW for mainframe DB2 people, Chris explained plenty. Some examples:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;A copy of DB2 running on an LUW server is an instance.  A copy of DB2 running on a mainframe server is a subsystem.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;A DB2 for z/OS subsystem has its own catalog. A DB2 for LUW database, several of which can be associated with a DB2 instance, has its own catalog (and its own transaction log - something else that's identified with a subsystem in a mainframe DB2 environment).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;So-called installation parameter values are associated with a DB2 subsystem on a mainframe (most of these values are specified in a module known as ZPARM). 
The bulk of DB2 for LUW installation parameter values are specified at the database level.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;A DB2 for z/OS thread is analogous to a DB2 for LUW agent, and a DB2 for LUW thread is analogous to a mainframe DB2 TCB or SRB (i.e., a dispatchable piece of work in the system).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;A mainframe DB2 data set would be called a file in a DB2 for LUW environment, and a mainframe address space would be referred to as memory on an LUW server.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The DB2 for LUW lock list is what mainframe people would call the IRLM component of DB2.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The DB2 for LUW command FORCE APPLICATION is analogous to the -CANCEL THREAD command in a DB2 for z/OS environment.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;Chris also passed on some hints and tips:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Self-tuning memory management (the ability for DB2 to automatically monitor and adjust amounts of memory used for things such as page buffering, package caching, and sorting) works very well on the LUW platform, and Chris recommends use of this feature.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Chris favors the use of DMS files (versus SMS) in a DB2 for LUW system, and the use of automatic-storage databases over DMS files for most objects in a DB2 database.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Chris is big on the use of administrative views as a means of easily obtaining DB2 for LUW performance and system information using SQL.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;Tomorrow is the last day of the conference. 
More blogging to come.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-2574333167638318492?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/2574333167638318492/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/ibm-iod-2009-day-3.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2574333167638318492'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2574333167638318492'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/ibm-iod-2009-day-3.html' title='IBM IOD 2009 - Day 3'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-4020592815259888989</id><published>2009-10-27T21:49:00.000-07:00</published><updated>2011-09-20T08:43:48.649-07:00</updated><title type='text'>IBM IOD 2009 - Day 2</title><content type='html'>&lt;span style="font-family:arial;"&gt;Another day done at IBM's 2009 Information on Demand Conference - another day of learning more about DB2, and about technologies used at higher levels of the information transformation software stack. 
Some take-aways from today's sessions follow.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;A good DB2 9 for z/OS migration story -&lt;/span&gt; Maria McCoy of the UK Land Registry delivered a very good presentation on her organization's DB2 9 for z/OS migration experience. The Land Registry has one of the world's largest operational (versus decision support) databases, holding almost 40 TB of data. On top of that, the agency recently launched its first &lt;span style="font-style: italic;"&gt;public&lt;/span&gt; e-business application, a consequence being that downtime is even less well tolerated than before.&lt;br /&gt;&lt;br /&gt;The Land Registry runs DB2 in data sharing mode on a parallel sysplex mainframe cluster. The number of DB2 subsystems across all of the Land Registry's environments (test, development, and production) is about 30.&lt;br /&gt;&lt;br /&gt;The DB2 9 migration effort went off well, largely because the Land Registry stays pretty current on system maintenance, with quarterly upgrades of the DB2 service level (Maria confirmed what others have said, indicating that DB2 9 is very stable at the F906 maintenance level and beyond).&lt;br /&gt;&lt;br /&gt;For the Land Registry, the primary DB2 9 migration drivers included:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;XML support&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Spatial data support (spatial awareness had historically been achieved by way of user-written code)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Extensions to online schema changes&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Further exploitation of 64-bit addressing&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Improved utility CPU efficiency&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Indexes on column 
expressions&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Real-time statistics (especially the capability of identifying indexes that have gone a long time without being used for data access)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Larger index page sizes (offering potentially reduced GETPAGE activity due to a  reduction in the  number of index levels)&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;An important part of the Land Registry's DB2 9 migration planning effort involved identification of third-party tools used with DB2. The agency identified 42 such products, among these being monitors, middleware, compilers, utilities, file management systems, and legacy software.&lt;br /&gt;&lt;br /&gt;A dedicated test system proved to be very valuable. The LoadRunner tool was used to drive online transaction test scripts.&lt;br /&gt;&lt;br /&gt;Following the migration to DB2 9, the Land Registry converted all existing simple tablespaces to segmented tablespaces (a good idea, as simple tablespaces can no longer be created in a DB2 9 environment). 
Maria and her colleagues thought that there were no simple tablespaces in their DB2 databases, but it turned out that 41 such tablespaces did exist.&lt;br /&gt;&lt;br /&gt;Among the DB2 9 new features put to good use by the Land Registry are the following:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Indexes on column expressions (this yielded a HUGE decrease in CPU time for a batch job containing a query with a predicate involving a column in a substring function)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Clone tables (a table-data-change outage that formerly ran to 5 hours, due to the time needed to load new data and to inspect the newly loaded data for correctness, went to 2 seconds)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Rename column&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Rename index&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;The DB2 9 migration project went from beginning to end in about 12 months. The Land Registry ran with DB2 9 in Conversion Mode for about 2 months in each of their DB2 environments prior to moving to Enabling New Function Mode and then to New Function Mode.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The current Information Management software scene - &lt;/span&gt;Arvind Krishna, General Manager of IBM's Information Management software business, spoke during a keynote presentation of the challenges faced by organizations dealing with explosive information growth (an estimated 15 petabytes of new data are generated daily - that's about 5.5 exabytes per year). He went on to talk about the benefits of "workload-optimized systems" being brought to market now by IBM - systems composed of fully integrated hardware and software offerings that are optimized for specific workloads. 
An example of a workload-optimized system is IBM's Smart Analytics system, which provides hardware and a comprehensive software stack (with data management, warehousing, and analytics software) in one package that can be quickly and effectively deployed.&lt;br /&gt;&lt;br /&gt;Ross Mauri, General Manager of IBM's Power Systems business (formerly called System p), provided information on the current state of the Power line (currently utilizing generation 6 of IBM's RISC-based microprocessor family, with generation 7 now in beta test mode). Ross said that "Power is everywhere," not only in IBM's Power servers but also in supercomputers, cars, all three of the major electronic game consoles, and the Mars Rover ("we have 100% market share on Mars"). From around 17% market share a few years ago, Power systems now has more than 40% of the market for RISC processor-based servers. Particular strengths of the server line include efficiency ("work per watt," as Ross put it), virtualization, management, and resiliency.&lt;br /&gt;&lt;br /&gt;Arvind Krishna closed out the keynote session with remarks that spotlighted IBM's close partnership with SAP (the companies have joint development teams and tens of thousands of mutual customers).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;DB2 9 for z/OS native SQL procedures are looking very good -&lt;/span&gt; Philip Czachorowski of Fidelity Investments presented information related to his company's early experiences with the native SQL procedures feature of DB2 9 for z/OS (I've blogged a number of times on this technology, beginning with &lt;a href="http://catterallconsulting.blogspot.com/2008/11/db2-9-for-zos-stored-procedure-game.html"&gt;an entry posted late last year&lt;/a&gt;). 
Philip talked about DDL extensions that help with the migration of native SQL procedures from development to test to production environments (statements such as ALTER PROCEDURE ADD VERSION and ALTER PROCEDURE ACTIVATE VERSION), and the new SET CURRENT ROUTINE VERSION statement that can facilitate the testing of a new native SQL procedure (Philip also stressed the importance of having a good naming convention for SQL procedure version identifiers, so you'll know what you're executing when running tests).&lt;br /&gt;&lt;br /&gt;Performance data presented during the session was most interesting. Philip showed monitor data for one case in which total class 1 CPU time (from a DB2 monitor accounting report) for a native SQL procedure was only 4% greater than that of a comparable stored procedure written in COBOL.&lt;br /&gt;&lt;br /&gt;Near the end of his presentation, Philip mentioned that the DB2_LINE_NUMBER clause of the GET DIAGNOSTICS statement could be very helpful in terms of resolving native SQL procedure code problems.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Stream analytics is way cool -&lt;/span&gt; Just before dinner, those of us participating in the IOD Blogger Program had an opportunity to spend an hour with IBMers who are working on the System S "stream analytics" technology on which IBM's &lt;a href="http://www-01.ibm.com/software/data/infosphere/streams/"&gt;InfoSphere Streams&lt;/a&gt; offering is based. This is cool stuff: stream analytics software, running under Linux on commodity hardware, can be used to analyze vast amounts of incoming data - often signal data produced by various sensors - to identify events or episodes as they occur, thereby enabling a very rapid response capability. 
The data could be structured or unstructured, and might consist of  hydrophone-captured sounds (picking up, perhaps, the clicking of dolphins), radio astronomy signals, manufacturing data, vehicular traffic activity, weather data, telephone communications, or human-health indicators. Picking up on this latter stream category, a specialist in neonatology who has worked with the IBM System S team spoke of her work involving the  monitoring of premature infants' vital signs. An electrocardiogram can generate 500 data signals per second, and there are other vital-sign streams that can be analyzed as well (e.g., blood flow data), and all this can be multiplied by several infants in one area being monitored concurrently (important, as an infection in one child could quickly spread to others). System S stream analytics technology is demonstrating the potential to save lives by taking anomaly detection time from 24 hours (using traditional monitoring methods) to seconds.&lt;br /&gt;&lt;br /&gt;The IBM researchers then demonstrated the use of System S stream analytics software  to analyze automobile traffic patterns in Stockholm, Sweden (500,000 pieces of GPS data per second).&lt;br /&gt;&lt;br /&gt;The scalability of the System S technology is remarkable, the programming interface is surprisingly straightforward (people familiar with object-oriented programming languages tend to become proficient in a couple of weeks), and the GUI is pretty intuitive. Who knows how broadly applicable it might end up being (early adopters are largely in the government and health-care industries, but oil companies are also showing interest)? Watch this space, folks.&lt;br /&gt;&lt;br /&gt;That's it for now. 
Tomorrow morning I'll deliver a presentation on DB2 for z/OS data warehouse performance, and tomorrow evening I'll try to post another blog entry.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-4020592815259888989?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/4020592815259888989/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/ibm-iod-2009-day-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/4020592815259888989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/4020592815259888989'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/ibm-iod-2009-day-2.html' title='IBM IOD 2009 - Day 2'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-5854850978039774814</id><published>2009-10-26T22:51:00.000-07:00</published><updated>2011-09-20T08:41:21.384-07:00</updated><title type='text'>IBM IOD 2009 - Day 1</title><content type='html'>&lt;span style="font-family:arial;"&gt;Greetings from Las Vegas.  Day one of IBM's 2009 Information on Demand conference was a good one. In this post I'll share with you some of the more interesting items of information I picked up in today's sessions. 
I'll post at the end of days 2, 3, and 4, as well.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The Big Theme: "Information-Led Transformation" -&lt;/span&gt; Ambuj Goyal, General Manager for Business Analytics and Process Optimization in IBM's Software Group, kicked off the Grand Opening session with an overview of the Company's Information Management software strategy. He pointed out that IBM has spent $12 billion on its information-on-demand software stack over the past 4 years: $8 billion on acquisitions (such as Cognos and SPSS) and $4 billion on internal development and related activity. That's some serious money, and it reflects the confidence of IBM's executives that we are on the front end of a major change in the way that organizations manage and leverage their data assets. Ambuj professed that information-led transformation would be even bigger in scope and impact than the enterprise resource planning software wave that got started about 20 years ago.&lt;br /&gt;&lt;br /&gt;Companies, said Ambuj, are transitioning from information-focused projects to the information-based enterprise - an operational model characterized by the use of rationalized and trusted data to make timely,  effective, and &lt;span style="font-style: italic;"&gt;predictive&lt;/span&gt; (versus reactive)  decisions. Frank Kern, a Senior Vice President in IBM's Global Business Services division, joined Ambuj onstage and continued to underscore the importance of organizations developing a predictive decision-making capability. He described  a new service line, Business Analytics and Optimization, that will be delivered by a  4000-strong team of consultants. 
He also talked about the irony of executives reporting  a lack of information needed to make good decisions, even as their organizations are awash in data as never before.&lt;br /&gt;&lt;br /&gt;During a panel discussion, several IT executives from IBM customer companies shared their experiences related to the use of advanced analytics software:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Shirley Lady of Blue Cross and Blue Shield, a health insurance company, said that "what if" analysis is more important to her organization than ever before, given the major market changes that could result from health care reform in the United States.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Nihad Aytaman, of clothing retailer Elie Tahari, talked about the importance of quick (as well as effective) decision making to his company's efforts to successfully "chase the business" in the very fluid world of fashion.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Debbie Oshman of Chevron, a  global energy company, mentioned that her organization was pursuing information-driven enterprise transformation, after having used information well in a project-by-project way. 
Process optimization and risk mitigation were described as being two key analytics-driven initiatives underway at Chevron.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;Following the panel discussion, Arvind Krishna, IBM's General Manager for Information Management software, talked about new developments in his part of the business: DB2 pureScale (about which I &lt;a href="http://catterallconsulting.blogspot.com/2009/10/wow-db2-data-sharing-comes-to-aixpower.html"&gt;recently blogged&lt;/a&gt;), smart archiving, smart analytics, a master information hub, InfoSphere Streams, Cognos content analytics, and two recent acquisitions: SPSS and ILOG.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;An interesting press conference -&lt;/span&gt; I joined journalists, analysts, and fellow bloggers for a press conference featuring several senior IBM executives. Announced during the conference were new analytics applications, enhanced stream computing technology, and new Master Information Hub software (you can view the &lt;a href="http://www-03.ibm.com/press/us/en/pressrelease/28686.wss"&gt;press release&lt;/a&gt; on IBM's Web site).&lt;br /&gt;&lt;br /&gt;Steve Mills, Senior Vice President and IBM Software Group Executive, talked about a new information-related transformation in light of transformations past:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The PC transformation of the 1980s that enabled personal delivery of information.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The World Wide Web transformation of the 1990s that made incredible levels of connectivity a reality.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The process-focused transformation of the past decade, which led to improvements in efficiency and effectiveness.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;Going on now, said Mills, is an 
information-&lt;span style="font-style: italic;"&gt;led&lt;/span&gt; (not just information-focused) transformation, through which organizations are seeking to understand not only processes, but the environment in which they operate. The urgency of this transformation is prompted by two questions: 1) can your organization move &lt;span style="font-style: italic;"&gt;fast&lt;/span&gt; enough, and 2) can it move &lt;span style="font-style: italic;"&gt;smart&lt;/span&gt; enough? Helping to make the transformation possible are historically low costs for units of compute power;  human interface improvements, such as dashboards, that enable people to quickly absorb and act on information; and the ability to physically place information capture and analysis technology where it could not be placed before. Mills said that 35,000 IBM people are involved in building the Company's "portfolio of capability" regarding advanced analytics - a portfolio that includes software technology and the know-how to put that technology to work for organizations in all kinds of industries.&lt;br /&gt;&lt;br /&gt;The press conference concluded with a question-and-answer session. 
In responding to questions asked by session attendees:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Arvind Krishna said that IBM's Information Management software business had grown at a 14% annual rate over the past three years - in a market that grew at a 6% rate.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;It was mentioned that over 50 OEM vendors are delivering analytics capabilities via cloud computing systems using IBM's Cognos Express offering.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Steve Mills indicated that data governance is increasingly seen by organizations as being a mission-critical competence.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Ambuj Goyal said that even as IBM works to be a one-stop-shop provider of the information transformation software stack (software that manages, archives, cleanses, catalogs, integrates, and analyzes data), the Company designs its products to use open standards that make it easier for organizations to use a mix of IBM and third-party products in a stack.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;It was explained that IBM is delivering software that can be used to analyze unstructured data on the Web (e.g., what customers are saying about your company's products), with an emphasis on combining that information with information generated using in-house data.&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;DB2 "X" is coming along just fine - &lt;/span&gt;The next release of DB2 for z/OS is mostly coded, with activity now focused mainly on testing. Jeff Josten, a Distinguished Engineer on the DB2 for z/OS development team at IBM's Silicon Valley Lab, provided a preview of this coming attraction. 
A few highlights (CAVEAT: this information is truly of a preview nature - it should not be considered as final until the product is generally available):&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;A further exploitation of 64-bit addressing should dramatically increase the number of threads that can be concurrently active in a DB2 subsystem.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;DB2 X is expected to reduce the CPU consumption of a typical DB2 workload by 5-10% as compared to a DB2 Version 9 environment.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Native SQL stored procedures might get a performance boost of up to 10-20% versus a Version 9 environment.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;LOBs (large objects) that can fit onto a page will be in-lined in a base table versus being physically stored in a separate LOB tablespace.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Dynamic statement caching will be more effective for SQL statements that contain literal values versus host variables.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;There will be a conversion path available for changing simple, segmented, and "classic" partitioned tablespaces to universal tablespaces.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;RUNSTATS will provide an "auto stats" option.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Temporal data support will enable DB2 X to be significantly more useful in the management of data that has "effective" dates (e.g., a change to an insurance policy will become effective on such-and-such a date) and/or which is updated or deleted at some time following initial insert into a table (DB2 will maintain a history of such data changes).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Building of 
a tablespace compression dictionary will not require a utility execution.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;DB2 X will enable data-masking to be specified at the column level.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Private protocol will go away (DRDA is much better anyway), and so will the ability to bind a DBRM directly into a plan (these should be bound into packages anyway).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Rick Bowers has his priorities in order -&lt;/span&gt; IBM's Director of DB2 for z/OS development stated repeatedly during his "trends and directions" presentation that "it's all about the customer." If you're a DB2 user, Rick's in your corner. A few of his comments during the session:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Enhancing the capabilities of DB2 for z/OS in data warehouse environments is a big priority for Rick's team.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;Migration of DB2 for z/OS-using organizations to DB2 9 is proceeding apace.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;100% of the top 100 DB2 for z/OS-using organizations are using DB2 Version 8 or beyond, as are 99+% of the top 200.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Got mashups?  They're easier than ever now -&lt;/span&gt; IBM gave us blog-folk a preview of two new products that will be formally announced: Version 2 of the IBM Mashup Center, and Cognos 8 Mashup Service. 
The latter makes it very easy to use Cognos-generated report data in a mashup application, and the former greatly simplifies creation, cataloging, discovery, and reuse of mashups (mashups provide a quick and convenient means of combining data from two or more sources, either external or internal to an organization - example sources could be a sales performance report and a CRM system). With these new products (and they don't have to be used together), if you have existing data sources (internal and/or external) you can combine data into useful new representations in very little time and at very little cost. The GUI of the Mashup Center is very intuitive (program development skill is not a prerequisite for productive use of the product), and the product's flexibility is impressive: sources can include MQ queues and RSS feeds (among other things - including, of course, Cognos 8 reports via the Cognos 8 Mashup Service), and you can implement security controls that will govern the use of various mashups. Cool!&lt;br /&gt;&lt;br /&gt;That's the wrap-up of my day-one experience at IOD. 
I'll post day two information tomorrow.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-5854850978039774814?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/5854850978039774814/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/ibm-iod-2009-day-1.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5854850978039774814'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5854850978039774814'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/ibm-iod-2009-day-1.html' title='IBM IOD 2009 - Day 1'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-171002816081901918</id><published>2009-10-13T08:51:00.000-07:00</published><updated>2009-10-13T08:53:32.762-07:00</updated><title type='text'>Wow - DB2 Data Sharing Comes to the AIX/Power Platform</title><content type='html'>&lt;span style="font-family:arial;"&gt;When my youngest child - now 8 years old - was younger still, I would read her a story at bedtime. One of her favorites was "Lilly's Purple Plastic Purse," by Kevin Henkes. Lilly was enthralled by her teacher, Mr. Slinger, and expressed her admiration for him pithily: "'Wow,' said Lilly. 
That was just about all she could say. 'Wow.'"&lt;br /&gt;&lt;br /&gt;That pretty much sums up my reaction to IBM's recent announcement of &lt;a href="http://www-03.ibm.com/press/us/en/pressrelease/28593.wss"&gt;DB2 pureScale&lt;/a&gt;, which essentially brings mainframe DB2 data sharing technology to IBM's Power Systems platform running the AIX operating system: "Wow."&lt;br /&gt;&lt;br /&gt;I got involved with DB2 for z/OS data sharing in the mid-1990s, while DB2 Version 4 (in which the feature was delivered) was still in the beta-test phase (I was in IBM's DB2 National Technical Support group at the time). I remember being pretty excited about shared-data architecture (in which multiple DB2 systems share concurrent read/write access to a database stored on shared disk volumes) and the potential for the solution to meet formerly unattainable objectives in terms of workload growth and database uptime. Sure enough, potential became reality, and DB2 for z/OS data sharing on the IBM Parallel Sysplex mainframe cluster became (and still is) the gold standard for enterprise data-serving scalability and availability. It was a huge jump forward, capability-wise, for DB2 on the mainframe platform.&lt;br /&gt;&lt;br /&gt;Now here we are in 2009, and DB2 for AIX has taken that big leap forward. pureScale doesn't just meet the shared-data competition in the UNIX marketplace - it changes the game. It will deliver levels of scalability and availability that simply were not possible before. How? Simple: it utilizes the same centralized shared-memory approach to global lock management and data coherency that has worked wonders for organizations that run DB2 for z/OS in data sharing mode. 
Here's the deal: if you're going to give multiple data servers read/write access to one database, you have a couple of choices when it comes to keeping the different data servers from trashing the consistency of said database: you can have a node directly communicate with all the other nodes regarding data rows that it's changing and data that it has cached locally, or you can go with a centralized approach, in which a data server node posts global lock and global buffer pool information to structures residing in devices that provide a shared-memory resource to the group (and here the term "global" refers to lock and buffer pool information that a node has to make known to other nodes so as to preserve data integrity). The problem with the former solution (one node directly communicates global lock and page cache information to others) is that it doesn't scale well - go beyond 4 nodes or so, and the increase in overhead largely negates the processing capacity of an added node.&lt;br /&gt;&lt;br /&gt;DB2 for z/OS data sharing, as people familiar with the technology know, was implemented with the centralized approach. The structures are known as the lock structure, the group buffer pools, and the shared communications area (the latter used to keep member nodes apprised of database objects in an exception state), and the shared-memory devices are called coupling facilities (originally external devices that are increasingly implemented as logical partitions within mainframe servers). The lock structure functions, in part, as a "bulletin board," to which nodes can post global lock information - and at which nodes can access global lock information - in microseconds (the lock structure also stores information about currently held data-changing locks, which helps to speed recovery in the event of a node failure). 
Members of a DB2 for z/OS data sharing group use the group buffer pools in the coupling facilities to "register interest" in database pages cached locally, so that they can be informed when a locally cached page has been changed by another member (changed pages are written to group buffer pools as part of commit processing, and they can be accessed there by other members in MUCH less time than a retrieval from disk - even from disk controller cache - would require).&lt;br /&gt;&lt;br /&gt;This centralized global lock and global page cache mechanism has scaled up very effectively, and I mean in the real world, not just in a demo setting: at the company for which I worked when I was on the user side of the DB2 community, we had a 9-way mainframe DB2 data sharing group that handled a huge workload with very little overhead. I know of a 15-member DB2 for z/OS data sharing group at a large bank, and there could be systems out there with more nodes than that. Centralized management of global lock and page cache information has also paid dividends in the area of availability: the impact of a node failure is minimized, and restart of a failed DB2 member is accelerated. At my former workplace, a member of a production DB2 for z/OS data sharing group terminated abnormally in the middle of the day. It was automatically - and quickly - restarted, and the failure event did not impact our clients (the application workload continued to run on the surviving nodes while the failed member was restarted).&lt;br /&gt;&lt;br /&gt;With pureScale, DB2 for AIX users can realize these same shared-data scalability and availability advantages. DB2 pureScale basically provides the functionality that coupling facilities do in a mainframe DB2 data sharing group, housing the global lock, group buffer pool, and shared communications area structures in a super-high-performance shared memory resource. 
Member systems running AIX and DB2 9.8 (the DB2 release that enables participation in a multi-node shared-data system) connect to the pureScale servers, and the increase in overall processing power as nodes are added is almost linear. In other words, the overhead of concurrent read/write access to the shared database increases only slightly as nodes are added to the group (IBM has demonstrated pureScale configurations with scores of nodes). On the availability front, there's automatic and fast (seconds) restart of a DB2 member in the event of a failure, fast (seconds) release of locks on rows that were being changed by a member DB2 at the time of a failure, and automatic routing of incoming transactions to other members during restart of a failed DB2 member.&lt;br /&gt;&lt;br /&gt;Allow me to restate the point for emphasis: the DB2 for z/OS data sharing/Parallel Sysplex architecture has proven itself for nearly 15 years in tremendously demanding conditions, in terms of throughput and availability requirements, at sites all over the world. Developers at IBM's labs in Toronto (DB2 for Linux/UNIX/Windows) and Austin (Power Systems) have worked for years to bring that architecture to the DB2/AIX/Power platform, leveraging the advanced technology originally brought to the market by their colleagues in San Jose (DB2 for z/OS) and Poughkeepsie (System z). DB2 pureScale is the result of those efforts. 
As DB2 for z/OS data sharing took the already-high scalability and availability standards of the mainframe DB2 platform and raised them still higher, so pureScale will do for DB2 on AIX/Power, the platform that already sets the standard for UNIX system reliability.&lt;br /&gt;&lt;br /&gt;There are other great parallels between DB2 for z/OS data sharing and DB2 pureScale: both are application-transparent, both provide system-managed workload balancing, and both allow for very granular increases in system processing capacity.&lt;br /&gt;&lt;br /&gt;There's plenty more to report about pureScale, and I'll try to provide additional information in future posts. For now, I'll highlight a few items that I hope will be of interest to you:&lt;br /&gt;&lt;/span&gt; &lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;The DB2 for LUW data partitioning feature (DPF) isn't going anywhere.&lt;/span&gt; Particularly for data warehouse/business intelligence systems, the shared-nothing clustering architecture implemented via DPF is the best scale-out solution.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Transaction log record sequencing is there.&lt;/span&gt; As in a DB2 for z/OS data sharing group, each member in a DB2 pureScale configuration logs changes made by SQL statements executed on that member. 
If a table that has been changed by SQL statements executing on multiple members has to be recovered, a log record sequence numbering mechanism and read access for any given member to all other members' log files ensure that the roll-forward operation (following restoration of a backup) will apply re-do records in the correct order.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;pureScale licensing is very flexible.&lt;/span&gt; If you need to temporarily add processing capacity to a DB2 pureScale configuration to handle a workload surge, you pay for the additional peak capacity only when you use it.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Talk to your tool vendors about pureScale.&lt;/span&gt; IBM has been working for some time with several vendors of DB2 for LUW tools, to help them get ready for pureScale.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;span style="font-family:arial;"&gt;As I mentioned, there's more information to come. Stay tuned. 
This is big.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-171002816081901918?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/171002816081901918/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/wow-db2-data-sharing-comes-to-aixpower.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/171002816081901918'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/171002816081901918'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/wow-db2-data-sharing-comes-to-aixpower.html' title='Wow - DB2 Data Sharing Comes to the AIX/Power Platform'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-4290323391697385248</id><published>2009-10-07T11:46:00.000-07:00</published><updated>2009-10-07T21:05:43.796-07:00</updated><title type='text'>Thoughts on DB2 for z/OS BACKUP SYSTEM and RESTORE SYSTEM</title><content type='html'>&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;span style="font-family: arial;"&gt;Recently I worked with an organization that is planning an implementation of SAP's ERP application, with the associated database to be managed by DB2 9 for z/OS. 
This impending SAP installation was a major impetus for getting DB2 9 in-house, thanks largely to the significant enhancements delivered in that release for the BACKUP SYSTEM and RESTORE SYSTEM utilities. In this entry, I'll provide a brief overview of BACKUP SYSTEM and RESTORE SYSTEM, describe new features of these utilities in a DB2 9 environment, and pass on some related information of the "news you can use" variety.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;BACKUP SYSTEM and RESTORE SYSTEM are prime examples of user-driven advances with respect to DB2 functionality. Near the end of the 1990s, several large companies using SAP with DB2 for z/OS met with IBM and SAP to press for a solution to a recovery-preparedness challenge. For these organizations, the traditional means of DB2 data backup - the COPY utility - was not satisfactory, owing to the fact that an SAP-DB2 database could contain tens of thousands of objects. System-wide, disk volume-level backups could be efficiently created using the FlashCopy technology of IBM disk subsystems (other disk storage vendors such as EMC and HDS offer a similar capability), but this approach had two significant drawbacks: 1) it was an outside-of-DB2 process, and 2) recovery of a database (using a system-wide volume-level backup) to a consistent state depended on the existence of system-wide quiesce points established via the DB2 commands -SET LOG SUSPEND and -SET LOG RESUME (the former of these commands was quite disruptive in a high-volume OLTP application environment). 
The SAP- and DB2-using companies wanted a system-wide backup solution that would take advantage of FlashCopy (or equivalent) technology, be executable through DB2, and allow for recovery with consistency to a user-specified point in time.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family: arial;"&gt;IBM's response to this request was delivered with DB2 for z/OS Version 8, in the form of the aforementioned BACKUP SYSTEM and RESTORE SYSTEM utilities. One of the big advantages of this new solution was the elimination of the formerly required system-wide quiesce points for recovery of a database to a consistent state: with BACKUP SYSTEM and RESTORE SYSTEM, the database could be recovered to &lt;span style="font-style: italic;"&gt;any&lt;/span&gt; user-specified prior point in time (prior to currency - more on that momentarily), as DB2 would use information in the recovery log to back out any data-changing units of work that were in-flight at the designated point in time (underscoring the importance of this being a DB2-managed backup and recovery process). DB2 9 delivered some important enhancements to the functionality of these utilities, as described below:&lt;/span&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Object-level recovery from a system-level backup -&lt;/span&gt; With DB2 Version 8, it was only possible to perform a system-level recovery using a system-level backup. In a DB2 9 environment, the RECOVER utility can be used to recover an individual object (e.g., a tablespace) or a set of objects using a system-level backup made with the BACKUP SYSTEM utility.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Recover to currency using RESTORE SYSTEM -&lt;/span&gt; Before DB2 9, recovery using RESTORE SYSTEM had to be to a point in time prior to the end of the log. 
Now, a recovery to currency can be performed by specifying SYSPITR=FFFFFFFFFFFF on the control statement of the DSNJU003 utility (change log inventory) that is executed prior to running RESTORE SYSTEM.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Support for incremental FlashCopy -&lt;/span&gt; Initially, FlashCopy - as invoked through the BACKUP SYSTEM utility - will create a full copy of the source volumes on the designated target volumes. Logically speaking, this copy operation is almost instantaneous. The physical copy operation takes more time to complete. If data is written to a source volume while the physical copy on the target is being made, the storage subsystem will check to see if the to-be-changed source-volume track has been copied to the target. If it hasn't, the source track will be copied to the target volume before being changed by the pending write operation. After the initial full copy has been completed (in the physical sense), subsequent copies can be incremental, with only the tracks changed since the last backup being copied from source to target. Thus, the workload on the I/O subsystem is reduced.&lt;/span&gt;&lt;span style="font-family: arial;"&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;&lt;span style="font-weight: bold;"&gt;Support for backup to tape -&lt;/span&gt; With DB2 Version 8, getting a disk copy generated via BACKUP SYSTEM to tape was a manual process. With DB2 9, BACKUP SYSTEM can be used to copy a source backup to tape as it's being written to the target volumes. 
Alternatively, a backup can be written to tape some time after the physical copy to the target volumes has completed.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;Now, the BACKUP SYSTEM and RESTORE SYSTEM utilities are very nice pieces of DB2 functionality, but there are some things you should think about before using them. First of all, even though DB2 9 enables object-level recovery from a system-level backup, you SHOULD NOT expect BACKUP SYSTEM to eliminate the need to use the COPY utility - this for two reasons:&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;You'll need to use COPY to establish a new "recovery base" for an object following a LOAD REPLACE (or a LOAD RESUME with LOG NO) or an offline REORG with LOG NO (an inline image copy is required if you run an online REORG job).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Object-level recovery from a system-level backup is currently not possible for a data set that has been moved since the system-level backup was created (as would be the case for most REORG and LOAD REPLACE operations).  This restriction will be lifted in the near future, but it's there today.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-family: arial;"&gt;Second, the  use of BACKUP SYSTEM and RESTORE SYSTEM requires that the active log and BSDS data sets  have an ICF catalog that is separate from the one used for the DB2 catalog and directory and user/application data sets.  In other words, it's not sufficient that these two categories of DB2 data sets (active log/BSDS and catalog/directory/application) use different aliases that point to the same ICF catalog.  The ICF catalogs themselves have to be different (and  the two categories of DB2 data sets have to be in different SMS storage groups).  
If your BSDS and active log data sets are currently in the same ICF catalog as the DB2 catalog and directory and application objects (i.e., application tablespaces and indexes), you'll need to separate them before using BACKUP SYSTEM and RESTORE SYSTEM. Moving the BSDS and active log data sets to a separate ICF catalog basically involves doing the following:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Define the new ICF catalog and the new high-level qualifier for the BSDS/active logs.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Define the new BSDS and the active log data sets in it.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;WHILE DB2 IS DOWN, copy the active log and BSDS data sets to the new data sets (the ones in the separate ICF catalog). The best way to do this is probably to use DFDSS to copy the data sets with the RENAMEU parm specified to change the high-level qualifier.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Then use the DB2 change log inventory utility to fix up the BSDS with the correct log ranges for the current active log (the non-reusable one).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;For the rest of the log data sets, you can use the change log inventory utility to delete the old ones (with the old high-level qualifier) from the BSDS and add the new ones (using the NEWLOG statement) without specifying log ranges.  
This saves a bunch of time and effort, and you needn't worry about DB2 being able to find log data sets with records in a certain range - as long as the other logs have been archived, DB2 can find the ranges in the archive log.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Note: you do NOT have to use the change log inventory utility with the NEWCAT statement to change the VCAT name in the BSDS, because that VCAT name is the one used for the catalog and directory, and the one you're changing is for the BSDS and active log data sets.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family: arial;"&gt;Then start DB2.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family: arial;"&gt;Finally, a note on the frequency of BACKUP SYSTEM execution. Organizations that are using the BACKUP SYSTEM utility today are generally running it twice daily for their production systems.  IBM recommends keeping at least two system-level backups on disk, but not all  organizations can afford to allocate the amount of disk resources required for this.  You might end up keeping the most recent backup on disk, with previous backups going to tape.&lt;br /&gt;&lt;br /&gt;I encourage you to check out BACKUP SYSTEM and RESTORE SYSTEM. 
If you have a whole lot of objects in your DB2 database, these utilities could make your backup and recovery processes a whole lot simpler.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-4290323391697385248?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/4290323391697385248/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/thoughts-on-db2-for-zos-backup-system.html#comment-form' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/4290323391697385248'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/4290323391697385248'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/10/thoughts-on-db2-for-zos-backup-system.html' title='Thoughts on DB2 for z/OS BACKUP SYSTEM and RESTORE SYSTEM'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-6376384244687513480</id><published>2009-09-30T14:32:00.000-07:00</published><updated>2011-09-20T08:38:02.632-07:00</updated><title type='text'>DB2-Managed Disk Space Allocation - An Overlooked Gem?</title><content type='html'>&lt;span style="font-family:arial;"&gt;Last week I was again teaching a DB2 for z/OS database administration course, this time for the half of the client's 
DBA team that minded the store while their colleagues attended the prior week's class. As I mentioned in &lt;a href="http://catterallconsulting.blogspot.com/2009/09/of-db2-for-zos-indexes-partitioneduh.html"&gt;my last blog entry&lt;/a&gt;, the organization had recently completed the migration of their production system to DB2 for z/OS Version 8, so we spent a good bit of class time discussing new features and functions delivered in that release of the product. In such a situation, it's always interesting to see what really grabs the attention of class participants. This time, a top attention-grabber was the automatic data set extent-size management capability introduced with DB2 V8 and enabled by default with DB2 9 for z/OS (more on this DB2 9 change momentarily). Facial expressions during our talks on this topic communicated the unspoken question, "DB2 can &lt;span style="font-style: italic;"&gt;do&lt;/span&gt; that?"&lt;br /&gt;&lt;br /&gt;It does seem almost too good to be true, especially to a DBA who has long spent more time than desired in determining appropriate primary and - especially - secondary space allocation quantities for DB2-managed (i.e., STOGROUP-defined) tablespace and index data sets. As you're probably aware, a non-partitioned tablespace (this will typically be a segmented tablespace) can grow to 64 GB in size by occupying thirty two 2 GB data sets; however, the first 2 GB data set has to fill up before DB2 can go to a second (and the second has to fill up before DB2 can go to a third, and so on), and DB2 won't fill up that 2 GB data set if it runs into the data set extent limit (255 extents). The same applies to non-partitioned indexes, except that the data set size is not set at 2 GB - rather, it's determined by the PIECESIZE specification on a CREATE INDEX or ALTER INDEX statement. 
In either case, you want to set the primary and secondary space allocation quantities (PRIQTY and SECQTY on the CREATE or ALTER statement for the object) so that the object can reach the data set size limit (and thus expand to multiple data sets) without first running into the aforementioned limit on the number of extents for a data set. A partition of a partitioned tablespace or partitioned index isn't going to spread across multiple data sets, but you still want to be able to reach the maximum data set size (specified via DSSIZE on the CREATE TABLESPACE statement) before hitting the data set extent limit.&lt;br /&gt;&lt;br /&gt;What should your PRIQTY and SECQTY amounts be? Too small, and you might hit the extent limit before reaching the desired maximum data set size. Too large, and you might end up with a good bit of wasted (allocated and unused) space in your tablespaces and indexes. Even if reaching a maximum data set size is not an issue (and it won't be for smaller tablespaces and indexes), you still want to strike a balance between too many extents and too much unused space. Multiply this by thousands of objects in a database, and you've got a big chunk of work on your hands.&lt;br /&gt;&lt;br /&gt;Enter the "sliding scale" secondary space allocation algorithm introduced with DB2 for z/OS V8. 
Here's how it works: First, set three ZPARM parameters as follows (all are on the DSNTIP7 panel of the DB2 installation CLIST):&lt;br /&gt;&lt;/span&gt; &lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;TSQTY (aka "table space allocation"): &lt;span style="font-weight: bold;"&gt;0&lt;/span&gt; (this is the default value)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;IXQTY (aka "index space allocation"): &lt;span style="font-weight: bold;"&gt;0&lt;/span&gt; (this is the default value)&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;MGEXTSZ (aka "optimize extent sizing"): &lt;span style="font-weight: bold;"&gt;YES&lt;/span&gt; (for DB2 V8 the default value is NO, and for DB2 V9 it's YES - this is what I meant by my "enabled by default" comment regarding DB2 V9 in the opening paragraph of this blog entry)&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt; &lt;span style="font-family:arial;"&gt;Then - and here's the really good part - you alter an existing tablespace or index with PRIQTY -1 and SECQTY -1, and voila! DB2 will manage primary and secondary allocation sizes for you. Specifically, the primary allocation for the tablespace or index will be 1 cylinder (thanks to your having specified 0 for the TSQTY and IXQTY ZPARMs, as recommended above), and &lt;span style="font-style: italic;"&gt;the initial secondary space allocation&lt;/span&gt; will also be 1 cylinder (note that the primary space allocation for a LOB tablespace in this scenario would be 10 cylinders versus 1). After that, subsequent extents - up to the 127th - for the data set will be increasingly larger, with the sizes determined by a sliding-scale algorithm used by DB2. The size of extents beyond the 127th will be fixed, depending on the initial size of the data set: 127 cylinders for data set sizes up to 16 GB, and 559 cylinders for 32 GB and 64 GB data sets. 
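&lt;br /&gt;&lt;br /&gt;As a minimal sketch of the ALTER approach just described, the statements might look like the following. The database, tablespace, schema, and index names are hypothetical (made up for illustration), and the sketch assumes a non-partitioned, STOGROUP-defined tablespace along with the TSQTY, IXQTY, and MGEXTSZ settings listed above:&lt;br /&gt;&lt;pre&gt;
-- Hypothetical object names; substitute your own.
-- Assumes DB2-managed (STOGROUP-defined) data sets.
ALTER TABLESPACE DBSALES.TSORDERS
  PRIQTY -1
  SECQTY -1;

ALTER INDEX PRODSCHEMA.IXORDERS1
  PRIQTY -1
  SECQTY -1;
&lt;/pre&gt;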
For new tablespaces and indexes, DB2-managed primary and secondary space allocation sizing is enabled by simply not including the PRIQTY and SECQTY clauses in the CREATE TABLESPACE or CREATE INDEX statement.&lt;br /&gt;&lt;br /&gt;Pretty great, huh? No muss, no fuss with regard to space allocation for DB2-managed data sets. Best of all, it works. When you let DB2 handle data set extent sizing, it is highly unlikely that you'll hit the data set extent limit before reaching the maximum data set size, and the start-small-and-slowly-increase approach to secondary allocation requests keeps wasted space to a minimum. What I find interesting is the fact that many DB2 people don't know about this great DBA labor-saving device. In a recent thread on the DB2-L discussion list, Les Pendlebury-Bowe, a DB2 expert based in the UK, referred to DB2-managed data set space allocation as "a real success story that never seems to get much press." In that same thread, DB2 jocks Myron Miller (a Florida-based consultant) and Roger Hecq (with financial firm UBS in Connecticut) added their endorsements of DB2-managed space allocation. Myron noted that with a specification of PRIQTY -1 and SECQTY -1, "I never have to even worry about the number of rows at any time in the tablespace" (and keep in mind that the -1 values are for ALTER TABLESPACE or ALTER INDEX - as noted previously, just don't specify PRIQTY and SECQTY on CREATE statements to enable DB2 management of space allocation for new objects). Roger stated that he's been pleased to "let DB2 do all the work" related to disk space management, and that his organization has "not had any issues" with their use of this DB2 capability.&lt;br /&gt;&lt;br /&gt;[By the way, if you are not a &lt;a href="http://www.idug.org/cgi-bin/wa?A0=DB2-L"&gt;DB2-L&lt;/a&gt; subscriber, you should be. 
It's a great - and free - DB2 technical resource.]&lt;br /&gt;&lt;br /&gt;There you have it - a real gem of a DB2 feature that has been overlooked by plenty of DB2 people. It's easy to use, and people who have implemented DB2 management of data set space allocation like it a lot. Give it a go.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-6376384244687513480?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/6376384244687513480/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/09/db2-managed-disk-space-allocation.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6376384244687513480'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6376384244687513480'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/09/db2-managed-disk-space-allocation.html' title='DB2-Managed Disk Space Allocation - An Overlooked Gem?'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-5158345389107322250</id><published>2009-09-20T18:55:00.000-07:00</published><updated>2011-09-20T08:35:35.930-07:00</updated><title type='text'>Of DB2 for z/OS Indexes, PartitioneDUH and Otherwise</title><content type='html'>&lt;span 
style="font-family:arial;"&gt;Last week I taught a DB2 for z/OS database administration class for an organization that fairly recently completed the migration of their production environment to DB2 V8 in Conversion Mode. That being the case, a good bit of class time was spent in discussion of new features and functions introduced with DB2 V8. One topic of particular interest to the class attendees was table-controlled partitioning. Our conversations on this subject became rather amusing, for reasons I'll describe momentarily. First, a little background.&lt;br /&gt;&lt;br /&gt;For years and years, there was only one way to partition a mainframe DB2 table: you created a tablespace with the NUMPARTS clause (through which you specified the number of partitions into which a table's rows would be divided), created a table in the tablespace, and created on the table a partitioning index. The index controlled the partitioning of the data, as it was at this level that you defined the partitioning key (composed of one or more columns) and the limit values for each partition (e.g., with a partitioning key of CUSTOMER_NUMBER, partition 1 might hold rows with a CUSTOMER_NUMBER value of '05000000' or less, while partition 2 would hold rows with CUSTOMER_NUMBER values larger than '05000000' and no larger than '10000000', and so on).&lt;br /&gt;&lt;br /&gt;The table-controlled partitioning feature of DB2 V8 was a big improvement over index-controlled partitioning, largely because it provided users with the ability to partition data with one key while clustering data within partitions by way of another key. With index-controlled partitioning, the partitioning index was also the clustering index, like it or not. With table-controlled partitioning, assignment of rows to partitions based on a key value is a table-specified thing, and clustering is an index-defined thing. 
So, if you want rows to be divided among partitions by date (perhaps  one month per  partition), you specify that on the CREATE TABLE statement, and if you want rows within a partition (i.e., for a given month) to be ordered by CUSTOMER_NUMBER, you accomplish that  by defining an index, with the CLUSTER clause, on the CUSTOMER_NUMBER column of the table. That kind of set-up is often ideal for a data warehouse database, allowing a query searching across several months to be parallelized by DB2, while providing good locality of reference for the split queries within partitions if a certain customer number or set of customer numbers is targeted.&lt;br /&gt;&lt;br /&gt;Table-controlled partitioning also delivered very useful features such as dynamic addition of a partition to a table, and a ROTATE PARTITION FIRST TO LAST option of ALTER TABLE that enables the use of a fixed number of partitions for, say, a rolling 52 weeks of data (every week, data in the "oldest week" partition is purged and archived, and the empty partition is adjusted to receive data for the upcoming week). Still, with all this goodness there comes the task of learning to speak of indexes on table-controlled partitioned tables in a new way. Formerly, a partitioned tablespace had one partitioning index - that being the one by which the partitioning key and partition limit key values were specified. The partitioning  index was also the only one on the table that could itself be partitioned (i.e., with index entries spread across index partitions that matched up with the corresponding partitions of the tablespace). 
With a table-controlled partitioned table, any index that starts with the partitioning key column (or columns) is referred to as a partitioning index, and any index that is defined with the PARTITIONED clause (whether or not it starts with the partitioning key) will have its entries divided among index partitions that correspond to the table's partitions.&lt;br /&gt;&lt;br /&gt;Thus, the amusing (to me) discussions in the aforementioned class about indexes  defined on table-controlled partitioned tables: to make ourselves clear, we very strongly emphasized the last consonant sound of the type of index about which we were speaking, as in:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;DBA:&lt;/span&gt; Is a partitioninGUH index necessarily partitioneDUH?&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Me:&lt;/span&gt; No. If a partitioninGUH index is defined without the partitioneDUH clause, it will not be partitioneDUH.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;DBA: &lt;/span&gt;Why would someone create a partitioninGUH index without making it partitioneDUH?&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Me:&lt;/span&gt; Good question.  I'd think that you'd always want a partitioninGUH index to be partitioneDUH.&lt;br /&gt;&lt;br /&gt;This kind of exchange would crack me up (I'm easily amused). Anyway, the discussions were much appreciated on my part, as an interactive class is a fun class for me. It's also rewarding to help people explore  the possible uses of new DB2 features (think about the new flexibility of partitioning in a DB2 V8 or V9 environment, especially in light of the fact that you can now specify up to 4096 partitions for a single table - talk about slicing and dicing). 
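&lt;br /&gt;&lt;br /&gt;To make the partitioninGUH versus partitioneDUH distinction concrete, here is a rough sketch of the date-partitioned, customer-clustered arrangement described earlier. All object names are hypothetical, only three monthly partitions are shown, and the tablespace is assumed to have been created already with a matching NUMPARTS value:&lt;br /&gt;&lt;pre&gt;
-- Hypothetical names throughout; DBSALES.TSSALES is assumed
-- to be a partitioned tablespace created with NUMPARTS 3.
CREATE TABLE SALES
  (SALE_DATE       DATE          NOT NULL,
   CUSTOMER_NUMBER CHAR(8)       NOT NULL,
   AMOUNT          DECIMAL(11,2) NOT NULL)
  IN DBSALES.TSSALES
  PARTITION BY (SALE_DATE)
   (PARTITION 1 ENDING AT ('2009-07-31'),
    PARTITION 2 ENDING AT ('2009-08-31'),
    PARTITION 3 ENDING AT ('2009-09-30'));

-- Partitioning (starts with the partitioning key) AND partitioned
CREATE INDEX IXSALES1
  ON SALES (SALE_DATE)
  PARTITIONED;

-- Partitioned but NOT partitioning; CLUSTER asks DB2 to keep rows
-- in customer-number order within each partition
CREATE INDEX IXSALES2
  ON SALES (CUSTOMER_NUMBER)
  CLUSTER
  PARTITIONED;

-- Partitioning but NOT partitioned (no PARTITIONED clause)
CREATE INDEX IXSALES3
  ON SALES (SALE_DATE, CUSTOMER_NUMBER);
&lt;/pre&gt;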
Note, if you're on or moving to DB2 for z/OS V9, that all of this partitioninGUH and partitioneDUH talk is relevant to the new partition-by-range universal tablespaces, but not to partition-by-growth universal tablespaces, as there is no sense of a partitioning key with respect to the latter. I wrote about partition-by-growth tablespaces &lt;a href="http://catterallconsulting.blogspot.com/2009/02/db2-partition-by-growth-tables-very.html"&gt;in a previous entry&lt;/a&gt;, and you can check that out if you're interested in learning more about this new type of database object.&lt;br /&gt;&lt;br /&gt;That's all for now.  Thanks for stopping by the bloGUH.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-5158345389107322250?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/5158345389107322250/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/09/of-db2-for-zos-indexes-partitioneduh.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5158345389107322250'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5158345389107322250'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/09/of-db2-for-zos-indexes-partitioneduh.html' title='Of DB2 for z/OS Indexes, PartitioneDUH and Otherwise'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-2490586422395832723</id><published>2009-09-07T15:04:00.000-07:00</published><updated>2011-09-20T08:33:15.570-07:00</updated><title type='text'>OLTP and BI on the Same DB2 for z/OS System (Part 2)</title><content type='html'>&lt;span style="font-family:arial;"&gt;A few days ago, I posted &lt;a href="http://catterallconsulting.blogspot.com/2009/08/oltp-and-bi-on-same-db2-for-zos-system.html"&gt;part one&lt;/a&gt; of a 2-part entry on the subject of using the same DB2 for z/OS system (meaning a single logical DB2 system image - it could be a multi-subsystem DB2 data sharing group on a parallel sysplex) for both OLTP and business intelligence (BI) workloads. In that entry I focused on minimizing OLTP-BI workload contention on a number of levels: the disk subsystem, the DB2 buffer pools, DB2 locks, and CPU. With regard to that last contention category, I mentioned that "&lt;/span&gt;&lt;span style="font-family:arial;"&gt;an important aspect of managing CPU contention between OLTP and BI applications running on the same system is the management of DB2 query parallelization, particularly as it pertains to the BI queries." In this part 2 post, I'll expand on that statement.&lt;br /&gt;&lt;br /&gt;DB2 query parallelization is a very good thing when it comes to improving the performance of queries that involve scanning large numbers of data and/or index pages. In case you're not familiar with this DB2 feature or you need a little refresher, query parallelism has been around for quite some time, having been delivered in the mid-1990s with DB2 for z/OS Version 4. 
The technology enables DB2 to take a particular query and - on determining that parallelization would be beneficial for run-time reduction - split it into several tasks that can be executed concurrently on different engines within a mainframe server (or even on several servers, if we're talking about sysplex query parallelism in a DB2 data sharing group). Depending on the nature of the query, DB2 might start returning rows to the requester as they are qualified by the split queries, or the split-query result sets may be consolidated  before any rows are returned (as when a result set sort is required or an aggregate function such as SUM is utilized). The larger the number of pieces into which a query is split, the greater the potential is for better response time.&lt;br /&gt;&lt;br /&gt;Generally speaking, parallelized queries split along tablespace partition lines, so greater degrees of parallelism can be expected when target tablespaces have a lot of partitions (assuming that qualifying rows will come from multiple partitions) and the mainframe server has a pretty good number of fast CPUs (the number of CPUs is, of course, NOT an upper bound on the degree of query parallelization, as the split queries are likely to be I/O bound and the CPU portion of these can be interleaved on one engine as I/O wait events occur). So, into how many pieces might DB2 split a query? The answer to that question depends in large part on the setting of two DB2 ZPARMs (i.e., subsystem-level parameters): CDSSRDEF and PARAMDEG.&lt;br /&gt;&lt;br /&gt;CDSSRDEF specifies the default value of the CURRENT DEGREE special register for a DB2 subsystem. This value will be 1 if you don't change it, and that means that a DYNAMIC query will not be parallelized by DB2 unless it is preceded by the SQL statement SET CURRENT DEGREE = 'ANY' (a static SELECT statement will be a  candidate for parallelization if it's associated with a package bound with the DEGREE(ANY) specification). 
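&lt;br /&gt;&lt;br /&gt;For a dynamic query, the statement-level switch mentioned above could be sketched like this (the SALES table and its columns are hypothetical, used only for illustration):&lt;br /&gt;&lt;pre&gt;
-- Make subsequent dynamic queries on this connection
-- candidates for parallelization
SET CURRENT DEGREE = 'ANY';

-- Hypothetical BI-style query spanning several partitions
SELECT CUSTOMER_NUMBER, SUM(AMOUNT)
  FROM SALES
 WHERE SALE_DATE BETWEEN '2009-07-01' AND '2009-09-30'
 GROUP BY CUSTOMER_NUMBER;

-- Switch parallelization candidacy back off
SET CURRENT DEGREE = '1';
&lt;/pre&gt;Setting the special register only makes a query a candidate; DB2 still decides whether splitting it is worthwhile.&lt;br /&gt;&lt;br /&gt;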
The default CDSSRDEF value of 1 is the right one for many - and perhaps most - situations because it gives you statement-level control over the use of parallelization by DB2 for dynamic queries (in a BI environment queries tend to be dynamic). Making ALL dynamic queries candidates for parallelization by setting CDSSRDEF to ANY would increase CPU overhead for your BI workload. Why? Because DB2 would have to consume extra cycles just to determine whether or not parallelization would be beneficial for each and every dynamic query. When the determination is "no, it would not" (as would likely be the case for a query targeting a non-partitioned table or for a query that would retrieve rows from one partition of a partitioned table), that extra CPU consumption in query optimization will not yield a benefit in terms of query execution time.&lt;br /&gt;&lt;br /&gt;That said, sometimes a specification of ANY for CDSSRDEF is necessary for dynamic query parallelization, because the BI queries may be generated by PC-based end-user tools that do not allow for insertion of a SET CURRENT DEGREE = 'ANY' statement. If you have a DB2 data sharing group and your BI queries and your OLTP transactions run on different members, you can have CDSSRDEF = ANY on the BI-supporting subsystem (or subsystems), and CDSSRDEF = 1 on the OLTP-supporting DB2 members. If you have a single DB2 subsystem on which you run OLTP and BI work, what should you do if SET CURRENT DEGREE = 'ANY' is not an option for the BI queries? I'd lean towards setting CDSSRDEF to ANY, and then limiting the degree of parallelization for queries through the PARAMDEG specification in ZPARM. The default value of this parameter is 0, and that means that DB2 will determine the degree of parallelization for a query that it decides to split. 
I like that default because I feel that DB2 does this well and z/OS is very good at managing a complex and dynamic workload (as when it throttles down the processing resources allocated to a parallelized query in order to accommodate new work entering the system). If,  however,   an OLTP workload were  running on the same DB2 subsystem as the  BI workload, I'd want to put a relatively low upper  bound on the degree of parallelization for dynamic queries, the better to deliver consistent response times for the OLTP transactions. I might go for something as low as 3 or 4 for PARAMDEG, so that I'd  get some significant (if not huge) run-time reduction for some of the BI queries while limiting variations with respect to OLTP transaction execution times.&lt;br /&gt;&lt;br /&gt;In addition to placing an upper bound on query parallelization when running OLTP and BI work on the same DB2 subsystem, you might want to think about limiting query parallelization to only a portion of the dynamic queries that run on your system. The DB2 for z/OS resource limit facility (RLF) provides a way to do this. What you can do is create a resource limit specification table (RLST) in which you put  one or more rows with a value of '4' in the RLFFUNC column (this disables query parallelism) and the names of packages for which you DO NOT want associated dynamic queries to be parallelized in the RLFPKG column (in the LUNAME column of this RLST, you can have a blank value for the local location, or PUBLIC for TCP/IP-connected remote requesters). In addition (or instead), you could disable query parallelism by authorization ID using the AUTHID column of your RLST. 
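&lt;br /&gt;&lt;br /&gt;A row of the kind just described might be inserted as sketched below. The RLST qualifier and suffix (SYSADM.DSNRLST01) and the package name are hypothetical, and the blank AUTHID and RLFCOLLN values are assumed here to mean "all authorization IDs" and "all collections":&lt;br /&gt;&lt;pre&gt;
-- Hypothetical RLST name and package name.
-- RLFFUNC '4' disables query parallelism for dynamic queries
-- issued through the named package.
INSERT INTO SYSADM.DSNRLST01
  (RLFFUNC, AUTHID, LUNAME, RLFCOLLN, RLFPKG)
  VALUES ('4', ' ', 'PUBLIC', ' ', 'OLTPPKG1');
&lt;/pre&gt;A row like this has no effect until the resource limit facility has been started with that table.&lt;br /&gt;&lt;br /&gt;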
Then you can set CDSSRDEF to ANY and know that dynamic queries associated with packages and/or auth IDs specified in your RLST rows will not be candidates for parallelization.&lt;br /&gt;&lt;br /&gt;In summary: take advantage of DB2 query parallelization for your BI queries, but use it conservatively when you have OLTP transactions running on the same subsystem.&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt; &lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-2490586422395832723?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/2490586422395832723/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/09/oltp-and-bi-on-same-db2-for-zos-system.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2490586422395832723'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2490586422395832723'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/09/oltp-and-bi-on-same-db2-for-zos-system.html' title='OLTP and BI on the Same DB2 for z/OS System (Part 2)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-6676360253329026637</id><published>2009-08-26T19:14:00.000-07:00</published><updated>2009-08-26T19:18:48.730-07:00</updated><title type='text'>OLTP and BI on the Same DB2 for z/OS System (Part 1)</title><content type='html'>&lt;span style="font-family: arial;"&gt;I recently had the opportunity to work on an interesting challenge, involving the use of one DB2 system for both an online transaction processing (OLTP) and a business intelligence (BI) workload. Now, the system in question happened to be a DB2 data sharing group running on a parallel sysplex, and that opens up some interesting possibilities, but before getting to that I'd like to deal with the overall issue: can you successfully support OLTP and BI applications using the same mainframe DB2 system?&lt;br /&gt;&lt;br /&gt;Some people would be pretty quick to answer that question with a negative, and the chief concern of those folks might be the unfavorable impact of BI queries on the performance of online transactions. Is that a valid concern? It certainly can be, depending on the extent to which one can or cannot eliminate areas of contention between the two workloads. Put another way, if OLTP- and BI-related SQL statements can execute without interfering with one another, use of one DB2 system for both workloads is a viable option. In this post, I'm going to briefly describe some sources of single-system OLTP and BI workload contention, and ways in which such contention can be either eliminated or at least largely mitigated. In a Part 2 entry that will follow in a few days, I'll zero in on the challenge of managing query parallelization in a single OLTP- and BI-supporting DB2 environment. 
On now to potential sources of OLTP-BI interference:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The disk subsystem.&lt;/span&gt; This, of course, has very much to do with whether or not OLTP- and BI-related SQL statements are targeting the same DB2 tables. You might be thinking, "Well of course they wouldn't - you'd have one set of tables for OLTP programs and another set for the query and reporting applications." I'll agree that this is the PREFERRED scenario versus having OLTP and BI users access the same tables. I'd certainly want to place the OLTP database objects and the BI objects on two different sets of disk volumes. That said, some mainframe DB2 shops do, in fact, use the same tables for OLTP and for query/reporting. It's not that they're crazy. They may have had to give query and reporting capabilities to a group of users in a quick and inexpensive way, and allowing online access to the OLTP tables might have been seen as the best near-term alternative. Working in these folks' favor is the fact that disk-level contention, while definitely not a non-issue, is not as severe a performance concern as it was back in the 1990s, before people had gigabytes of cache memory on their disk control units and before DB2 and System z had 64-bit addressing (buffer pools can be MUCH larger in DB2 V8 and V9 subsystems versus prior releases). More DB2 buffer pool and storage controller cache hits mean fewer accesses to spinning disk, and that means less contention. Still, if you use the same tables for OLTP and BI, keep an eye (via your DB2 monitor) on wait time per DB2 synchronous read I/O. I like to see a figure that's below 10 milliseconds.&lt;br /&gt;&lt;br /&gt;Note that reduced disk contention is only one of the benefits associated with the use of separate tables for OLTP and BI applications. 
Physical table separation also provides an opportunity to make the BI tables more appropriate for a data warehouse environment, perhaps through database design changes (e.g., normalization/denormalization, or maybe a change to a dimensional design featuring so-called star schema table arrangements) and/or data transformations that might include replacement of codes with more user-meaningful data values.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;The DB2 buffer pools.&lt;/span&gt; As with disk-level contention, reducing contention for buffer pool resources depends largely on the use of separate database objects for OLTP and BI purposes. Different tablespaces and indexes can be assigned to different buffer pools. If OLTP and BI applications have to use the same database objects (and I would hope that this would be a temporary set-up on the way towards physical separation), you will be primarily focused on keeping BI-related prefetch activity from pushing table and index pages accessed by OLTP programs out of the shared buffer pools. To this end, you might want to adjust VPSEQT (the threshold for the percentage of buffer pool space that can hold pages brought in via prefetch) down somewhat from the default of 80 for shared buffer pools. If you use query parallelism, you might also want to alter VPPSEQT (similar in concept to VPSEQT, but for pages brought in via queries that are split and executed in a parallel fashion) down somewhat from the default of 50 for shared pools.&lt;br /&gt;&lt;br /&gt;In protecting OLTP transaction performance in a shared buffer pool environment, look to keep the rate of disk read I/Os at a reasonable level. I like it when the combined number of synchronous and asynchronous - i.e., prefetch - read I/Os is less than 100 per second. 
If the read I/O rate is high, you can generally reduce it by enlarging the size of the buffer pool in question (but don't over-burden server memory - the rate of demand paging from auxiliary storage, available via your z/OS monitor, should ideally be in the single digits or low double digits per second).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Locking.&lt;/span&gt; Once again, the best result is achieved through the use of separate tables for OLTP and BI. If dual-purpose tables are a necessity for a while, consider having the BI queries run with the UR (uncommitted read) isolation level (this can be accomplished by binding packages through which BI queries are executed with ISOLATION(UR), or by specifying WITH UR at the end of SELECT statements). UR results in essentially no locking. That, of course, means that queries could read uncommitted data changes that are subsequently rolled back because of an error situation encountered by a data-changing program, but this may not be a concern for BI users. BI queries often involve large-scale data aggregations, so a result obtained by a UR query is likely to be very close (if not identical) to one obtained with the default isolation level of cursor stability in effect, and very close is typically good enough for decision support applications.&lt;br /&gt;&lt;br /&gt;If you can use separate tables for BI applications, put these (and their associated indexes) in databases other than those used for the OLTP tables (and here I'm using the term "database" in the very specific DB2 for z/OS sense). Databases have associated with them control blocks called database descriptors, or DBDs, and the S-lock taken on a DBD by a dynamic SQL statement can conflict with data definition language statements (e.g., CREATE, ALTER, and DROP) targeting objects in the same database. 
If you have to use one set of tables for OLTP and BI, you can reduce DBD locks associated with dynamic SQL statements (and BI-related queries tend to be dynamic) by taking advantage of DB2's dynamic statement caching capability (dynamic statement caching can also provide you with a wealth of information about your dynamic SQL workload, this by way of the SQL statement EXPLAIN STMTCACHE ALL).&lt;br /&gt;&lt;br /&gt;Long-running dynamic queries can also interfere with the drain locking mechanism used by DB2 for z/OS utilities such as online REORG. A running SELECT statement will cause a read claim to be acquired and held by the issuing process, and that claim will be held until the process commits. In a BI environment, commits are typically taken - often automatically by a query tool - for each statement after it completes, but the read claim can be held for a long time in the case of a long-running query, and some BI queries can run for hours. If the same tables have to be used for OLTP and BI purposes, you might need to shut off query submission at some time (e.g., 7:00 PM), and then allow two or three hours for any long-running queries to complete. After that time you may need to cancel threads associated with still-executing BI queries so that utilities targeting dual-purpose tablespaces and indexes will be able to run unimpeded until BI activity starts up again at the beginning of the business day.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;CPU.&lt;/span&gt; Ideally, there will be enough CPU capacity on the DB2 system to satisfy the requirements of both the OLTP and BI workloads. 
If there is not enough CPU to go around, you can try to free some up through tuning the performance of OLTP- and/or BI-related SQL statements; otherwise, you may need to adjust z/OS Workload Manager (WLM) settings to favor one workload over the other, or perhaps to give priority to SQL statements that run quickly (whether OLTP- or BI-related) versus those that take longer to complete.&lt;br /&gt;&lt;br /&gt;Organizations running DB2 in data sharing mode on a Parallel Sysplex have an interesting option that's not available in a standalone DB2 environment: they can use one or more members of the data sharing group exclusively for BI work, while other members process the data requests of OLTP programs. This arrangement can virtually eliminate OLTP-BI workload contention with respect to server memory and server processing capacity. Combine that with the use of separate tables for OLTP and BI, and voila: you have the two different workloads executing on the same DB2 system (the data sharing group) in a most non-contentious way.&lt;br /&gt;&lt;br /&gt;Whether you're using DB2 in standalone or in data sharing mode, an important aspect of managing CPU contention between OLTP and BI applications running on the same system is the management of DB2 query parallelization, particularly as it pertains to the BI queries. I'll cover that topic in the part 2 companion to this blog entry, which I'll post within the next few days. 
As we say here in Georgia, y'all come back.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-6676360253329026637?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/6676360253329026637/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/08/oltp-and-bi-on-same-db2-for-zos-system.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6676360253329026637'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6676360253329026637'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/08/oltp-and-bi-on-same-db2-for-zos-system.html' title='OLTP and BI on the Same DB2 for z/OS System (Part 1)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-3286706524754617499</id><published>2009-08-14T14:37:00.000-07:00</published><updated>2009-08-14T19:52:37.411-07:00</updated><title type='text'>DB2 for z/OS Stored Procedures and Large Parameters</title><content type='html'>&lt;span style="font-family:arial;"&gt;I recently fielded a question, from a DB2 person who works for an insurance company, about the performance implications of defining large parameters for DB2 for z/OS stored procedures. 
At this company, DB2 stored procedures written in COBOL had been created with PARAMETER STYLE GENERAL. Now, for the first time, a stored procedure with really large parameters (VARCHAR(32700), to be precise) was to be developed, and the DB2 person was concerned that the entire 32,700 bytes of such a parameter would be passed between the calling program and the stored procedure, even when the actual value of the parameter was an empty string (i.e., a string of length zero). He was thinking that the only way to avoid passing 32,700 bytes to a called stored procedure in cases when no value was to be assigned to a parameter would be to create the stored procedure with PARAMETER STYLE GENERAL WITH NULLS, and have the calling program set the value of the parameter to NULL by way of a null indicator.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;I'll tell you that I had been under this same impression for some time, based on my reading of material in various DB2 manuals; however, it turns out that passing an unnecessarily large number of bytes between a calling program and a stored procedure is &lt;span style="font-weight: bold;"&gt;not&lt;/span&gt; a concern when the stored procedure is defined with PARAMETER STYLE GENERAL. I like the way that Tom Miller, a stored procedures expert with IBM's DB2 for z/OS development organization, explains this. He pictures a VARCHAR data type as being a box, in which you have two bytes for a length indicator and then N bytes of what you could call the defined data area. When a program calls a stored procedure that has a VARCHAR-defined parameter, only "active" data in the parameter is passed to DB2. The "inactive" portion of the VARCHAR "box" - that being the part of the defined data area past the current data length (as specified in the 2-byte length portion of the "box") - is undefined and therefore ignored. 
So, if a program sets the length of a VARCHAR-defined parameter to zero and then calls the associated stored procedure, only an empty string will be passed to DB2 (in other words, the entire VARCHAR "box" will NOT be passed).&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;This being the case, the choice of PARAMETER STYLE GENERAL or PARAMETER STYLE GENERAL WITH NULLS comes down to addressing the needs of the application developer. PARAMETER STYLE GENERAL WITH NULLS can result in slightly higher overhead versus PARAMETER STYLE GENERAL (it costs a bit more to pass a null value than it does to pass a zero-length VARCHAR), but PARAMETER STYLE GENERAL WITH NULLS can accommodate the full range of possibilities with respect to the values that can be passed with parameters, including the null value. In this light, PARAMETER STYLE GENERAL can be seen as an application simplification choice for programs that don't need to deal with null values.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;I have to tip my hat to the DB2 person who sent this question to me, because before I was able to get back to him with a response, he had already executed a test that demonstrated the actual behavior, in terms of bytes passed, for a COBOL stored procedure created with a VARCHAR parameter (specified as INOUT, meaning that it would be used as both an input and an output parameter) and PARAMETER STYLE GENERAL (the calling program, in the case of this test, was another COBOL program, and it invoked the stored procedure by way of an SQL CALL - NOT a COBOL subroutine CALL). The calling program set the VARCHAR parameter to a long string of G's, then set the length of the VARCHAR to zero and called the stored procedure. The stored procedure displayed the length of the parameter and its contents, then set the value of the parameter to a long string of asterisks, set the length of the parameter to 5, and returned to the calling program. 
The calling program then displayed the length and contents of the parameter. Sure enough, the stored procedure displayed a parameter length of 0 (as set by the calling program) and no content, and the calling program (following return from the stored procedure) displayed a parameter length of 5 (as set by the stored procedure) and content consisting of 5 asterisks followed by a lot of G's (the 5 asterisks returned by the stored procedure overlaid the first 5 of the G's that had initially been moved to the VARCHAR "box" in the calling program).&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;[Note that there is a caveat with respect to parameters in a REXX program. These never arrive with any undefined portion (with any "air", as the previously cited Tom Miller of IBM puts it), as there is no concept of that in REXX variables.]&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;I hope that this information will be helpful to your own DB2 stored procedure development efforts. 
As I've mentioned in other posts to this blog, I'm very big on DB2 stored procedures, and I want people to use them effectively and successfully.&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-3286706524754617499?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/3286706524754617499/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/08/db2-for-zos-stored-procedures-and-large.html#comment-form' title='12 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3286706524754617499'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3286706524754617499'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/08/db2-for-zos-stored-procedures-and-large.html' title='DB2 for z/OS Stored Procedures and Large Parameters'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>12</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-6776806029135867421</id><published>2009-07-30T08:05:00.000-07:00</published><updated>2011-09-20T08:31:00.773-07:00</updated><title type='text'>SQL Programming Nugget: User-Defined Functions and Data Types</title><content type='html'>&lt;span style="font-family:arial;"&gt;Earlier this week, I posted &lt;a 
href="http://catterallconsulting.blogspot.com/2009/07/sql-programming-nugget-sqlstate-and.html"&gt;an entry to this blog&lt;/a&gt; that described a "lesson learned" experience in SQL programming (having to do with the use of a WHILE loop in a SQL stored procedure to process rows in a result set generated and returned by a nested stored procedure).  Today, I'll relate another such experience to you. This time, it has to do with user-defined functions (UDFs) and associated data types. As with my previous "SQL programming nugget" post (see the link above), this one describes a situation that I encountered in a DB2 for Linux/UNIX/Windows (LUW) environment, but the same would have occurred in a DB2 for z/OS system.&lt;br /&gt;&lt;br /&gt;OK, here's the deal: I was working on the development of a UDF that would take as an input argument a fixed-length character string of 3 bytes.  Accordingly, the first part of the CREATE FUNCTION statement looked like this:&lt;br /&gt;&lt;br /&gt;CREATE FUNCTION MYFUNC(ARG CHAR(3))&lt;br /&gt;&lt;br /&gt;I filled in the rest (RETURNS information, variable declarations, function logic, etc.), submitted the statement, and got a message indicating successful creation of the UDF. Then, to test the UDF, I invoked it via the Command Editor, passing a 3-byte character string constant as the input argument, as follows:&lt;br /&gt;&lt;br /&gt;SELECT MYFUNC('A01') ...&lt;br /&gt;FROM ...&lt;br /&gt;&lt;br /&gt;Execution of this SELECT statement resulted in an error message indicating that MYFUNC could not be found (SQLSTATE 42884). Huh? Not found? What do you mean, "not found?" I just created it! I checked the SYSCAT.ROUTINES catalog view, and there was my UDF. If I could find it, why couldn't DB2?&lt;br /&gt;&lt;br /&gt;After I'd racked my brain for a while, it finally occurred to me that I might have introduced a data type mismatch when I passed to my UDF a character string constant. 
See, to be found by DB2 when invoked, a UDF has to be named properly in the SELECT statement (I did that), and it has to be passed the right number of input arguments (I did that), &lt;span style="font-style: italic;"&gt;and the data type of the input argument(s) has to match, or be promotable to, the data type of the corresponding function parameter(s)&lt;/span&gt; (oops). I was thinking "matched data type" when I passed the character string constant 'A01', because that's a 3-byte character string and my UDF was defined as taking a CHAR(3) value as input. What I failed to consider is this: a character string constant is treated by DB2 as a varying-length value (i.e., a VARCHAR data type). That did not match the CHAR data type for the input parameter specified in the definition of my UDF. Furthermore, VARCHAR is not promotable to CHAR (to see what data types are promotable to others, refer to "Promotion of Data Types" in the DB2 for LUW SQL Reference, Volume 1, or &lt;/span&gt;&lt;span style="font-family:arial;"&gt;the DB2 for z/OS SQL Reference&lt;/span&gt;&lt;span style="font-family:arial;"&gt;).&lt;br /&gt;&lt;br /&gt;To verify my understanding of what had gone wrong, I tried invoking my UDF in this manner:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;SELECT MYFUNC(CHAR('A01')) ...&lt;br /&gt;FROM ...&lt;br /&gt;&lt;br /&gt;Sure enough, this worked, because the CHAR function &lt;span style="font-style: italic;"&gt;returns a fixed-length representation&lt;/span&gt; of a character string. I also successfully invoked my UDF by passing to it a value from a table column that was created with a CHAR(3) data type specification (in other words, supposing that COL1 of table ABC was defined as CHAR(3), this invocation of my UDF worked: SELECT MYFUNC(COL1) FROM ABC). Finally, I created a UDF that was exactly like MYFUNC, except that the input parameter was defined as VARCHAR(3) instead of CHAR(3). 
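&lt;br /&gt;&lt;br /&gt;For illustration only, here is roughly what that VARCHAR variant and its invocation might look like. The function name MYFUNC_V and the trivial function body are hypothetical stand-ins (the real function logic doesn't matter for this point), and SYSIBM.SYSDUMMY1 is used just to have something to select from:&lt;br /&gt;&lt;br /&gt;CREATE FUNCTION MYFUNC_V(ARG VARCHAR(3))&lt;br /&gt;RETURNS VARCHAR(3)&lt;br /&gt;LANGUAGE SQL&lt;br /&gt;DETERMINISTIC&lt;br /&gt;NO EXTERNAL ACTION&lt;br /&gt;RETURN UPPER(ARG)&lt;br /&gt;&lt;br /&gt;SELECT MYFUNC_V('A01') --the constant is VARCHAR, matching the parameter&lt;br /&gt;FROM SYSIBM.SYSDUMMY1&lt;br /&gt;&lt;br /&gt;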
When this modified UDF was invoked with a 3-character string constant passed as the input argument, it was found by DB2 and it worked just fine.&lt;br /&gt;&lt;br /&gt;So, when you are creating a UDF, make sure that the data types of any input arguments will match - or can be promoted to - the data type as specified in the corresponding input parameter in the CREATE FUNCTION statement. &lt;span style="font-weight: bold;"&gt;Important exception to this rule: &lt;/span&gt;&lt;span style="font-style: italic;"&gt;if you are using DB2 9.7 for LUW&lt;/span&gt; (the current release, which became generally available just last month), be aware of an enhancement with respect to function resolution (i.e., the process of finding the right function when a UDF is invoked). In a DB2 9.7 environment, if the regular function resolution process (the one I've described) fails to find an "exact match" for an invoked UDF, and if there is a function with the right name and the right number of parameters but with an inexact data type match in terms of input arguments and corresponding input parameters (as specified in the CREATE FUNCTION statement), &lt;span style="font-style: italic;"&gt;the arguments will be converted to the data types of the parameters using the rules that apply to assignment of data values to columns.&lt;/span&gt; From a value-assignment perspective, VARCHAR is compatible with CHAR (refer to "Assignments and Comparisons" in the DB2 for LUW SQL Reference, Volume 1, or "Assignment and Comparison" in the DB2 for z/OS SQL Reference). That being the case, the "not found" situation that I encountered in a DB2 9.1 for LUW system when I passed a VARCHAR argument to a UDF defined with a CHAR input parameter (and the same thing would have happened in a DB2 9.5 system or a DB2 9 for z/OS environment) would NOT have occurred in a DB2 9.7 system. DB2 9.7 would have found and utilized my UDF.&lt;br /&gt;&lt;br /&gt;Isn't that nice? One less thing to worry about. 
Keep those DB2 enhancements coming, IBM.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-6776806029135867421?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/6776806029135867421/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/07/sql-programming-nugget-user-defined.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6776806029135867421'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6776806029135867421'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/07/sql-programming-nugget-user-defined.html' title='SQL Programming Nugget: User-Defined Functions and Data Types'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-3736757412437644688</id><published>2009-07-27T07:26:00.000-07:00</published><updated>2011-09-20T08:29:16.663-07:00</updated><title type='text'>A SQL Programming Nugget: SQLSTATE and WHILE loops</title><content type='html'>&lt;span style="font-family:arial;"&gt;Last week I taught an introduction to DB2 application development course at a large financial institution. That gave me the opportunity to share with the students some of my own experiences in the area of SQL programming. 
I'll share one of these "school of hard knocks" experiences with you via this blog entry, and I'll describe another in a subsequent post within the next few days. The situations I'll cover in these two entries (this one and the one to follow) occurred in a DB2 for Linux/UNIX/Windows (LUW) environment, but I believe that the lessons learned apply equally to DB2 for z/OS systems.&lt;br /&gt;&lt;br /&gt;On, now, to SQL programming experience number one.&lt;br /&gt;&lt;br /&gt;I was working on the development of a SQL stored procedure that would process a query result set generated by another stored procedure (a SQL stored procedure, by the way, is a stored procedure written using SQL Procedural Language, aka SQL PL - I've blogged a number of times on this topic, including &lt;a href="http://catterallconsulting.blogspot.com/2008/02/sql-programming-language.html"&gt;an overview treatment&lt;/a&gt; posted last year). In other words, I was writing stored procedure A, which was to call stored procedure B. Stored procedure B, defined with DYNAMIC RESULT SETS 1 (indicating generation of a result set that would be consumed by a calling program), declared a cursor WITH RETURN TO CALLER (also necessary to make the query result set available to the calling program), opened that cursor, and returned control to stored procedure A. In stored procedure A, I declared a result set locator variable, used an ASSOCIATE RESULT SET LOCATOR statement to get the location in memory of the result set generated by stored procedure B and assign it to the aforementioned result set locator variable, and allocated a cursor to enable the fetching of rows from the result set. So far, so good.&lt;br /&gt;&lt;br /&gt;I decided to use a WHILE loop in stored procedure A to retrieve and act on each row in the result set generated by stored procedure B. 
Wanting to exit upon reaching the end of the result set (i.e., after FETCHing the last row), I started the loop with WHILE SQLSTATE = '00000'  DO (having declared SQLSTATE - a return code value set by DB2 following execution of an SQL statement - in my SQL procedure and initialized it to '00000'), thinking that SQLSTATE would go to '02000' upon execution of the first FETCH statement following retrieval of the last row in the result set. Indeed, SQLSTATE is set to '02000' in that situation, but I messed up. How? By not making FETCH the last statement in the WHILE LOOP. See, I followed the FETCH in my loop with a SET statement that concatenated the value of a variable with a character string. That statement was always successful (indicated by SQLSTATE = '00000'), so, SQLSTATE-wise, here's what happened when the WHILE LOOP was executed following the retrieval of the last row in the result set:&lt;br /&gt;&lt;br /&gt;FETCH &lt;span style="font-style: italic;"&gt;cursor-name&lt;/span&gt; INTO &lt;span style="font-style: italic;"&gt;variable1, variable2&lt;/span&gt;, ...; &lt;span style="color: rgb(0, 153, 0);"&gt;--SQLSTATE goes to '02000'&lt;/span&gt;&lt;br /&gt;SET &lt;span style="font-style: italic;"&gt;variable_xyz&lt;/span&gt; = &lt;span style="font-style: italic;"&gt;variable_xyz&lt;/span&gt; CONCAT &lt;span style="font-style: italic;"&gt;variable1&lt;/span&gt;; &lt;span style="color: rgb(0, 153, 0);"&gt;--SQLSTATE goes back to '00000'&lt;br /&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;END WHILE;&lt;br /&gt;&lt;br /&gt;Thus, SQLSTATE was always '00000' when the end of my WHILE loop was reached, and the loop-evaluation condition of SQLSTATE = '00000' (the condition for initial entry into the loop and for loop reiteration) was always evaluated as "true". The result was an endless loop. Fortunately, I ran this SQL procedure on an instance of DB2 on my laptop, so I was the only person impacted. 
I got rid of the looping procedure, corrected the logic error, and went on from there, having learned a good lesson.&lt;br /&gt;&lt;br /&gt;So, if you write a SQL stored procedure that processes a result set generated by another stored procedure, here are a couple of things to consider:&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="color: rgb(0, 153, 0);"&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;A WHILE loop is fine for FETCHing through that result set, and you can even make SQLSTATE = '00000' the loop-evaluation condition - just &lt;span style="font-weight: bold;"&gt;MAKE SURE&lt;/span&gt; that FETCH is the last statement in the WHILE LOOP (before END WHILE, that is), if your intention in coding WHILE SQLSTATE = '00000' DO was to have DB2 exit the loop upon execution of the first FETCH following retrieval of the last row in the result set.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="color: rgb(0, 153, 0);"&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;Alternatively, consider &lt;span style="font-style: italic;"&gt;indirectly&lt;/span&gt; using the SQLSTATE value to establish an exit-the-loop condition. 
This could be done by coding WHILE exit_ind = 0 DO (with exit_ind being a variable declared in your SQL procedure and initially set to 0), and then following the FETCH in the loop with IF SQLSTATE = '02000' THEN SET exit_ind = 1 (or IF SQLSTATE &amp;lt;&amp;gt; '00000' if you want to be more generic).&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="color: rgb(0, 153, 0);"&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;If, in your SQL stored procedure, you will be processing a result set generated by a SELECT statement in your procedure (as opposed to a result set generated by a cursor declared and opened in a stored procedure called by your stored procedure), consider using the FOR statement instead of a WHILE loop. A nice thing about FOR is that you will automatically exit the FOR loop after you've retrieved all the rows in the result set. Note that you could use FOR to process a result set generated by a stored procedure called by your stored procedure (let's call the lower-level program stored procedure B), if stored procedure B inserts the result set rows into a temporary table that is subsequently referenced in the SELECT statement in the FOR loop coded in your stored procedure.&lt;br /&gt;&lt;br /&gt;I hope that your organization is using SQL stored procedures to access DB2 data -  I'm a big fan of this approach. 
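&lt;br /&gt;&lt;br /&gt;To make the second suggestion in the list above concrete, here is a minimal sketch of the indirect approach. The cursor name c1 and the variables v_col1, v_col2, and v_all are hypothetical, and the declarations (including SQLSTATE and exit_ind) and the result set locator set-up are omitted:&lt;br /&gt;&lt;br /&gt;WHILE exit_ind = 0 DO&lt;br /&gt;FETCH c1 INTO v_col1, v_col2;&lt;br /&gt;IF SQLSTATE = '02000' THEN --no more rows, so set the exit flag&lt;br /&gt;SET exit_ind = 1;&lt;br /&gt;ELSE&lt;br /&gt;SET v_all = v_all CONCAT v_col1; --SQLSTATE goes back to '00000', but the loop no longer tests it&lt;br /&gt;END IF;&lt;br /&gt;END WHILE;&lt;br /&gt;&lt;br /&gt;Because the loop condition looks only at exit_ind, statements that follow the FETCH can reset SQLSTATE without affecting loop termination.&lt;br /&gt;&lt;br /&gt;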
Check back in a few days for another SQL programming nugget - this one involving user-defined functions.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-3736757412437644688?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/3736757412437644688/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/07/sql-programming-nugget-sqlstate-and.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3736757412437644688'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3736757412437644688'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/07/sql-programming-nugget-sqlstate-and.html' title='A SQL Programming Nugget: SQLSTATE and WHILE loops'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-26999311048776005</id><published>2009-07-10T06:31:00.000-07:00</published><updated>2011-09-20T08:25:49.067-07:00</updated><title type='text'>Migrating DB2 9 for z/OS Native SQL Procedures</title><content type='html'>&lt;span style="font-family:arial;"&gt;Last week, a friend who works for a big DB2-using company in the American Midwest sent me a question concerning the migration of native SQL procedures 
from DB2 subsystem A to subsystem B (e.g., from a development to a test subsystem, or from test to production). This person's organization had been using DB2 for z/OS stored procedures for some time, and they had an automated process - utilizing IBM's Software Configuration and Library Manager product (SCLM) - that took care of moving a mainframe application program and associated items (load module, source code, control cards, etc.) from one DB2 environment to another. The company is in the process of going to DB2 for z/OS Version 9 from Version 8, and my friend and his colleagues are very much interested in taking advantage of the native SQL procedure functionality that's available in DB2 9 New Function Mode (I've done a good bit of blogging about DB2 9 native SQL procedures, &lt;a href="http://catterallconsulting.blogspot.com/2008/11/db2-9-for-zos-stored-procedure-game.html"&gt;starting with an overview-type entry&lt;/a&gt; posted last fall). There is plenty to get excited about here, including improved CPU efficiency versus external SQL procedures and excellent utilization of cost-effective zIIP engines on System z servers (when invoked through remote CALLs coming through the DB2 Distributed Data Facility), but people focused on the infrastructure of mainframe application systems may well wonder (as did my friend): given that a DB2 9 native SQL procedure is not dependent on external-to-DB2 items such as load and source modules (the executable is a DB2 package, and the source CREATE PROCEDURE statement is stored in the SYSROUTINES table in the DB2 catalog), how does one manage the migration of these stored procedures from (for example) test to production?&lt;br /&gt;&lt;br /&gt;It turns out that new DB2 9 functionality, which gave rise to this question, also provides the solution in the form of two enhancements:&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span 
style="font-weight: bold;"&gt;The new DEPLOY option of the BIND PACKAGE command.&lt;/span&gt; This new option lets you migrate a native SQL stored procedure (and a specific version of same, at that) from one DB2 for z/OS subsystem to another. DEPLOY essentially extends the functionality of a remote BIND command (that is, BIND to a remote subsystem), enabling you to add or replace a version of a native SQL procedure on the target remote subsystem from the current-location subsystem.  DEPLOY does NOT change the logic portion of the native SQL procedure, which is stored in a special section of the package (the sections pertaining to SQL DML statements will of course be generated anew on the target remote system, so that you'll get appropriate access paths and such). Note that &lt;/span&gt;&lt;span style="font-family:arial;"&gt;when you migrate a native SQL procedure using BIND PACKAGE with DEPLOY, &lt;/span&gt;&lt;span style="font-family:arial;"&gt;you can change the qualifier used to resolve references to unqualified database objects named in SQL DML statements in the procedure.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;New options for the ALTER PROCEDURE statement.&lt;/span&gt; You can issue ALTER PROCEDURE with the ACTIVATE VERSION option to make a particular version of a native SQL procedure the currently active version on a subsystem, so in the event that several versions of the native SQL procedure exist on the subsystem, the one identified via ALTER PROCEDURE ACTIVATE VERSION will be the one executed when a CALL to that procedure is executed (this default active version designation can be overridden at a thread level via the new CURRENT ROUTINE VERSION special register).  
ALTER PROCEDURE can also be used to drop a version of a native SQL procedure.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-family:arial;"&gt;So, yes, when IBM's DB2 for z/OS development team was working on native SQL procedures for Version 9, they thought about mundane (though very important) matters such as inter-subsystem migration, as well as more-cool things like reduced execution instruction pathlength and nested compound SQL statements (the latter enabling, among other things, more sophisticated error-handling logic versus what you can achieve with an external SQL stored procedure). If you want to read more about native SQL procedure migration, check out the IBM "red book" titled &lt;a href="http://www.redbooks.ibm.com/abstracts/sg247604.html?Open"&gt;"DB2 9 for z/OS Stored Procedures: Through the CALL and Beyond,"&lt;/a&gt; the DB2 9 for z/OS &lt;a href="http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db29.doc.comref/db2z_comref.htm"&gt;&lt;span style="font-style: italic;"&gt;Command Reference&lt;/span&gt;&lt;/a&gt; (for BIND PACKAGE with DEPLOY), and the DB2 9 for z/OS &lt;a href="http://publib.boulder.ibm.com/infocenter/dzichelp/v2r2/index.jsp?topic=/com.ibm.db29.doc.sqlref/db2z_sqlref.htm"&gt;&lt;span style="font-style: italic;"&gt;SQL Reference&lt;/span&gt;&lt;/a&gt; (for ALTER PROCEDURE with ACTIVATE VERSION).&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-26999311048776005?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/26999311048776005/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/07/migrating-db2-9-for-zos-native-sql.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/26999311048776005'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/26999311048776005'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/07/migrating-db2-9-for-zos-native-sql.html' title='Migrating DB2 9 for z/OS Native SQL Procedures'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-3016692148288205905</id><published>2009-07-01T12:15:00.000-07:00</published><updated>2009-08-23T18:58:46.204-07:00</updated><title type='text'>Outer Join: Get the Predicates Right</title><content type='html'>&lt;span style="font-family:arial;"&gt;A few days ago, I was working with a team of people from a large company, trying to improve the performance of some queries executing in a DB2 for z/OS-based data warehouse environment. One query in particular was running much longer than desired, and consuming a lot of CPU time, to boot. One of the team members noticed that the problem query, which involved several table-join operations, had a rather odd characteristic: no WHERE-clause predicates. All the predicates were in the ON clauses of the joins. 
In fact, there was even an inner join of table TAB_A (I won't use the real table names) with SYSIBM.SYSDUMMY1 (which of course contains nothing of interest, just a single dummy row), with two ON predicates referencing columns in TAB_A, like this:&lt;br /&gt;&lt;br /&gt;SELECT...&lt;br /&gt;FROM TAB_A&lt;br /&gt;INNER JOIN SYSIBM.SYSDUMMY1&lt;br /&gt;ON TAB_A.COL1 = 12&lt;br /&gt;AND TAB_A.COL2 = 'X'&lt;br /&gt;...&lt;br /&gt;&lt;br /&gt;One of the application developers (the queries we were analyzing are report-generating SELECT statements built and issued by application programs) removed this inner join to SYSDUMMY1 and changed the two ON-clause predicates to WHERE-clause predicates, and the query's elapsed and CPU times went way down.&lt;br /&gt;&lt;br /&gt;We were left thinking that this Cartesian join (i.e., a join with no join columns specified) to SYSDUMMY1 might reflect someone's thinking that WHERE-clause predicates are not needed in table-join SELECT statements. In fact, the use of WHERE-clause predicates versus ON-clause predicates in table-join statements can have a very significant impact on query performance. We looked at another long-running query in the aforementioned data warehouse application, and this one also involved a join operation and also had no WHERE-clause predicates.  Importantly, the join was a left outer join, and the ON clause included multiple predicates that referenced columns of the left-side table &lt;/span&gt;&lt;span style="font-family:arial;"&gt;(the table from which we want rows for the result set, regardless of whether or not there are matching right-side table rows)&lt;/span&gt;&lt;span style="font-family:arial;"&gt;.  A DBA took one of these left-side-table-referencing ON-clause predicates and made it a WHERE predicate, too.  In other words, an ON-clause predicate like TAB_L.COL2 = 5 (with TAB_L being the left-side table in the left outer join operation) was added to the query in the form of a WHERE-clause predicate. The result? 
Response time for the query went from 10 minutes to less than 1 second.&lt;br /&gt;&lt;br /&gt;Why did the query's performance improve so dramatically, when all the DBA did was make an ON-clause predicate a WHERE-clause predicate? Simple: for a left outer join, a WHERE-clause predicate that references a column of the left-side table will filter rows from that table. That same predicate, if coded in the ON-clause of the SELECT statement, will NOT filter left-side table rows. Instead, that predicate will just affect the matching of left-side table rows with right-side table rows. To illustrate this point, consider the predicate mentioned in the preceding paragraph:&lt;br /&gt;&lt;br /&gt;TAB_L.COL2 = 5&lt;br /&gt;&lt;br /&gt;Suppose this predicate is included in the query in the following way:&lt;br /&gt;&lt;br /&gt;SELECT TAB_L.COL1, TAB_L.COL2, TAB_L.COL3, TAB_R.COL4&lt;br /&gt;FROM TAB_L&lt;br /&gt;LEFT OUTER JOIN TAB_R&lt;br /&gt;ON&lt;br /&gt;TAB_L.COL3 = TAB_R.COL3&lt;br /&gt;&lt;span style="color: rgb(51, 51, 255); font-weight: bold;"&gt;AND TAB_L.COL2 = 5&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;With no WHERE-clause predicates in this query, all rows from TAB_L will qualify -  none will be filtered out.  
What the TAB_L.COL2 = 5 predicate will do is affect row matching with TAB_R: if a row in TAB_R has a COL3 value that matches the value of COL3 in a TAB_L row, &lt;span style="font-style: italic;"&gt;and&lt;/span&gt; if the value of COL2 in that TAB_L row is 5, the TAB_R row (specifically, COL4 of that row, as specified in the query's SELECT-list) will be joined to the TAB_L row in the query result set; otherwise, DB2 will determine that the TAB_R row is not a match for that TAB_L row (and the TAB_L rows without TAB_R matches will appear in the result set with the null value in the TAB_R.COL4 column).&lt;br /&gt;&lt;br /&gt;Now, suppose that the same predicate is specified in a WHERE clause of the query, as follows:&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;SELECT TAB_L.COL1, TAB_L.COL2, TAB_L.COL3, TAB_R.COL4&lt;br /&gt;FROM TAB_L&lt;br /&gt;LEFT OUTER JOIN TAB_R&lt;br /&gt;ON&lt;br /&gt;TAB_L.COL3 = TAB_R.COL3&lt;br /&gt;&lt;span style="color: rgb(51, 51, 255); font-weight: bold;"&gt;WHERE TAB_L.COL2 = 5&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;In this case, the predicate will be applied to TAB_L &lt;span style="font-style: italic;"&gt;before&lt;/span&gt; the join operation, potentially filtering out a high percentage of TAB_L rows (as was the case for the query cited earlier that went from a 10-minute to a sub-second run time).&lt;br /&gt;&lt;br /&gt;So, a person codes a predicate that references a column of the left-side table of a left outer join operation, and places that predicate in an ON clause of the query, versus a WHERE clause &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;(and I'm not talking about a join predicate of the form TAB_L.COLn = TAB_R.COLn, which you expect to see in an ON clause). 
Is it possible that the query-writer &lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="color: rgb(51, 51, 255);"&gt;&lt;span style="color: rgb(0, 0, 0);"&gt;actually &lt;span style="font-style: italic;"&gt;wants&lt;/span&gt; the result described above, to wit: no filtering of left-side table rows, and a further condition as to what constitutes a right-side table match? Yes, that's possible, but there's a very good chance that this person mistakenly placed the predicate in an ON clause because he (or she) &lt;span style="font-style: italic;"&gt;thought&lt;/span&gt; that this would have the same effect as coding the predicate in a WHERE clause. Mistakes of this type are fairly common because misunderstanding with respect to the effect of predicates in outer join queries is quite widespread. Patrick Bossman, a good friend who is a query optimization expert with IBM's DB2 for z/OS development organization, pointed out as much to me in a recent e-mail exchange. Patrick also sent me the links to two outstanding articles written by Terry Purcell, a leader on the IBM DB2 for z/OS optimizer team. These articles (actually, a &lt;a href="http://www.ibm.com/developerworks/db2/library/techarticle/purcell/0112purcell.html"&gt;part-one&lt;/a&gt; and &lt;a href="http://www.ibm.com/developerworks/db2/library/techarticle/purcell/0201purcell.html"&gt;part-two&lt;/a&gt; description of outer join predicates and their effects on query result sets) were written a few years ago, while Terry was with DB2 consultancy Yevich, Lawson, and Associates, but the content is still very much valid today (Patrick considers the articles to be "a must-read for folks writing outer joins").  Check 'em out.&lt;br /&gt;&lt;br /&gt;Outer join is a powerful SQL capability that is widely used in DB2 environments. 
If you code outer join queries (or if you review such queries written by others), make sure that you use ON-clause and WHERE-clause predicates appropriately, so as to get the right result (job one) and the best performance (job two, right behind job one).&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-3016692148288205905?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/3016692148288205905/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/07/outer-join-get-predicates-right.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3016692148288205905'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/3016692148288205905'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/07/outer-join-get-predicates-right.html' title='Outer Join: Get the Predicates Right'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-6323118592438332004</id><published>2009-06-24T15:11:00.000-07:00</published><updated>2011-09-20T08:23:38.631-07:00</updated><title type='text'>DB2 9 Native SQL Procedures: One Address Space is OK</title><content type='html'>&lt;span style="font-family:arial;"&gt;I've been 
doing a lot of presenting lately on the topic of DB2 for z/OS stored procedures. In these presentations, I've emphasized the benefits of native SQL procedures, introduced for the mainframe platform via DB2 9 for z/OS (I blogged on the importance of this development &lt;a href="http://catterallconsulting.blogspot.com/2008/11/db2-9-for-zos-stored-procedure-game.html"&gt;in an entry I posted late last year&lt;/a&gt;). During two different sessions held recently in two different cities, two different people asked me the same question pertaining to native SQL procedures versus external stored procedures (the latter being what you might think of as "traditional" stored procedures in a DB2 for z/OS environment). In this entry, I'll share with you that question and my response.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;First, the common question:&lt;/span&gt; "When DB2 for z/OS Version 5 provided support for stored procedure address spaces managed by the Workload Manager (WLM) component of z/OS, we were told that an advantage of this enhancement was the ability to have multiple stored procedure address spaces, versus the one DB2-managed stored procedures address space (SPAS). With different stored procedures assigned to different WLM application environments and their associated address spaces, if a stored procedure program misbehaved in such a way as to bring down the address space in which it was running, the other stored procedure address spaces would not be impacted. Now, with DB2 9 native SQL procedures, we're back to one address space for stored procedure execution (native SQL procedures execute in the DB2 database services address space, also known as DBM1). 
Doesn't that mean that we now have the same risk we faced when using the old DB2-managed SPAS, namely, that one errant stored procedure could take down the one stored procedure address space (and this time, we're talking about losing DBM1)?"&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;My response:&lt;/span&gt; It's true that having multiple WLM-managed stored procedure address spaces can reduce the impact of an address space failure caused by an external stored procedure program, but that kind of failure has to do with stored procedure program code executing outside of DB2. Multiple DB2-accessing stored procedures running in multiple stored procedure address spaces are all executing code in DBM1 when they issue SQL statements (as is true of multiple DB2-accessing CICS transaction programs running in multiple CICS AORs), and that doesn't cause DBM1 to crash. Native SQL procedures running in DBM1 execute as packages. It's all DB2-generated and DB2-managed code. This means that the exposure mitigated by having multiple WLM-managed stored procedure address spaces - that user-written stored procedure program code running outside of DB2 could cause a problem that would lead to the failure of a WLM-managed address space - does not exist for native SQL procedures. 
To put it another way, having native SQL procedures executing in one address space - DBM1 - is no more risky than having multiple packages invoked by external callers all running in DBM1&lt;/span&gt;&lt;span style="font-family:arial;"&gt;, and that's been standard operating procedure for DB2 since day one (&lt;/span&gt;&lt;span style="font-family:arial;"&gt;execution of an embedded SQL statement involves, under the covers, a call to DB2 and a reference to a section of a package&lt;/span&gt;&lt;span style="font-family:arial;"&gt;).&lt;br /&gt;&lt;br /&gt;So, take advantage of the enhanced performance and simplified lifecycle management offered by DB2 9 native SQL procedures, and don't worry about not having multiple address spaces in which to run these stored procedures - you don't need them.  Native SQL procedures are made up of SQL statements, and SQL statements - as always - run in DBM1.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-6323118592438332004?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/6323118592438332004/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/06/db2-9-native-sql-procedures-one-address.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6323118592438332004'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/6323118592438332004'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/06/db2-9-native-sql-procedures-one-address.html' title='DB2 9 Native SQL Procedures: One Address Space is OK'/><author><name>Robert 
Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-8646698166159687610</id><published>2009-06-18T10:45:00.000-07:00</published><updated>2009-06-18T20:15:21.824-07:00</updated><title type='text'>DB2, Stored Procedures, COBOL, and Result Sets</title><content type='html'>&lt;span style="font-family:arial;"&gt;Last week, I helped a DBA at a large financial services firm with a couple of questions related to DB2 for z/OS stored procedures and result sets (referring to the row and column information accessed via a query included in a DECLARE CURSOR statement). Both of the DBA's questions had to do with COBOL programs called by DB2 stored procedures. A lot of mainframe sites have COBOL programs that are invoked by stored procedures, and plenty of people have some misconceptions with regard to the accessibility of result sets generated through cursors in stored-procedure-called programs, so in this post I'll share the information that I provided to the aforementioned DBA.&lt;br /&gt;&lt;br /&gt;When it comes to DB2 for z/OS stored procedures calling COBOL programs, the situation is most interesting when both the stored procedure program and the program called by the stored procedure are written in COBOL.  
In that case, you have a couple of viable options: the COBOL stored procedure program could invoke the target COBOL program by way of an SQL CALL statement (the target program would run as a nested stored procedure, assuming that it had been set up to execute that way through a CREATE PROCEDURE statement referencing the program name), or it could execute the target program through a COBOL CALL (so that the target would run as a COBOL subroutine).  Either way, both the COBOL stored procedure program and the target COBOL program would run in a WLM-managed stored procedure address space - the same address space if the target program executes as a COBOL subroutine, and the same or a different address space if the target is invoked via SQL CALL, depending on whether or not the same or a different WLM application environment was specified in the target's CREATE PROCEDURE statement (if you are using the DB2-managed stored procedure address space, get away from that and into WLM-managed address spaces soon - the DB2-managed space is not supported in a DB2 9 environment).&lt;br /&gt;&lt;br /&gt;One of the differences between the nested-procedure and COBOL subroutine scenarios has to do with task control blocks (TCBs).  If the target COBOL program is invoked via SQL CALL, it will run under its own TCB.  If the target is executed as a COBOL subroutine, it will run under the TCB of the stored procedure program that issued the COBOL CALL for the subroutine.&lt;br /&gt;&lt;br /&gt;Another difference - the one with which the question-asking DBA was concerned - has to do with access to a result set defined by a cursor declared in the target COBOL program. The DBA first brought up a situation in which a program running on an off-mainframe application server was seemingly able to fetch rows from a cursor declared in a "two levels down" stored procedure: the mid-tier program calls DB2 for z/OS COBOL stored procedure A, and stored procedure A calls COBOL stored procedure B. 
Stored procedure B issues a DECLARE CURSOR statement (on which the WITH RETURN option is specified) and opens this cursor. The mid-tier program subsequently fetches the result set rows associated with the cursor declared and opened in stored procedure B. That was working, but it shouldn't have been, because a DB2 for z/OS stored procedure generating a result set can return that result set only one level up within a series of nested calls. In other words, if stored procedure B declares and opens a cursor, stored procedure A (which called B via SQL CALL) can fetch rows from that cursor-defined result set by issuing an ASSOCIATE LOCATOR statement to get the locator value for the result set, and an ALLOCATE CURSOR statement to define a cursor and associate it with the result set locator value. If the program that called stored procedure A wants to retrieve the result set generated by the cursor declared in stored procedure B, it cannot use this ASSOCIATE LOCATOR/ALLOCATE CURSOR mechanism, because that mechanism only works one level up in a nested SQL CALL structure (DB2 for Linux, UNIX, and Windows allows result-set retrieval at both the one-level-up level and at the top level of a nested SQL CALL structure - "top" referring to the program that issued the initial CALL to a stored procedure).&lt;br /&gt;&lt;br /&gt;So, how was the mid-tier program mentioned by the DBA able to get the two-levels-down result set generated by stored procedure B? Upon further investigation, the DBA found that stored procedure B, in addition to declaring and opening a cursor defining a result set, inserted the result set rows into a global temporary table (these come in two flavors, declared temporary tables and created temporary tables, with the latter usually being the best choice in terms of performance). 
Stored procedure A then declared and opened a cursor (WITH RETURN) referencing this global temporary table, and the mid-tier program (caller of stored procedure A) could then access the result set because it (the mid-tier program) was only one level up from stored procedure A. That's in fact an excellent way to make a stored procedure-generated result set available to a program several levels up in the nested call structure: put the result set in a global temporary table.&lt;br /&gt;&lt;br /&gt;So, we had one mystery solved. The DBA then pointed to another situation that had him scratching his head: a program (again running on an off-mainframe middle tier) called COBOL stored procedure X, stored procedure X invoked COBOL subroutine Y via COBOL CALL, and the middle tier program was subsequently able to access a result set generated through a cursor declared (WITH RETURN) and opened by COBOL subroutine Y.&lt;br /&gt;&lt;br /&gt;This was actually a working-as-designed situation. The DBA was thinking that it shouldn't have worked, because he was under the impression that a result set generated by program Y could be returned if program Y were  invoked via SQL CALL, and could not be returned if program Y executed as a COBOL-called subroutine. You can in fact find passages in DB2 manuals and "red books" that appear to confirm this understanding of result set processing. It's not that the documentation is wrong - it's just that it can be easily misinterpreted if you consider it from a different perspective versus that of the documentation authors. Here's what I mean by that: when you read in a DB2 book that a COBOL-called subroutine cannot return a cursor-defined result set, what's being communicated is the fact that the subroutine can't return a result set to the program that invoked it via COBOL call. 
A subroutine called via COBOL call from a COBOL DB2 stored procedure program &lt;span style="font-style: italic;"&gt;can&lt;/span&gt; return a result set to the program that called the stored procedure. This is consistent with the result set processing mechanism I described above for nested stored procedures: a SQL-called stored procedure can pass a result set to a one-level-up program (i.e., to the program that called it). In the context of result set processing, a subroutine called via COBOL CALL from a stored procedure program runs at the same "level" as the calling stored procedure; therefore, a result set generated by that subroutine, while not accessible by the calling stored procedure, can be accessed by the caller of the stored procedure (i.e., by the "one level up" program). Just remember that the cursor declared in the subroutine has to include the WITH RETURN option, and the stored procedure invoking the subroutine has to be defined with DYNAMIC RESULT SETS 1 (or more than 1, if multiple result sets will be generated by the stored procedure program and/or by COBOL-called subroutines invoked by the stored procedure program).&lt;br /&gt;&lt;br /&gt;Is that clear? I hope so. 
I'm very big on DB2 stored procedures, and I want people to know how they can use them.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-8646698166159687610?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/8646698166159687610/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/06/db2-stored-procedures-cobol-and-result.html#comment-form' title='24 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/8646698166159687610'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/8646698166159687610'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/06/db2-stored-procedures-cobol-and-result.html' title='DB2, Stored Procedures, COBOL, and Result Sets'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>24</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-1393806336914889090</id><published>2009-06-08T06:20:00.000-07:00</published><updated>2011-09-20T08:21:08.414-07:00</updated><title type='text'>Thoughts on DB2 Triggers</title><content type='html'>&lt;span style="font-family:arial;"&gt;I was in the Upper Midwest of the USA for most of last week, presenting at three regional DB2 user group meetings -  in Minneapolis, Chicago, and Milwaukee -  on three consecutive days.  
One of the presentations I gave in each city covered DB2 for z/OS data warehouse performance.  In that presentation, I provided some guidelines on the average number of indexes defined per table in a data warehouse database (I wrote of this in &lt;a href="http://catterallconsulting.blogspot.com/2008/09/db2-for-zos-data-warehousing-query.html"&gt;an entry posted to this blog last year&lt;/a&gt;).  Following the meeting in Milwaukee, one of the attendees asked me if I had any recommendations pertaining to the number of triggers defined on a table.  I don't, because trigger usage scenarios and environments vary so widely, but the question sparked an interesting discussion about DB2 triggers that covered a variety of sub-topics.  By way of this entry, I'll commit these DB2 trigger thoughts of mine to paper (electronically speaking).&lt;br /&gt;&lt;br /&gt;[Super-brief level-set: by way of a trigger defined on a DB2 table, one can cause an SQL-expressed action to be taken automatically in response to an update, delete, or insert targeting the base table.  For example, one could use a trigger defined on table A to cause an insert into table A to drive an update of a column in table B.]&lt;br /&gt;&lt;br /&gt;First, &lt;span style="font-weight: bold;"&gt;concerning that question on the number of triggers defined on a table&lt;/span&gt;, the answer is very much of the "it depends" variety.  I recall a presentation, delivered at a DB2 user group meeting several years ago, in which a developer described a new application that his company had implemented entirely by way of triggers.  The number of triggers created for that application was fairly large, and I'm thinking that quite a few triggers were defined on certain individual tables.  The application was successfully put into production, and everything worked fine, so having a lot of triggers is not necessarily a bad thing.  
On the other hand, there are situations in which triggers can affect application performance in an undesirable way.  In that regard, the story has gotten better in recent years, certainly on the mainframe platform.  Triggers were introduced with DB2 Version 6 for z/OS (the functionality had previously been delivered for DB2 on Linux, UNIX, and Windows servers), and in that and the subsequent release the presence of a trigger defined with UPDATE OF COL5 on a table increased the CPU cost of any UPDATE statement targeting the table, even if the statement did not change data in column COL5.  That trigger cost was eliminated in DB2 for z/OS Version 8, so that the aforementioned trigger would affect the performance only of UPDATE statements that changed data in COL5.&lt;br /&gt;&lt;br /&gt;So, continuing with this example, &lt;span style="font-weight: bold;"&gt;how would the performance of a COL5-changing UPDATE statement be impacted by the UPDATE OF COL5 trigger?&lt;/span&gt;  That would depend, of course, on the nature of the triggered action (i.e., the SQL statement executed as a result of the trigger being "fired" by the UPDATE).  If the triggered action is an update of one row in one table, identified by a unique, indexed column referenced in a predicate, the impact of the trigger on the performance of COL5-changing UPDATE statements is likely to be minimal.  If, on the other hand, the triggered action were more involved (and keep in mind that it could be a call to a stored procedure), the effect of the trigger on COL5-changing UPDATE statements would be more noticeable.  The key here is to keep in mind that the action taken when a trigger is fired is &lt;span style="font-style: italic;"&gt;synchronous&lt;/span&gt; with respect to an SQL statement that causes the trigger to fire.  In other words, the trigger-firing SQL statement isn't finished until the triggered action is finished.  
This means that there are performance implications for "downstream" triggers that might be fired as a result of the initial trigger being fired (a trigger defined with UPDATE OF COL5 on table ABC could drive an update of COL7 on table XYZ, and that triggered action would fire a trigger if one were defined with UPDATE OF COL7 on table XYZ).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Does this synchronous business mean that triggers with more complex triggered actions are a performance no-no?  Not necessarily.&lt;/span&gt;  One way to have that cake and eat it, too, is to have the trigger place information of interest (e.g., certain column values) on a WebSphere MQ queue (a trigger can certainly do this -  the triggered action has to be an SQL statement, and DB2 provides built-in functions, such as MQSEND, that can be used to send data to a designated MQ location).  Once that's done, the statement that fired the trigger can complete execution.  Asynchronously, with respect to the trigger-firing statement, the data sent to the MQ queue by the trigger can be processed as needed, perhaps by a DB2 stored procedure invoked by the MQ listener (the MQ listener function can automatically take an action, such as calling a stored procedure, when a message lands on a queue).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;How do triggers stack up, in terms of CPU efficiency, with other means of getting database action X accomplished as a result of action Y being performed?&lt;/span&gt;  Suppose that you have a need to capture "before" and "after" values in certain columns of table ABC when those columns are updated by programs.  If program PROG1 updates the columns of interest in table ABC, you could request that the program be modified to insert into table XYZ "before" and "after" values following the table ABC updates.  
This approach might well be the most CPU-efficient way to address your requirement, but it could prove to be impractical for at least a couple of reasons.  For one thing, who would code the requested PROG1 functional enhancement?  Will that person -  likely engaged now in some other high-priority application development effort -  be available to change PROG1 to your liking within the next year?  Maybe not.  Then there's potential problem number two: what if the table ABC columns for which you want to capture changes are updated by multiple programs besides PROG1?  Are you going to try to get change-capture functionality added to all of those programs?  How long will that take?  You could opt to use a vendor tool to detect and capture changes made to the specified columns of table ABC, but if such a tool isn't currently part of your IT infrastructure, how long will it take to acquire it and how much will it cost?&lt;br /&gt;&lt;br /&gt;You could certainly determine that a trigger on table ABC defined with UPDATE OF [the columns of interest] would be the right way to go, offering &lt;span style="font-weight: bold;"&gt;a quickly implementable solution that would have a modest CPU cost and a very low dollar cost&lt;/span&gt; (or euro cost or whatever-currency cost).  And, consider this: if programs that update the table ABC columns in which you are interested are so response-time sensitive that even adding a fairly simple trigger to the mix raises performance concerns, having that trigger defined on a data warehouse table (or operational data store table) to which table ABC changes are propagated might do the trick for you.&lt;br /&gt;&lt;br /&gt;Flexibility, agility, and economy - that's what DB2 triggers offer.  
They should definitely be solution candidates when you have a need for timely implementation of incremental database application functionality.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-1393806336914889090?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/1393806336914889090/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/06/thoughts-on-db2-triggers.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1393806336914889090'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1393806336914889090'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/06/thoughts-on-db2-triggers.html' title='Thoughts on DB2 Triggers'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-1550060748939254841</id><published>2009-05-28T07:20:00.000-07:00</published><updated>2011-09-20T08:18:47.706-07:00</updated><title type='text'>Much Ado About DB2 Indexes (Part 2)</title><content type='html'>&lt;span style="font-family:arial;"&gt;Last week, &lt;a href="http://catterallconsulting.blogspot.com/2009/05/much-ado-about-db2-indexes-part-1.html"&gt;I posted an entry&lt;/a&gt; in which I described the numerous index-related 
enhancements delivered via DB2 9 for z/OS. In this related "part 2" entry, I'll cover new features of DB2 9.7 for LUW (Linux, UNIX, and Windows) - announced a few weeks ago and available next month - that pertain to indexes. My thanks go out to Matt Huras, Mike Winer, Chris Eaton, and Matthias Nicola of IBM's DB2 development organization, who recently delivered presentations on DB2 9.7 features that provided very useful information to me and others in the DB2 community.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;First up: index compression.&lt;/span&gt; DB2 9.7 for LUW index compression differs from the index compression feature of DB2 9 for z/OS in several ways.  For one thing, leaf pages of DB2 9.7 compressed indexes are compressed in memory as well as on disk, whereas DB2 9 for z/OS compression is disk-level only.  Secondly, while DB2 9 for z/OS compression is based on squeezing the contents of an 8K, 16K, or 32K index leaf page in memory onto a 4K page on disk, DB2 9.7 index compression is accomplished via three algorithms that are chosen automatically by the DBMS (these algorithms, which can be used in combination, are briefly described below). Additionally, DB2 9.7 index compression is activated automatically when row compression is activated for a table on which indexes are defined (DB2 9.7 index compression can also be activated independent of row compression by way of an ALTER INDEX or CREATE INDEX statement with the new COMPRESS YES option, with a REORG required after ALTER INDEX).&lt;br /&gt;&lt;br /&gt;I expect that index compression will prove to be very popular among users of DB2 9.7, especially in large-database environments, as it offers substantial disk space savings (likely to be in the range of 35% to 55%), better buffer pool hit ratios (with correspondingly reduced I/O activity), fewer page requests (because index leaf pages will hold more key values, and index levels may be reduced), and fewer index page splits. 
There will be some CPU overhead cost associated with index compression, but this should be offset to some degree by the aforementioned reductions in I/O activity and page access requests.&lt;br /&gt;&lt;br /&gt;Now, a little about the algorithms by which DB2 9.7 index compression is achieved (again, these are selected - and combined, if appropriate - automatically by DB2):&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;RID list compression.&lt;/span&gt; In an index leaf page, each entry contains a key value and list of RID (row ID) values, the latter indicating the location of rows containing the key value. A RID value will take up 4, 6, or 8 bytes of space, depending on the base table's tablespace type (e.g., LARGE or REGULAR, partitioned or non-partitioned).  For a LARGE non-partitioned tablespace, for example, a RID will occupy 6 bytes of space: 4 bytes for a page number and 2 for a slot number within the page. If an index on a table in that LARGE tablespace is compressed, and if, say, 10 rows within a given page contain a certain key value, the full RID value only has to be stored once in the RID list for those 10 rows. For the other 9 rows containing the key value, only the delta values between one row's RID value and the next have to be stored (RID values are always stored in ascending sequence). Because that delta value can be stored in as little as one byte of space, substantial savings can be achieved. 
RID list compression delivers maximum benefit for indexes that have relatively low cardinality (i.e., lots of duplicate key values and, therefore, relatively long RID lists) and a relatively high cluster ratio (making it likely that multiple rows with duplicate key values will be found on a given page).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Prefix compression.&lt;/span&gt; Key values are stored in an index leaf page in ascending sequence. Sometimes, adjacent key values will be very similar (consider, for example, timestamp values that have year, month, day, and hour values in common; or a multi-column key for which leading columns have low cardinality). In such cases, DB2 9.7 can store the full key with the common prefix once in a page, with subsequent entries containing only the differentiated values that follow the common prefix.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Slot directory compression.&lt;/span&gt; A certain amount of the space in an index leaf page is occupied by something called a slot directory. It used to be that the size of the slot directory - determined based on the maximum number of index entries that could be stored on the page - was fixed. For a compressed index, the size of the slot directory is variable and can be reduced based on factors such as common prefix entries, variable length key parts, and duplicate key values.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;Note that a new DB2 9.7 table function, ADMIN_GET_INDEX_COMPRESS_INFO(), can be used to obtain an estimate of the space savings that would result from activating compression for a given non-compressed index. 
This same function can be used to get the actual space savings for an index after it has been compressed.&lt;br /&gt;&lt;br /&gt;Note also that compression can't be used for all indexes in a DB2 9.7 environment.  Compression is not available for indexes on catalog tables, block indexes (these enable multi-dimensional clustering), XML path and meta indexes, and index specifications.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Next up, partitioned indexes.&lt;/span&gt; These are indexes, defined on a range-partitioned table, that are themselves partitioned along the lines of the underlying table's partitions. In comparison with global (i.e., non-partitioned) indexes on range-partitioned tables, partitioned indexes will allow for more efficient partition roll-in and roll-out operations (i.e., ATTACH and DETACH of partitions), as they eliminate the global index maintenance (and associated logging) that would otherwise be required. Partitioned indexes will also enable users to run REORG at the partition level.&lt;br /&gt;&lt;br /&gt;In a DB2 9.7 system, all indexes on range-partitioned tables will be created, by default, as partitioned indexes as long as this is possible. It is not possible for a unique index when the index key is not a superset of the underlying table's partitioning key.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;And, last but not least, DB2 9.7 delivers online creation of XML indexes and online REORG of same.&lt;/span&gt;  In both cases, use of the ALLOW WRITE ACCESS option will enable the XML index CREATE or REORG operation to proceed without blocking writers.&lt;br /&gt;&lt;br /&gt;So, for both the mainframe and Linux/UNIX/Windows platforms, IBM DB2 development keeps delivering good news on the index front. I expect more of the same in the future. 
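&lt;br /&gt;&lt;br /&gt;To make the RID list compression idea described earlier in this entry a bit more concrete, here is a minimal Python sketch of delta-encoding a sorted RID list. It is an illustration only, not DB2's actual on-disk format: the function names are hypothetical, and the 6-byte full RID (4-byte page number plus 2-byte slot number) and the rule that a delta takes one byte when it fits are assumptions made for the example.&lt;br /&gt;&lt;pre&gt;
```python
# Illustrative sketch only: this is NOT DB2's actual storage format.
# Assumed for the example: a full RID is 6 bytes (4-byte page number,
# 2-byte slot number), and a delta is stored in 1 byte when it fits.

FULL_RID_BYTES = 6  # assumption: LARGE non-partitioned tablespace


def rid_to_int(page, slot):
    """Pack (page, slot) into one integer so that RIDs sort naturally."""
    return page * 65536 + slot


def delta_encode(rids):
    """Return the first RID in full, followed by deltas between neighbors."""
    ordered = sorted(rids)
    deltas = [b - a for a, b in zip(ordered, ordered[1:])]
    return ordered[0], deltas


def delta_decode(first, deltas):
    """Rebuild the original sorted RID list from the first RID and deltas."""
    rids = [first]
    for d in deltas:
        rids.append(rids[-1] + d)
    return rids


def encoded_size(deltas):
    """Estimate bytes used: one full RID, then 1 byte per small delta."""
    size = FULL_RID_BYTES
    for d in deltas:
        size += 1 if d in range(256) else FULL_RID_BYTES
    return size


# Ten rows with the same key value, all on page 1234, slots 0 through 9.
rids = [rid_to_int(1234, slot) for slot in range(10)]
first, deltas = delta_encode(rids)
assert delta_decode(first, deltas) == rids

uncompressed = FULL_RID_BYTES * len(rids)
compressed = encoded_size(deltas)
print(uncompressed, compressed)
```
&lt;/pre&gt;In this toy case the ten RIDs shrink from 60 bytes to 15, because every neighbor is one slot away on the same page. Spread those same ten rows across widely separated pages and the deltas stop fitting in a byte, which is why a low-cardinality index with a high cluster ratio stands to gain the most.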
&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-1550060748939254841?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/1550060748939254841/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/05/much-ado-about-db2-indexes-part-2.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1550060748939254841'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/1550060748939254841'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/05/much-ado-about-db2-indexes-part-2.html' title='Much Ado About DB2 Indexes (Part 2)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-7352575000056492110</id><published>2009-05-19T05:53:00.000-07:00</published><updated>2011-09-20T08:17:15.016-07:00</updated><title type='text'>Much Ado About DB2 Indexes (Part 1)</title><content type='html'>&lt;span style="font-family:arial;"&gt;I was in Denver last week for the North American Conference of the International DB2 Users Group (IDUG). More so than in recent years past, plenty of the talk during this conference was about index enhancements. 
Several important index-related features have come out over the past several DB2 releases, notable examples being online index reorganization for DB2 for z/OS and DB2 for Linux, UNIX, and Windows (LUW); block indexes (these enabled multi-dimensional clustering) with DB2 8.1 for LUW; and data-partitioned secondary indexes (aka DPSIs, referred to as "dipsies") with DB2 8 for z/OS. That said, the current versions of the DBMS - DB2 9 for z/OS and DB2 9.7 for LUW (announced last month and available in June) - pack in more index goodies than I've seen since type 2 indexes were delivered with DB2 for z/OS Version 4 in the mid-1990s. In this blog entry, I'll summarize what's new in the world of indexes with DB2 9 for z/OS.  Next week, I'll post a "part 2" entry that will describe index enhancements delivered with DB2 9.7 for LUW.&lt;br /&gt;&lt;br /&gt;&lt;span&gt;Here, then, is my list of DB2 9 for z/OS features that pertain to indexes (while not necessarily an exhaustive list, it's fairly comprehensive):  &lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Index-on-expression&lt;/span&gt;&lt;span style="font-style: italic; font-weight: bold;"&gt;:&lt;/span&gt; As I wrote in &lt;a href="http://catterallconsulting.blogspot.com/2009/05/db2-9-for-zos-rx-for-database-design.html"&gt;my previous blog entry&lt;/a&gt;, this enhancement provides quick relief to a headache-inducing situation with which many a DBA has dealt: you have a query with a predicate involving a column expression (e.g., &lt;/span&gt;&lt;span style="font-family:arial;"&gt;WHERE SUBSTR(COL1, 1, 2) = 'AB' &lt;/span&gt;&lt;span style="font-family:arial;"&gt;or WHERE COL1 + COL2 = 100).  The column expression makes the predicate non-indexable, and if there are no other indexable predicates in the query you're looking at a tablespace scan.  
With DB2 9 for z/OS, you can create an index on a column expression, enabling you to make previously non-indexable predicates indexable.  The potential payoff: orders-of-magnitude improvement in query performance.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Larger index page sizes:&lt;/span&gt;  DB2 users have long had a choice with respect to the size of a data page in a table.  4 KB and 32 KB page-size options have always been there, and 8 KB and 16 KB page sizes were added to the mix several years ago.  Indexes were a different story, with 4 KB being the only page size supported.  That changed with DB2 9 for z/OS and its support for 8 KB, 16 KB, and 32 KB index page sizes.  Some might think of larger index page sizes only as a means of achieving index compression (see the next item in this list), but they can deliver benefits outside of compression enablement.  Consider index page splitting, which occurs when a key value has to be inserted into an already-full index leaf page (not uncommon when an index key is not a continuously-ascending value): a portion of the leaf page's entries (traditionally, half of the entries, but that's also changed with DB2 9, as you'll see when you read about "adaptive index page splitting" a little further down in this list) are moved to an empty page to make room for the new entry, and the whole index tree is latched while this occurs.  Larger index page sizes mean less index page splitting.  Another potential benefit of a larger index page size is a reduction in the number of levels for an index.  Suppose, for example, that an index with 4 KB pages has four levels: a root page that points to level-2 non-leaf pages, which in turn point to level-3 non-leaf pages, which point to the leaf pages.  
That same index might require only three levels with a larger page size, and that would reduce the CPU cost of each index probe operation (from root-level down to leaf-level) by roughly 25% (three page accesses instead of four).&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Index compression:&lt;/span&gt; Mainframe DB2 users have enjoyed the benefits of tablespace compression since Version 3 (if memory serves me correctly).  For some DB2 subsystems, particularly in data warehouse environments, in which the average number of indexes defined on a table tends to be higher than in online transactional systems, the disk space occupied by indexes can exceed the amount used for tablespaces (especially if the latter are compressed, as is very commonly the case).  With DB2 9, index compression is an option.  It's different from tablespace compression in that 1) it's not dictionary-based and 2) the compression is only at the disk level (index pages are uncompressed in server memory).  To be compressed, an index has to use a page size greater than 4 KB (for existing indexes, this can be accomplished via an ALTER INDEX followed by a REBUILD -  and go down further in the list to see a REBUILD INDEX enhancement).  DB2 then takes that 8 KB, 16 KB, or 32 KB index leaf page (only leaf-level pages are compressed, but the vast majority of an index's pages are on this level) and compresses the contents onto a 4 KB page on disk.  You might be tempted to think that a 32 KB index page size is best for compression purposes, but you have to keep in mind that DB2 will stop putting entries in a leaf page in memory once it has determined that no more will fit onto the compressed 4 KB version of the page on disk; thus, the aim is to strike a balance between maximizing space savings on disk and minimizing wasted space on index pages in memory.  
Fortunately, the DSN1COMP utility provided with DB2 9 will give you information that will help you to choose the optimum page size for an index that you want to compress.&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Adaptive index page splitting:&lt;/span&gt;  As previously mentioned (see the "larger index page sizes" item in this list), when an index leaf page had to be split in order to accommodate a new entry, DB2 for z/OS would, before DB2 9, move half of the page's entries to an empty page.  That was OK &lt;span style="font-style: italic;"&gt;unless entries were inserted in a sequential fashion within ranges&lt;/span&gt;.  For example, suppose that an index is defined on a column that contains ascending values within the ranges of A001 to A999, B001 to B999, C001 to C999, and so forth.  If a leaf page with the highest Annn value -  say, A227 -  is full and must be split to accommodate a new Annn entry (e.g., A228), half the entries in that page will be moved to a new page.  Trouble is, the resultant 50% free space on one of those two pages (the one that does not contain the new highest value in the Annn range) will not be reused because nothing lower than A228 (using my example) will be added to the index (more precisely, that space won't be reused until the index is reorganized).  DB2 9 improves on that situation by tracking the value-insert pattern for an index.  If it detects a sequential-within-range pattern (versus continuously-ascending overall, such as a timestamp or sequence number, in which case no splits will occur because new entries will always be at the "end" of the index), it will change the split process so that fewer than 50% of the split page's entries will be moved to the new page (or, if the insert pattern is descending within ranges, more than 50% of the split page's entries will be moved to the new page).  
The result: fewer page splits, leading to reduced CPU and elapsed time for application processes.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Online index rebuild:&lt;/span&gt; What I'm specifically referring to here is the introduction in DB2 9 for z/OS of a SHRLEVEL CHANGE option for the REBUILD INDEX utility.  Formerly, an index rebuild operation would cause the underlying table to be in a read-only state for the duration of the rebuild process.  Now, a table can be updated while a REBUILD INDEX operation is underway -  DB2 deals with these data-changing operations by using the associated log records to apply the corresponding changes in an iterative fashion as needed to the index being rebuilt (during a final "catch-up" phase of this log apply processing, write activity against the underlying table is drained, as is the case for an online REORG running with SHRLEVEL CHANGE).  This utility enhancement is good news for organizations (and there are many) at which new indexes on existing tables are commonly created with the DEFER YES option with a follow-on execution of REBUILD INDEX to physically build the index, and it means better data accessibility when REBUILD INDEX is run for an index in rebuild-pending status.  
Note, however, that if REBUILD INDEX is run with SHRLEVEL CHANGE for a &lt;span style="font-style: italic;"&gt;unique&lt;/span&gt; index, inserts and updates (if the latter target a column of the unique index key) will not be allowed for the underlying table, because uniqueness cannot be enforced while the index is being rebuilt.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;RENAME INDEX:&lt;/span&gt;  Online schema evolution -  the process by which DB2, in succeeding releases, allows more and more database object change operations to be performed without the need for a drop and re-create of the target object -  marches on.  In DB2 9 for z/OS, the functionality of the RENAME statement has been extended to include indexes.  Note that renaming an index will not cause invalidation of packages (or of DBRMs bound directly into plans), because static SQL statements reference indexes by their object identifier (aka OBID), not by name.  Prepared dynamic SQL statements in the dynamic statement cache, on the other hand, reference indexes by  name, so those that use a renamed index will be invalidated (they'll of course be re-prepared and re-cached at the next execution following invalidation).&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Greater leveraging of index lookaside:&lt;/span&gt;  Index lookaside, a feature that allows DB2 to repeatedly access entries on a leaf page (and on the associated parent non-leaf page) without having to do a full index probe (root-to-leaf level transit of the index tree) each time, was introduced way back in Version 2 Release 3.  
It greatly reduced GETPAGEs (and thus, CPU time) for many application processes that used a file of search values sorted according to an indexed table column to retrieve DB2 data.  In DB2 for z/OS Version 8, the use of index lookaside was finally extended to data-changing processes, but only for INSERT, and only for the clustering index on a table.  With Version 9, DB2 can use index lookaside for INSERT operations with indexes other than the clustering index (assuming that these indexes have an ascending -  or, I believe, descending -  key sequence), and can also use index lookaside for DELETE operations.  IBM performance guru Akira Shibamiya noted in a presentation given at last year's &lt;a href="http://www.idug.org/"&gt;IDUG&lt;/a&gt; North American Conference that a test involving heavy insert into a table with three ascending-key indexes showed a reduction in average GETPAGEs per INSERT to 2 in a DB2 9 environment versus 12 in a DB2 for z/OS Version 8 system.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Closing the DPSI performance gap:&lt;/span&gt; Data-partitioned secondary indexes (DPSIs), introduced with DB2 for z/OS V8, are indexes over range-partitioned tables (referring to table-controlled versus index-controlled partitioned tablespaces) that are themselves partitioned in accordance with the partitioning scheme of the underlying table.  DPSIs are nice for improving performance and availability with respect to some partition-level utilities and for FIRST TO LAST partition-rotation operations, but restrictions on their use for SQL statement access path purposes meant that DPSIs had a "performance gap" versus non-partitioned indexes.  
In the DB2 9 environment, this gap is made considerably smaller, thanks to these enhancements:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;Enhanced page-range screening:&lt;/span&gt;  Page-range screening refers to DB2's ability to avoid accessing table or index partitions in the course of executing an SQL statement, when it determines based on one or more predicates that qualifying rows or index entries cannot possibly be located within said partitions.  Page-range screening can have a VERY beneficial impact on query performance, and in the Version 9 environment DB2 can apply page-range screening more broadly to DPSIs.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;More parallelism:&lt;/span&gt;  There are more situations in a Version 9 system in which DB2 can parallelize data retrieval for a statement that uses a DPSI for data access.&lt;/span&gt;&lt;/li&gt;&lt;li style="font-style: italic;"&gt;&lt;span style="font-family:arial;"&gt;A DPSI can provide index-only access for a SELECT statement with an ORDER BY clause.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-style: italic;"&gt;A DPSI can be defined as UNIQUE in a DB2 9 environment, if the DPSI key columns are a super-set of the table's partitioning columns.&lt;/span&gt;&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;And a few more:&lt;/span&gt; Just to wrap up with a few quickies:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;DB2 9 can use a non-unique index to avoid a sort for a SELECT DISTINCT statement.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;The calculation of CLUSTERRATIO by the DB2 9 RUNSTATS utility provides the optimizer with a more accurate indication of a 
table's "clusteredness" with respect to an index, particularly if the indexed key has a significant number of duplicate values.  This can enable the optimizer to make better decisions regarding the use of such indexes (if desired, the old CLUSTERRATIO calculation can be retained through the ZPARM parameter STATCLUS).&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;The LASTUSED column in the real-time statistics table SYSINDEXSPACESTATS (part of the DB2 9 catalog) shows the last time that an index was used for data access (e.g., for SELECT, FETCH, searched UPDATE, or searched DELETE) or to enforce a referential integrity constraint.  This should be VERY helpful when it comes to identifying indexes that are no longer used and which therefore would be candidates for dropping in order to reduce disk-space consumption and CPU costs for inserts and deletes and utilities.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;That's a lot of good index stuff.  
As I mentioned up top, tune in next week for a look at some cool index-related enhancements delivered in DB2 9.7 for Linux, UNIX, and Windows.&lt;/span&gt;&lt;span style="font-style: italic;"&gt; &lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-7352575000056492110?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/7352575000056492110/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/05/much-ado-about-db2-indexes-part-1.html#comment-form' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/7352575000056492110'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/7352575000056492110'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/05/much-ado-about-db2-indexes-part-1.html' title='Much Ado About DB2 Indexes (Part 1)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-2600292049916232443</id><published>2009-05-13T15:57:00.000-07:00</published><updated>2011-09-20T08:14:01.789-07:00</updated><title type='text'>In Denver: Mainframes, DB2, COBOL, and SOA</title><content type='html'>&lt;span style="font-family:arial;"&gt;I'm in Denver, Colorado this week for the 2009 North American Conference of the International DB2 Users Group (IDUG). 
Yesterday, I moderated a Special Interest Group session (also known as a SIG - basically, a "birds of a feather" discussion group) on the topic of "mainframes, DB2, COBOL, and SOA." The conversation was interesting and lively, and I'll summarize it for you by way of this post, with key discussion threads highlighted.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Is Service-Oriented Architecture still relevant?&lt;/span&gt;&lt;span&gt;  Absolutely.  Sure, the buzz around SOA has diminished lately, thanks to more recent arrivals on the "You've GOT to get into this!" scene (see: cloud computing), but that's probably a good thing, as SOA was being hyped to the point that unrealistic expectations were leading to disappointing results (see my &lt;a href="http://catterallconsulting.blogspot.com/2009/03/soa-hits-speed-bump-but-not-dead-end.html"&gt;recent blog entry on this subject&lt;/a&gt;).  With SOA frenzy in abatement, organizations can get about the work of implementing SOA initiatives that have been well researched, properly scoped, and properly provisioned (referring to having the right tools on hand).  One of the participants in our SIG session mentioned that SOA is a very high priority at his organization, a large department of the United States federal government (at which point two other SIG participants, Susan Lawson and Dan Luksetich of  consulting firm &lt;a href="http://www.ylassoc.com/index.php?page=home"&gt;YLA&lt;/a&gt;, spoke up about the SOA work that they've been doing at another large U.S. government department).  And SOA is not just a big deal in public sector circles -  we had people in our SIG group from a wide variety of industries.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;What does COBOL have to do with it?&lt;/span&gt;  The SIG was titled as it was because many people have this idea that SOA requires the use of "modern" programming languages such as Java, C#, Ruby, and Python.  
That, of course, is totally untrue.  COBOL programs can be a very important part of an SOA-compliant application.  One thing that's happening in a lot of mainframe DB2 shops is the implementation of data access logic in the form of COBOL stored procedures that are called by business-tier programs running in off-mainframe app servers such as WebSphere.  As we discussed this particular subject, one participant noted that his company has hardly any COBOL programmers.  No problem.  DB2 for z/OS stored procedures, which are very well suited to the data tier of an SOA, can be written in a variety of languages, including SQL (I'm particularly bullish on DB2 for z/OS V9 native SQL procedures, &lt;a href="http://catterallconsulting.blogspot.com/2008/11/db2-9-for-zos-stored-procedure-game.html"&gt;about which I blogged&lt;/a&gt; a few weeks ago).  Organizations are also exposing CICS-DB2 and IMS-DB2 transaction programs, written in COBOL, as Web services.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Don't get too caught up in the technology behind an SOA.&lt;/span&gt;  One of our SIG participants made the very important point that a successful SOA project has more to do with process and governance (and, some would say, with cultural change) than with technology.  There are all kinds of options with regard to tools, languages, platforms, and protocols, but getting SOA right depends largely on changing the way an IT organization works: more discipline, more standards, better business-IT alignment.  
Because change makes a lot of people uneasy, executive-level support is usually critical to the achievement of a positive SOA outcome.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Give plenty of thought to service granularity.&lt;/span&gt;&lt;span&gt;  Susan and Dan of YLA talked of a client company that ran into major performance problems with an SOA-oriented application, with the key factor being an inordinately high number of calls to the back-end DB2 database.  Sometimes, a situation of that type can result when the services provided by application programs are too fine-grained.  More coarsely-grained services can allow for greater back-end efficiencies, but they can also reduce flexibility when it comes to reusing blocks of code to build new services.  There's no one-size-fits-all solution when it comes to determining the granularity of services that an SOA-compliant application should provide, but it's probably a good idea to avoid the extremes at either end of the spectrum.  An application architect friend of mine liked to put it this way: "What do people [meaning the folks who write service-consuming programs] want?  Do they want water, or do they want to be able to get an atom of oxygen and a couple of atoms of hydrogen?"  The right answer is the one that makes sense in your environment.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Can you do SOA and still have good end-user response time?&lt;/span&gt;  With its emphasis on abstraction and loose coupling of application system components (the better to achieve agility with respect to extending application capabilities, and flexibility in terms of mixing and matching computing platforms at various application tier levels), SOA tends to increase an application's instructional path length (meaning, the app will consume more CPU cycles than one architected along more monolithic lines).  Thus, going the SOA route could lead to elongated end-user response times.  
This performance hit can be mitigated through several means, one being the use of message queuing software (such as IBM's WebSphere MQ, formerly known as MQSeries) to de-couple back-end database processing from the front-end response to the user (in other words, make back-end processing asynchronous from the perspective of the end user).  Another SOA performance-boosting technique involves the use of cache servers to speed response for certain data-retrieval requests (you can read more about the use of message queues and cache servers to enhance SOA application performance in &lt;a href="http://www.ibmdatabasemag.com/story/showArticle.jhtml?articleID=207801123"&gt;an article I wrote on the topic&lt;/a&gt; for &lt;span style="font-style: italic;"&gt;IBM Database Magazine&lt;/span&gt;).&lt;br /&gt;&lt;br /&gt;Dan Luksetich talked up another option for improving the performance of an SOA application: drive multitasking.  If the back-end processing associated with a transaction involves the execution of, say, three discrete tasks, see if you can kick off three processes that can do the required work in parallel.  This is where enterprise service bus (ESB) and workflow orchestration software (sometimes referred to as a "process engine") can really come in handy (read more about this in my IBM Database Magazine column titled &lt;a href="http://www.ibmdatabasemag.com/story/showArticle.jhtml?articleID=209900050"&gt;"Get on the Enterprise Service Bus"&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;SOA can impact database design as well as application design.&lt;/span&gt;  Often, an SOA project will result in a DB2 database design that is more highly normalized versus a database designed for an application with a monolithic architecture.  This has to do with the goal of loose coupling (i.e., dependency reduction) that is a key aspect of an SOA.  
What you want is a database design that is driven by the nature of the data in the database, as opposed to a design that is aimed at optimizing the performance of a particular application (the latter approach sounds good until you start thinking about other applications that could be built on the same database foundation -  it can be to an organization's advantage to trade some database processing efficiency for improved flexibility).&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;SOA can be an impetus for database consolidation on a mainframe server.&lt;span style="font-weight: bold;"&gt;  &lt;/span&gt;&lt;/span&gt;As previously mentioned, an important aspect of SOA is abstraction of one application system tier (e.g., the data layer) from another (such as the business layer).  Another key characteristic of an SOA is standardization with respect to interactions between programs running in different tiers of the application system.  Once this abstraction and standardization has been achieved, the platform on which data and data access logic resides should not be a concern to a business-logic programmer.  The data server of choice should be the one that can deliver the scalability, availability, and security needed by the organization, and a mainframe (or parallel sysplex mainframe cluster) running DB2 for z/OS is not going to be beat on that score.  Indeed, several of the SIG participants spoke of the momentum behind consolidation of databases from distributed systems servers to mainframes that is due in part to the progress of SOA implementation efforts.&lt;br /&gt;&lt;br /&gt;That's pretty cool: the mainframe, referred to by some as a "legacy" (read: old-fashioned) server platform, is shining anew as a primo foundation for leading-edge enterprise applications designed in accordance with SOA principles.  
A very satisfying SIG, indeed.&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-2600292049916232443?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/2600292049916232443/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/05/in-denver-mainframes-db2-cobol-and-soa.html#comment-form' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2600292049916232443'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/2600292049916232443'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/05/in-denver-mainframes-db2-cobol-and-soa.html' title='In Denver: Mainframes, DB2, COBOL, and SOA'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-849992035412466762</id><published>2009-05-08T13:21:00.000-07:00</published><updated>2009-05-08T20:13:29.988-07:00</updated><title type='text'>DB2 9 for z/OS: Rx for Database Design Headaches</title><content type='html'>&lt;span style="font-family: arial;"&gt;I spent most of this past week teaching a DB2 9 for z/OS Transition class.  In such a situation, it's fun to get other people's take on the new features delivered with this latest release of DB2 on the mainframe platform.  
One of the students had an interesting comment regarding the ability to create an index on a &lt;span style="font-style: italic;"&gt;key expression&lt;/span&gt; (that being an expression that references at least one of a table's columns and returns a scalar value): "This is going to help me deal with some application performance problems that are database design-related."  He went on to tell a tale familiar to many experienced DB2 DBAs: an application was migrated to DB2 from a non-relational database management system, but in moving the data there was no attempt to redesign the database to take advantage of relational technology.  Instead, the records in the legacy database files were just plopped into DB2 tables that were designed according to the record layouts of the old system.  Among other things, this led to a number of situations in which predicates of application program queries had to be coded with scalar functions in order to generate the required result sets.  In particular, there were plenty of predicates of the form:&lt;br /&gt;&lt;br /&gt;WHERE SUBSTR(COL, &lt;span style="font-style: italic;"&gt;start, length&lt;/span&gt;) = &lt;span style="font-style: italic;"&gt;'some value'&lt;span style="font-style: italic;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/span&gt;&lt;/span&gt;Oops.  That's a non-indexable predicate in a DB2 V8 environment.  What are you going to do about that?  Ask end users to endure tablespace scans?  Store the predicate-referenced data twice in the target table -  once in its original column form and again in a column that contains the desired substring information?  That second option's no fun: more disk space consumption, and required modification of inserting programs and programs that update the column in question (or creation of triggers to maintain the "substring" column, knowing that the triggered actions will increase overhead for SQL statements that cause a trigger to fire).&lt;br /&gt;&lt;br /&gt;Enter DB2 9.  
When you're running this DB2 release in New Function Mode, you can create an index on an expression.  That index might look like this:&lt;br /&gt;&lt;br /&gt;CREATE INDEX SUBSTRIX&lt;br /&gt;ON TABLE_ABC&lt;br /&gt;(SUBSTR(COLn, 2, 5))&lt;br /&gt;USING STOGROUP XYZ&lt;br /&gt;  PRIQTY 64&lt;br /&gt;  SECQTY 64&lt;br /&gt;BUFFERPOOL BP1;&lt;br /&gt;&lt;br /&gt;And guess what?  Suppose a query comes along with a predicate of the form:&lt;br /&gt;&lt;br /&gt;WHERE SUBSTR(COLn, 2, 5) = &lt;span style="font-style: italic;"&gt;'some value'&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;That predicate is now stage 1 and indexable.  Now, there's no free lunch.  Indexes defined on expressions will cause overhead to increase somewhat for insert operations and for updates that would change an expression-generated key value.  Some utilities, such as LOAD and REBUILD INDEX, will also end up consuming a little more CPU time.  Still, with the potential for orders-of-magnitude performance improvement for some queries containing predicates that had previously been non-indexable, the index-on-expression feature of DB2 9 for z/OS is, I think, going to end up being a big draw for many shops.&lt;br /&gt;&lt;br /&gt;Another DB2 9 feature that can help ease the pain of database design-related problems is the new INSTEAD OF trigger category (it joins the existing BEFORE and AFTER trigger categories, and like them it can be defined for UPDATE, INSERT, and DELETE operations). You may have a situation in which a view has been created to make life easier for programmers who have to code SELECTs for certain data retrieval operations.  Trouble is, that view might be read-only -  it might, for example, be based on a join of two or more tables.  What then?  Do you tell the programmers that they should target the view for reads, and the underlying tables for data-change operations?  That would certainly fly in the face of the "make life easier" rationale behind the creation of the view.  
Should you dispense with the view and denormalize the database design to provide a single-table SELECT-result that matches what one can get from the view?  Sure, if you want to increase disk space consumption, change update/insert/delete SQL statements accordingly, and decrease flexibility with respect to the design of future applications that might need to access the database.&lt;br /&gt;&lt;br /&gt;A better solution: go to DB2 9 (if you're not already there), get to New Function Mode, and define INSTEAD OF triggers that will enable programmers to both read from, and change data in, the view that had formerly been read-only.&lt;br /&gt;&lt;br /&gt;Here's an illustrative example of what I'm talking about: suppose you have an EMPLOYEE table and a DEPARTMENT table.  They both have a DEPTNO column, but only the DEPARTMENT table contains department names (as should be the case for a third-normal-form database design).  If you want to make it really easy for programmers to retrieve department names along with department numbers for employees, you can create a view based on a join of the EMPLOYEE and DEPARTMENT tables; however, the resulting view would be read-only absent an INSTEAD OF UPDATE trigger on the view.&lt;br /&gt;&lt;br /&gt;Here's what I mean:&lt;br /&gt;&lt;br /&gt;CREATE VIEW EMP_DEPT&lt;br /&gt;  AS SELECT E.EMPNO, E.FIRSTNME, E.LASTNAME, E.DEPTNO, D.DEPTNAME&lt;br /&gt;  FROM EMPLOYEE E, DEPARTMENT D&lt;br /&gt;  WHERE E.DEPTNO = D.DEPTNO;&lt;br /&gt;&lt;br /&gt;Suppose you subsequently issue the following SQL statement to change the first name of employee 000100 to "CHUCK" (perhaps from "CHARLES"):&lt;br /&gt;&lt;br /&gt;UPDATE EMP_DEPT&lt;br /&gt;SET FIRSTNME = 'CHUCK'&lt;br /&gt;WHERE EMPNO = '000100';&lt;br /&gt;&lt;br /&gt;You'll get a -151 SQL error code because the view EMP_DEPT is read-only (thanks to the fact that the SELECT defining the view has more than one table in the FROM-list).&lt;br /&gt;&lt;br /&gt;You can take care of this problem with an 
INSTEAD OF UPDATE trigger like this one:&lt;br /&gt;&lt;br /&gt;CREATE TRIGGER OK_UPDTE INSTEAD OF UPDATE&lt;br /&gt;  ON EMP_DEPT&lt;br /&gt;  REFERENCING NEW AS N OLD AS O&lt;br /&gt;  FOR EACH ROW MODE DB2SQL&lt;br /&gt;  UPDATE EMPLOYEE E&lt;br /&gt;  SET E.EMPNO = N.EMPNO,&lt;br /&gt;    E.FIRSTNME = N.FIRSTNME,&lt;br /&gt;    E.LASTNAME = N.LASTNAME,&lt;br /&gt;    E.DEPTNO = N.DEPTNO&lt;br /&gt;    WHERE E.EMPNO = O.EMPNO;&lt;br /&gt;&lt;br /&gt;With this trigger in place, the update statement above that previously got the -151 SQL code will execute successfully.  Ta-da!  Now the programmers won't have to reference different objects in their SELECT and UPDATE SQL statements -  they'll just target the EMP_DEPT view in either case.  Problem solved, with no need to change the underlying database design.&lt;br /&gt;&lt;br /&gt;So, if you find yourself dealing with predicates that are non-indexable because the database design necessitates the coding of column scalar functions in the predicates, you have a very attractive remedy in the form of DB2 9 indexes on expressions.  Similarly, if you want to put views on top of a database design to make some data-retrieval operations easier, and you don't want to have to direct programmers to the underlying tables (versus the views) for data-change operations, DB2 9 INSTEAD OF triggers may be just what the doctor ordered.  Look for opportunities to put these features to work in your shop. 
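&lt;br /&gt;&lt;br /&gt;As a follow-on to the EMP_DEPT example above, here is a minimal sketch of a companion INSTEAD OF INSERT trigger, so that programs could also add employees through the view.  It follows the same pattern as the INSTEAD OF UPDATE trigger.  The trigger name OK_INSRT is made up for illustration, and the sketch assumes that EMPNO, FIRSTNME, LASTNAME and DEPTNO are the only EMPLOYEE columns that have to be supplied on insert (your table may well require more):&lt;br /&gt;&lt;br /&gt;-- OK_INSRT is a hypothetical name; the column list is an assumption about the EMPLOYEE table&lt;br /&gt;CREATE TRIGGER OK_INSRT INSTEAD OF INSERT&lt;br /&gt;  ON EMP_DEPT&lt;br /&gt;  REFERENCING NEW AS N&lt;br /&gt;  FOR EACH ROW MODE DB2SQL&lt;br /&gt;  INSERT INTO EMPLOYEE (EMPNO, FIRSTNME, LASTNAME, DEPTNO)&lt;br /&gt;  VALUES (N.EMPNO, N.FIRSTNME, N.LASTNAME, N.DEPTNO);&lt;br /&gt;&lt;br /&gt;Because the trigger body never references N.DEPTNAME, a department name supplied on an insert into the view is not used in this sketch; department names continue to come from the DEPARTMENT table by way of the join.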
&lt;span style="font-style: italic;"&gt;&lt;span style="font-style: italic;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-849992035412466762?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/849992035412466762/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/05/db2-9-for-zos-rx-for-database-design.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/849992035412466762'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/849992035412466762'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/05/db2-9-for-zos-rx-for-database-design.html' title='DB2 9 for z/OS: Rx for Database Design Headaches'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-5104062487396502020</id><published>2009-04-21T06:38:00.000-07:00</published><updated>2011-09-20T08:09:40.781-07:00</updated><title type='text'>DB2 for z/OS Prefetch, Part 3 (DB2 9 Enhancements)</title><content type='html'>&lt;span style="font-family:arial;"&gt;Prefetch (the basics of which I described in &lt;a href="http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html"&gt;part 1&lt;/a&gt; of this 3-part 
blog entry) has been a feature of DB2 for z/OS for as long as I can remember (I first got my hands on the DBMS around 1987, when Version 1 Release 2 was current). Over the years, prefetch has been enhanced in various ways through succeeding releases of DB2, and that process has continued with Version 9. In this post, I will cover three prefetch-related DB2 9 enhancements: an increase in the prefetch quantity, a new formula for calculating the cluster ratio of an index relative to the underlying table (along with a new and related catalog statistic), and a change in the DB2 optimizer's "thinking" with respect to dynamic prefetch versus sequential prefetch (dynamic and sequential prefetch - along with list prefetch - are described in the aforementioned &lt;a href="http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html"&gt;part 1&lt;/a&gt; of this 3-part entry on prefetch).&lt;br /&gt;&lt;br /&gt;OK, prefetch quantity: for years, we've thought of a 32-page prefetch quantity for SQL access to tablespaces and indexes defined with a 4 KB page size, and a 64-page quantity for utility access to such objects. With DB2 9, the prefetch quantity for SQL access goes to 64 pages for sequential prefetch and LOB-related list prefetch (it's still 32 pages for dynamic prefetch and non-LOB list prefetch) for tablespaces and indexes assigned to 4 KB buffer pools for which VPSIZE (the number of buffers allocated to the pool) times VPSEQT (the sequential steal threshold, which defaults to 80% and is used to limit the number of buffers that can be occupied by pages read from disk in a sequential pattern) is greater than or equal to 40,000. For example, a pool with a VPSIZE of 50,000 and the default VPSEQT of 80% gives 50,000 x 0.80 = 40,000, which just meets that threshold. For utility access to objects assigned to 4 KB buffer pools, the prefetch quantity in a DB2 9 environment is 128 pages when VPSIZE times VPSEQT is greater than or equal to 80,000. The increase in the prefetch quantity should improve response times for application-related tablespace scans and for many utility operations. 
You can find information about the V9 prefetch quantities for SQL and utility access, for all buffer pool page sizes (4 KB, 8 KB, 16 KB, and 32 KB), in Table 117 in the DB2 V9 for z/OS &lt;span style="font-style: italic;"&gt;Performance Monitoring and Tuning Guide&lt;/span&gt; (you can find this and other DB2 9 manuals in &lt;a href="http://www-01.ibm.com/support/docview.wss?rs=64&amp;amp;uid=swg27011656"&gt;the DB2 9 "bookshelf"&lt;/a&gt; on IBM's Web site).&lt;br /&gt;&lt;br /&gt;Next, the new cluster ratio formula. You probably know that CLUSTERRATIO, a column in the SYSINDEXES catalog table, indicates, in the form of a percentage, the degree to which the physical order of a table's rows aligns with the order of the index's key values (the CLUSTERRATIOF column in SYSINDEXES expresses the same value as a floating-point number between 0 and 1). CLUSTERRATIO influences the DB2 optimizer's access path selection process, particularly as it pertains to prefetch access. With DB2 9, the formula for calculating the cluster ratio of an index has been enhanced to improve the accuracy and usefulness of the calculated value. The principal sources of improved calculation are:&lt;br /&gt;&lt;/span&gt;&lt;ol&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The formula now considers all of the RIDs (row IDs) in an index, as opposed to considering only succeeding key values. The old formula made the cluster ratio for indexes with lots of duplicate key values (e.g., an index on department number on a large employee table) smaller than it should have been (keep in mind that RIDs for duplicate key values are stored in an index in ascending RID sequence - in other words, in physical sequential order).&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The old formula counted a table row as being clustered if the next key value resided on the same page or a forward page with respect to the last RID of the current key value. 
Problem is, "forward" could mean 1000 pages forward, and that's really not useful from a prefetch perspective. With the new formula, a "forward" page, in terms of satisfying the "clustered" test, has to be within a sliding window that is based on the prefetch quantity and the size of the buffer pool to which the underlying table is assigned.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;The new formula considers the "clustered-ness" of an index for both forward &lt;span style="font-style: italic;"&gt;and backward&lt;/span&gt; scans, whereas the old formula considered only forward sequential access.&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;span style="font-family:arial;"&gt;So, in going from V8 to V9 (and after running the RUNSTATS utility for objects in the DB2 9 environment, to update CLUSTERRATIO values), you might see some cluster ratio figures increase (perhaps dramatically, for some indexes with lots of duplicate key values) and some decrease (for tables that were clustered relative to an index only if you considered large "skip-ahead" page distances from one key value to the next as being OK, as was the case with the old formula). The new cluster ratio values (which will require package rebinds if you want them to influence static SQL access paths) should improve performance in DB2 9 systems; however, on the off chance that the new values have a negative impact on performance, you can direct DB2 to go back to the old formula by setting the value of the ZPARM parameter STATCLUS to STANDARD.&lt;br /&gt;&lt;br /&gt;Also, note that DB2 9 introduces a new catalog statistic, DATAREPEATFACTORF (in SYSINDEXES and related catalog tables), that gives the optimizer a much better feel for the "density" of an index's ascending key values with respect to the sequential arrangement of rows in the pages of the underlying table. In other words, are rows in a table sequential and "dense" with respect to the index, or are they sequential and not dense? 
This information helps the optimizer to further refine access path selection decision-making, particularly with respect to prefetch.&lt;br /&gt;&lt;br /&gt;Finally, a bit on the optimizer's prefetch preference in a DB2 9 environment. What you are likely to see if you migrate to V9 from V8 is an increase in dynamic prefetch activity and a decrease in sequential prefetch activity - in fact, you may see sequential prefetch used only for tablespace scans. The leaning towards dynamic prefetch makes sense: it's "smarter" (it can be temporarily "turned off" during a scan, if the page access pattern becomes non-sequential in terms of overly large "skip ahead" or "skip behind" distances from page to page, and it doesn't require access to particular "trigger pages" to keep prefetch going) and it can work for backward as well as forward scans (sequential prefetch is forward-only). That said, you may think of the dynamic prefetch preference as being an example of the optimizer "punting" with respect to prefetch decisions (i.e., "I'm just going to let that be a run-time decision versus a bind-time decision"). You'd be wrong in so thinking. What you should expect in a DB2 9 system is an increased incidence of the value 'D' (for dynamic prefetch) in the PREFETCH column in EXPLAIN tables. That means that the optimizer &lt;span style="font-style: italic;"&gt;expects&lt;/span&gt; that dynamic prefetch will be utilized in accessing data or index pages for a query, and will optimize the statement accordingly.&lt;br /&gt;&lt;br /&gt;That leads me to the bottom line: DB2 prefetch, an oldie-but-goodie product feature, is better than ever in the DB2 9 environment. 
DB2 9 will use prefetch more effectively and efficiently, and the result should be improved application (and utility) performance.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-5104062487396502020?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/5104062487396502020/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-3-db2-9.html#comment-form' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5104062487396502020'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5104062487396502020'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-3-db2-9.html' title='DB2 for z/OS Prefetch, Part 3 (DB2 9 Enhancements)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-5717488170966698553</id><published>2009-04-16T08:17:00.000-07:00</published><updated>2011-09-20T08:06:33.377-07:00</updated><title type='text'>DB2 for z/OS Prefetch, Part 2 (Monitoring and Tuning)</title><content type='html'>&lt;span style="font-family:arial;"&gt;A few days ago, I posted &lt;a href="http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html"&gt;part 1&lt;/a&gt; 
of a 3-part entry on prefetch in a DB2 for z/OS environment. That post covered the basics: the different types of prefetch and how they work. In this part 2 entry, I'll cover monitoring and tuning in relation to prefetch. I'll provide information in the form of questions and answers, as I've dealt with a number of such queries over the years.  At the beginning of next week, I'll post part 3, which will describe some prefetch-related changes introduced with DB2 for z/OS Version 9. On, now, to the Q's and A's.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Q1: Where does the CPU time associated with prefetch activity get charged?&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;A1: Allied address spaces (those through which SQL statements get to DB2, examples being a CICS application-owning region, a batch address space, and - for statements sent from remote clients via the DRDA protocol - the DB2 Distributed Data Facility address space) get charged for the CPU time associated with synchronous reads (i.e., single-page, on-demand read I/Os), but for prefetch reads, which are asynchronous, the associated CPU time is charged to the DB2 Database Services Address space (also known as DBM1). In fact, in my experience I've seen that the bulk of DBM1 CPU time is related to prefetch reads and to database write I/Os. When DB2-using organizations take steps that lead to reduced prefetch I/O activity (such as making buffer pools larger - a prefetch request for 32 data or index pages does not lead to an I/O operation if all of the 32 pages are already in memory), they tend to see decreases in DBM1 CPU consumption. And note that we are talking about consumption of general-purpose CPU cycles here. 
It's true that mainframes have specialized processors that assist with I/O operations, but these reduce - as opposed to eliminate - the general-purpose cycles related to I/Os.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Q2: Where can one get information about prefetch activity?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A2: I tend to rely on two sources: a DB2 monitor product (these are available from IBM and from several 3rd-party software vendors) and the DB2 command -DISPLAY BUFFERPOOL DETAIL. With respect to the DB2 monitor product, I like to look over statistics and accounting detail reports (with "detail" referred to as "long" for some monitors) - the former provide a look at overall activity for a DB2 subsystem, and the latter let you restrict the view to activity related to (for example) a given connection type (e.g., CICS or DRDA), a given connection ID (e.g., a CICS region), or a given so-called correlation ID (this could be a batch job name or a CICS transaction ID). In either case, look at the buffer pool activity section of the report (if this activity is broken out by type, look at the read activity, and note that you'll see, certainly in a statistics report, information broken out by buffer pool). With regard to the CPU cost of prefetch and its impact on DBM1 address space CPU consumption (as mentioned in the answer to question 1 above), check the DB2 address space CPU times in the statistics report.  A statistics or accounting report can be set up to capture activity occurring in a given time period.  I usually want to see data for a one- to two-hour period of "busy time" on a system, as opposed to a 24-hour period.&lt;br /&gt;&lt;br /&gt;While I like the reports generated by DB2 monitor products, you can also get useful real-time information using the online monitor interface.  
Just look for the information related to buffer pools for overall numbers, and for "thread detail" information if you want to look at I/O activity for a particular application process.&lt;br /&gt;&lt;br /&gt;The -DISPLAY BUFFERPOOL DETAIL command is a handy way to quickly generate very useful buffer pool-related information, including prefetch numbers. You can issue the command so as to get information for a single buffer pool, a list of buffer pools, all active buffer pools, or all pools (active and inactive).  Because I like to look at an hour's worth of data, I usually issue the command with the INTERVAL option, wait an hour, and issue it again (also with INTERVAL specified); otherwise, you'll get information for the period of time since the pool (or pools) was last activated, and that could be days or weeks ago.&lt;br /&gt;&lt;br /&gt;Whether using monitor reports (or screens) or -DISPLAY BUFFERPOOL DETAIL command output, keep in mind the important distinction between prefetch requests and prefetch I/Os. As previously noted, a prefetch request results in an I/O operation only if one or more of the pages requested are not already in memory. Generally speaking, your focus should be on prefetch I/Os.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Q3: Should one pay attention to prefetch I/Os when it comes to buffer pool sizing?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A3: YES! Often, people engaged in buffer pool analysis zero in on synchronous read activity. Synchronous reads are indeed important, and it's good to reduce that activity if you want to improve the performance of DB2-accessing applications, but prefetch I/Os matter, too, for two important reasons: first (reiterating yet again), they consume CPU cycles. 
Second, although they are asynchronous and anticipatory (as pointed out in my &lt;a href="http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html"&gt;part 1 prefetch entry&lt;/a&gt;) and ideally aimed at getting pages into memory before they are requested by an application process, programs can - and quite often do - get suspended during SQL execution while awaiting the end of an in-progress prefetch read I/O. Here's why: suppose that program ABC has issued a SELECT statement, and suppose that DB2 has to examine page P1 from table T1 in the course of executing the SELECT. If, at the time of DB2's request for page P1 on behalf of program ABC, P1 is scheduled to be brought into memory by way of a prefetch I/O (and that I/O might have been started on behalf of another program), program ABC will wait on that request to complete. This wait time shows up in a DB2 monitor accounting report (or an online monitor display of thread detail data) as "wait for other read I/O", one of the so-called class 3 suspension times (this because the data reported comes from DB2 accounting trace class 3 records).&lt;br /&gt;&lt;br /&gt;So, when you're looking for a buffer pool that could potentially be enlarged to good effect, performance-wise, look at the TOTAL rate of read I/Os per second for the various pools (using DB2 monitor or -DISPLAY BUFFERPOOL DETAIL data, as described in the answer to question 2 above). That would be all synchronous read I/Os (both random and sequential) PLUS all prefetch I/Os (that's prefetch I/Os, not prefetch requests, for sequential prefetch and list prefetch and dynamic prefetch). If you make a buffer pool larger and you want to see if you've done good, look to see if the TOTAL rate of read I/Os has gone down.  
If you've caused prefetch read I/O activity to go down a good bit, you might well see a reduction in "wait for other read I/O" time in your DB2 monitor accounting reports, and in DBM1 CPU time in your statistics reports.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Q4: Can prefetch activity ever have a negative impact on performance?&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;A4: Occasionally, yes. Usually, prefetch is very good for performance, substantially reducing run times for DB2-accessing programs. Sometimes, it can be a drag on performance. I can think of three such scenarios. Scenario 1: a program runs longer than it should, due to dynamic prefetch activity. I explained in my &lt;a href="http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html"&gt;part 1 prefetch post&lt;/a&gt; that dynamic prefetch gets turned on as a result of something called sequential detection. That generally means that most pages accessed by an SQL-issuing program are sequential with respect to each successive page access (and that means that the nth page in a table or index accessed by a program is usually within half of a prefetch quantity - typically 16 pages, when the prefetch quantity is 32 pages - of the page previously accessed by the program).  Suppose that the typical skip-ahead (or behind, in case of a reverse index scan in a DB2 V8 or V9 environment) quantity for a program's data or index access is 16 pages. That's just within the sequential detection threshold, so prefetch gets turned on; however, for each 32-page chunk of table or index data brought into memory, only two of the pages will be requested by the program. It might be better for the program to just request the needed pages individually. 
If you have this situation (look at a DB2 monitor accounting report for the program, and check to see if there is a low ratio of GETPAGEs to dynamic prefetch requests), you might consider increasing the program's commit frequency, as a commit will "wipe the slate clean" with respect to sequential detection page-access tracking &lt;span style="font-style: italic;"&gt;if the program was bound with the RELEASE(COMMIT) option&lt;/span&gt; (versus the RELEASE(DEALLOCATE) option).&lt;br /&gt;&lt;br /&gt;Scenario 2: a program runs longer than it should because of sequential prefetch. As noted in &lt;a href="http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html"&gt;my part 1 prefetch post&lt;/a&gt;, sequential prefetch brings a pretty big chunk of pages into memory right off the bat when an SQL statement with a sequential prefetch-using access path begins execution. That's usually good, but not so much if you really only want, say, 10 rows out of a 10,000-row result set. In that case, you might try adding OPTIMIZE FOR 10 ROWS or FETCH FIRST 10 ROWS ONLY to your SELECT statement. That will tell the optimizer that you don't want to access the whole result set, and sequential prefetch will not be selected at bind time.&lt;br /&gt;&lt;br /&gt;Scenario 3: a program runs more slowly due to "abandoned list prefetch." Again, this doesn't happen too often, but there are times when DB2 will choose list prefetch (described in &lt;a href="http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html"&gt;my part 1 prefetch post&lt;/a&gt;) for a statement at bind time, and then will abandon list prefetch in favor of a tablespace scan during statement execution. 
Check a DB2 monitor accounting or statistics report, and look in the "RID list" section (list prefetch usually involves the sorting of row IDs - aka RIDs - obtained from one or more indexes), and look for the number of times that a RID sort failed (some reports will use "terminated" instead of "failed") due to lack of storage. If that number is non-zero, consider enlarging your DB2 RID sort pool (accomplished by changing the value of the MAXRBLK installation parameter in the DB2 ZPARM module). Another figure that you might see in the "RID list" section of a DB2 monitor report pertains to RID sort failures due to "exceeding the RDS limit" (RDS being the Relational Data System, a core component of DB2 for z/OS). Here's what that means: if DB2 is engaged in RID list processing, and it determines that more than 25% of a table's RIDs will qualify for a RID sort, it will terminate that RID list processing operation. Roger Hecq, a longtime DB2 expert who works for UBS, recently made a very good point about this type of failure in responding to a question posted to &lt;a href="http://www.idug.org/cgi-bin/wa?A0=DB2-L"&gt;the DB2-L forum&lt;/a&gt;: "If [a] table has grown significantly since the last RUNSTATS and program rebind, then a REORG, RUNSTATS, and rebind will increase the 25% limit, which may help avoid the RID list failure [due to RDS limit exceeded] problem." Excellent point: make sure that DB2 knows how many rows are actually in a table by keeping DB2 catalog data up-to-date via RUNSTATS, keep objects well-organized through regular REORGs, and rebind programs after catalog stats have changed significantly for DB2 database objects on which the program depends.&lt;br /&gt;&lt;br /&gt;OK, that's a wrap for prefetch-related monitoring and tuning.  
As I mentioned up front, come back at the beginning of next week for part 3 of my 3-part prefetch entry, which will cover DB2 for z/OS V9 changes that have to do with prefetch.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-5717488170966698553?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/5717488170966698553/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-2-monitoring.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5717488170966698553'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/5717488170966698553'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-2-monitoring.html' title='DB2 for z/OS Prefetch, Part 2 (Monitoring and Tuning)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-145637828203975761</id><published>2009-04-13T18:30:00.000-07:00</published><updated>2009-04-13T21:36:32.534-07:00</updated><title type='text'>DB2 for z/OS Prefetch, Part 1 (The Basics)</title><content type='html'>&lt;span style="font-family:arial;"&gt;Prefetch is another of those DB2 for z/OS topics that has been around for a long time but which 
has recently attracted attention anew. In this post, I'll cover some prefetch basics. I'll follow up in a few days with a post on prefetch monitoring and tuning, and a few days after that I'll post a third entry covering some prefetch-related changes introduced with DB2 for z/OS Version 9.&lt;br /&gt;&lt;br /&gt;A lot of DB2 people understand the prefetch concept quite well.  Prefetch requests have two particularly important characteristics. First, they typically involve the reading of multiple data or index pages into a buffer pool from disk with a single I/O operation. Second, they are &lt;span style="font-style: italic;"&gt;anticipatory&lt;/span&gt; in nature, meaning that DB2 prefetches pages into memory on behalf of an application process (or a utility) before that process explicitly requests those pages, based on the assumption that the process &lt;span style="font-style: italic;"&gt;will&lt;/span&gt; request the pages (or at least a substantial percentage of them). In this sense, prefetch reads are asynchronous with respect to an application process, and they ideally will not cause the process to suspend SQL statement execution while waiting for requested pages to be brought into memory from the disk subsystem (though waits for prefetch reads are possible, as I'll explain in part 2 of my 3-part entry on prefetch - to be posted in a few days). 
Elimination (or at least significant reduction) of in-DB2 application wait time for read I/Os is what prefetch is all about, and when it works as desired (and it almost always does - I'll cover a few exceptions in my "prefetch, part 2" post) it can dramatically reduce the run time of a DB2-accessing program.&lt;br /&gt;&lt;br /&gt;There are three types of DB2 for z/OS prefetch:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Sequential&lt;/span&gt; - Selected at SQL statement bind time (i.e., when the statement's data access path is optimized) if the DB2 optimizer expects that 1) at least a certain number of pages in a target table or index will be accessed in the course of executing the statement (I believe that the threshold is 8 pages), and 2) those pages will be accessed in a physically sequential manner. At statement execution time, as soon as the target table or index is accessed, two (if I recall correctly) prefetch quantities of sequential pages will be read into the appropriate buffer pool (the prefetch quantity is usually 32 pages for an object defined with a 4K page size). 
Subsequent to these initial two prefetch read operations, additional requests for a prefetch quantity of pages will be issued each time a "trigger" page is accessed by the executing SQL statement (a trigger page is one that is a multiple of the prefetch quantity relative to the first page accessed in the table or index; thus, DB2 tries to stay at least one prefetch quantity of pages "ahead" of the application process).&lt;br /&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;List&lt;/span&gt; - Row IDs (RIDs) for qualifying rows (per a query's predicates) are obtained from one or more indexes, and the corresponding data pages - which need not be sequential - are read into memory via multi-page I/O operations (a RID indicates the physical location of a row in a DB2 table).  Note that in most cases, DB2 will sort the RIDs obtained from the index or indexes prior to initiating the multi-page read requests.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Dynamic&lt;/span&gt; - DB2 determines at query run time that the pattern of page access for a target table or a utilized index is "sequential enough" to warrant activation of prefetch processing. If the page access pattern subsequently strays from "sequential enough," prefetch processing will be turned off for the query (it will be turned on again if "sequential enough" access resumes).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;Of these prefetch types, dynamic is the one that is most interesting.  
It's activated and deactivated according to a mechanism - known as sequential detection - that works as follows: DB2 tracks pages as they are accessed in the execution of a database-accessing program (the classic example is a singleton SELECT in a do-loop - at bind time DB2 doesn't know that the statement will be executed repeatedly, and that pages in the target table or index might be accessed in a sequential fashion). When the second page in the target table or index is accessed, DB2 checks to determine whether or not it's within half of the prefetch quantity (i.e., 16 pages, if the prefetch quantity is 32 pages) forward of the first page (or backward, if we're talking about the backward index scan capability introduced with DB2 for z/OS Version 8). If it is, that second page is noted as being sequential, access-wise, relative to the first. When the third page is accessed, DB2 checks to see that it's within half a prefetch quantity forward (or backward) of the second page. If it is, the third page is noted as being sequential with respect to the second page. When 5 out of the last 8 pages accessed are sequential in this sense, DB2 turns on prefetch. It turns prefetch off if the number of sequential pages drops below 5 of the last 8.&lt;br /&gt;&lt;br /&gt;OK, so much for the prefetch level-set.  
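&lt;br /&gt;&lt;br /&gt;For readers who like to see a rule written out, here is a minimal Python sketch of the sequential detection check described above. It is only an illustration, not DB2 internals: the class and names are hypothetical, it models forward access only, and it assumes that the first page accessed counts as non-sequential and that a skip of exactly half the prefetch quantity still qualifies.&lt;br /&gt;&lt;br /&gt;

```python
from collections import deque

PREFETCH_QUANTITY = 32   # pages; the usual value for a 4K page size, per the post
WINDOW = 8               # DB2 looks at the last 8 pages accessed
THRESHOLD = 5            # prefetch is on when 5 of those 8 are "sequential"


class SequentialDetector:
    """Toy model of sequential detection (hypothetical, forward access only)."""

    def __init__(self, prefetch_quantity=PREFETCH_QUANTITY):
        self.half = prefetch_quantity // 2
        # One flag per recently accessed page: True means "sequential".
        self.flags = deque(maxlen=WINDOW)
        self.last_page = None
        self.prefetch_on = False

    def access(self, page):
        """Record a page access and return whether prefetch is now on."""
        if self.last_page is None:
            # Assumption: the first page has no predecessor, so it is
            # recorded as non-sequential.
            sequential = False
        else:
            delta = page - self.last_page
            # Sequential = ahead of the previous page by no more than
            # half of the prefetch quantity.
            sequential = self.half >= delta > 0
        self.flags.append(sequential)
        self.last_page = page
        self.prefetch_on = sum(self.flags) >= THRESHOLD
        return self.prefetch_on


if __name__ == "__main__":
    detector = SequentialDetector()
    # A program that skips ahead 16 pages on every access.
    for page in range(0, 96, 16):
        print(page, detector.access(page))
```

In this toy model, a program that skips ahead 16 pages at a time keeps every access just inside the threshold, which is the situation behind scenario 1 in the part 2 entry.&lt;br /&gt;&lt;br /&gt;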
Come back in two or three days, and I'll have posted a "part 2" entry that will answer some questions pertaining to prefetch monitoring and tuning.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-145637828203975761?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/145637828203975761/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/145637828203975761'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/145637828203975761'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/04/db2-for-zos-prefetch-part-1-basics.html' title='DB2 for z/OS Prefetch, Part 1 (The Basics)'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-918135514023152591</id><published>2009-03-30T07:31:00.000-07:00</published><updated>2011-09-20T08:02:18.485-07:00</updated><title type='text'>A Refreshingly Cloud-y DB2 Forecast</title><content type='html'>&lt;span style="font-family:arial;"&gt;A couple of months ago, I posted &lt;a href="http://catterallconsulting.blogspot.com/2009/01/databases-in-clouds.html"&gt;an entry to this blog on the topic 
of cloud computing&lt;/a&gt;. I was spurred to write this entry by an article on cloud computing that I'd recently read in &lt;span style="font-style: italic;"&gt;InformationWeek&lt;/span&gt; magazine. While I found that article to be quite interesting, I was somewhat alarmed at the fact that DB2 was not among the database management systems mentioned therein. "Where is IBM?", I wondered. Sure, I'd noticed bits and pieces of cloud-related DB2 activity, including a &lt;a href="http://www.channeldb2.com/"&gt;ChannelDB2&lt;/a&gt; video by IBM's Bradley Steinfeld that showed &lt;a href="http://www.channeldb2.com/video/db2-expressc-on-amazon-ec2"&gt;how to set up a DB2 Express-C system in Amazon's EC2 cloud computing infrastructure&lt;/a&gt; (&lt;a href="http://www-01.ibm.com/software/data/db2/express/"&gt;Express-C&lt;/a&gt; is a full-function version of DB2 for Linux, UNIX, and Windows that can be used - even in a production environment - on a no-charge basis), but what I really wanted to see from IBM was a cohesive and comprehensive strategy aimed at making DB2 a leader with respect to cloud-based data-serving.&lt;br /&gt;&lt;br /&gt;I'm very pleased to report that this strategy does exist. It was presented by IBM during a "Chat With The Lab" presentation and conference call conducted on March 25 (and available soon for replay - check &lt;a href="http://www-01.ibm.com/software/data/db2/9/labchats.html"&gt;the DB2 "lab chats" Web page&lt;/a&gt;, and the "DB2 and Cloud Computing" link on that page, for more information and to download the associated presentation). Let me tell you, I liked what I saw and heard during this session. 
IBM's Leon Katsnelson, Program Director for Data Management Portfolio and Product Management (and the guy behind the &lt;a href="http://freedb2.com/"&gt;FreeDB2 Web site&lt;/a&gt;), said during the call that development of the strategy for driving DB2 utilization in the Cloud was guided by this question: "What technologies and business models can we bring to market to help our customers realize the promise of cloud computing?"  That "promise" refers to the potential of cloud computing to lower costs, enhance flexibility, and increase agility for organizations large and small, across all industries.&lt;br /&gt;&lt;br /&gt;Here are the key elements of IBM's DB2 strategy as it relates to cloud computing:&lt;br /&gt;&lt;/span&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Deliver key technologies to support the private cloud initiatives of DB2-using organizations. &lt;/span&gt;Public clouds (those physically based outside of a using enterprise) often come to mind when one thinks about cloud computing, but leaders at many larger organizations are keenly interested in developing and exploiting private (i.e., in-house) clouds that would function as public clouds do in terms of real-time Web, application, and data server instantiation and scaling. 
To aid these initiatives, IBM is delivering full support for DB2 in virtualized environments, enhancing and standardizing DB2 server provisioning and automation, and providing sub-capacity pricing for cost-effective virtualization (a virtual DB2 "machine" will often use a subset of the processing "cores" available on a physical server, and sub-capacity pricing accommodates this reality).&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Partner with key public cloud providers to fully integrate DB2 into the ecosystem.&lt;/span&gt; Perhaps the best-known of the public clouds is Amazon Web Services (AWS). IBM has partnered with Amazon to provide several options for individuals and organizations seeking to use DB2 in a cloud environment. These include: 1) a pre-built DB2 Amazon Machine Image (AMI) that can be used for development purposes with no associated DB2 software charges (you pay only for the incremental use of Amazon's infrastructure), 2) pre-built DB2 AMIs that can be used for production purposes (pricing for these is expected to be announced in the second quarter of this year), and 3) creating your own DB2 AMI using your existing DB2 licenses. In addition to working with Amazon, IBM has partnered with other leaders in the cloud space to help organizations deploy DB2 in a cloud setting. Representatives of two of these partners - Corent and RightScale - participated in the "Chat With the Lab" call. Corent provides a set of software products, called SaaS Suite, that can enable companies to quickly develop and deploy sophisticated, turnkey, DB2-based "Software as a Service" (SaaS) applications in Amazon's Elastic Compute Cloud (aka EC2, the computer resources utilized by Amazon Web Services users). 
RightScale provides products and services that help organizations to effectively and efficiently manage their cloud servers (including DB2 servers) and cloud-deployed applications, in Amazon's EC2 and other cloud environments.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Provide a robust DBMS for SaaS vendors.&lt;/span&gt; Cloud computing is a terrific resource for companies that want to develop and sell "Software as a Service" applications (customers of these companies - Salesforce.com is an example of a SaaS vendor - use an application's functionality but don't run the application software in-house), and IBM wants DB2 to be these firms' DBMS of choice. With its combination of attractive pricing, advanced autonomic features (e.g., self-tuning memory management and automated updating of catalog statistics), and industry-leading technologies such as Deep Compression and pureXML, DB2 presents a compelling choice for cloud-based SaaS vendors looking to gain a competitive edge in the marketplace.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-family:arial;"&gt;&lt;span style="font-weight: bold;"&gt;Offer terms and conditions and pricing to make DB2 the best DBMS for the Cloud.&lt;/span&gt; Advanced technology is great, but as mentioned previously, many companies looking to leverage cloud computing resources are aiming for cost savings, improved flexibility, and enhanced agility. If these are your goals, you won't be interested in DBMS software (or any other kind of software) that burdens you financially or overly restricts your deployment options. The folks at IBM get this, and they are determined to make DB2 as attractive to business people as it is to IT pros with respect to running in the Cloud.&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-family:arial;"&gt;Cloud computing is a disruptive technology, and some companies may see it as a threat. 
IBM's leaders see opportunity in the Cloud, and I believe that they have a strategy in place that will make DB2 a big part of their - and their customers' - success in the cloud computing arena.&lt;br /&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6806654330436722244-918135514023152591?l=catterallconsulting.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://catterallconsulting.blogspot.com/feeds/918135514023152591/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://catterallconsulting.blogspot.com/2009/03/refreshingly-cloud-y-db2-forecast.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/918135514023152591'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/6806654330436722244/posts/default/918135514023152591'/><link rel='alternate' type='text/html' href='http://catterallconsulting.blogspot.com/2009/03/refreshingly-cloud-y-db2-forecast.html' title='A Refreshingly Cloud-y DB2 Forecast'/><author><name>Robert Catterall</name><uri>http://www.blogger.com/profile/12629696535422235653</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='http://bp0.blogger.com/_FeUhA_KCg34/R_-YTGIbb9I/AAAAAAAAAAQ/Odyr4OCmg4I/S220/catterall.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-6806654330436722244.post-2786313211570972981</id><published>2009-03-24T07:32:00.000-07:00</published><updated>2009-03-24T14:18:22.464-07:00</updated><title type='text'>DB2 for z/OS: Overlooking Writes is Wrong</title><content type='html'>&lt;span style="font-family:arial;"&gt;I recently got into an e-mail discussion about DB2 for z/OS I/O performance. 
The particular concern that sparked the discussion was the impact of synchronous remote disk mirroring on I/O response times ("synchronous remote disk mirroring" refers to a high-availability feature, provided by several vendors of enterprise-class disk subsystems, that keeps data on volumes at a local site in synch with volumes at another site, with the remote site typically being within 30-35 kilometers of the local site). The person who initiated the online exchange was focused on DB2 synchronous read performance, and for him I had a couple of nuggets of information: 1) synchronous remote disk mirroring will generally have little or no impact on DB2 read I/O performance, since only disk writes are replicated to the remote site; and 2) regardless of whether or not disk volumes are remotely mirrored, you don't want to base your assessment of a DB2 for z/OS subsystem's I/O performance solely on the read response times provided by your DB2 monitoring product. In the remainder of this blog post, I'm going to expand on that second point, because DB2 write I/Os do matter.&lt;br /&gt;&lt;br /&gt;One type of DB2 write I/O that can obviously affect application performance is a write to the active log - this because it's a synchronous event with respect to commit processing (a commit operation cannot complete until associated records in the DB2 log buffers have been externalized to disk). The good news here is that you can easily check on an application's wait time due to log writes using a DB2 monitor (it's one of the fields that shows up among the "class 3 suspension" times in a DB2 monitor accounting report or online display of thread detail information).  In most cases, this wait time will be quite small (it should be zero for a read-only transaction). 
DB2 active log data sets associated with high-volume production systems are generally placed on disk volumes fronted by non-volatile cache memory (non-volatile meaning that a battery backup will keep data that's been written to cache and not yet to spinning disk from being lost in the event of a power failure).  When this is the case, the log write is considered to be complete, from a z/OS (and DB2) perspective, once the log records have been written to disk controller cache memory (they'll be asynchronously destaged to spinning disk by the disk controller at a later time). That's a very fast write. If you see that wait time due to log writes is more than a small percentage of total class 3 suspend time for an application process, it's possible that you have some device contention that needs to be resolved, perh
