Robert's Blog

Monday, May 24, 2010

Nuggets from DB2 by the Bay, Part 2

More items of information from the 2010 International DB2 Users Group North American Conference, held earlier this month in Tampa (in fact by the bay -- the convention center is on the waterfront).

IBM's Roger Miller delivered a session on DB2 for z/OS Version 10 (in beta release for the past couple of months) with typical enthusiasm (opening line: "This is a BIG version"). Some of his points:

Much attention was paid to making life easier for DBAs. Among the labor-saving features of DB2 10:
  • Automated collection of catalog stats, so you don't have to mess with the RUNSTATS utility if you don't want to.
  • Worry-free scale-up, with DB2 thread-related virtual storage now above the 2 GB "bar" in the DB2 database services address space. The number of concurrently active threads can go WAY up in a DB2 10 environment.
  • Access path stability, a favorite of many DB2 9 users, is enhanced. That makes for worry-free rebinding of packages (and you'll want to rebind to get the aforementioned below-the-bar virtual storage constraint relief, and to get the benefit of optimizer improvements).
  • Reduced catalog contention will allow for more concurrency with regard to CREATE/ALTER/DROP activity.
  • The ability to build a tablespace compression dictionary on the fly removes what had been a REORG utility execution requirement.
Resiliency, efficiency, and growth:
  • DB2 9 for z/OS gave us some nice utility performance enhancements (referring particularly to reduced CPU consumption). DB2 10 delivers significant CPU efficiency gains for user application workloads (batch and OLTP).
  • More-granular DB2 authorities enable people to do their jobs while improving the safeguarding of data assets. An example is SECADM, which does not include data access privileges.
  • "Release 2" of DB2's pureXML support improves performance for applications that access XML data stored in a DB2 database.
  • An "ALTER-then-REORG" path makes it easier to convert existing tablespaces to the universal tablespace type introduced with DB2 9.
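That "ALTER-then-REORG" conversion path might look something like the sketch below (database and tablespace names are hypothetical, and the exact syntax should be verified against the DB2 10 documentation):

```sql
-- Convert a single-table segmented tablespace to a partition-by-growth
-- universal tablespace: the ALTER puts a pending change in place...
ALTER TABLESPACE MYDB.MYTS MAXPARTITIONS 10;

-- ...and an online REORG (a utility statement, not SQL) materializes it:
REORG TABLESPACE MYDB.MYTS SHRLEVEL CHANGE
```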
Counting down. Roger's 10 favorite DB2 10 features:
  • 10. (tie) Hash access to data records (do with one GETPAGE what formerly might have required five or so GETPAGEs), and index "include" columns (define a unique index, then include one or more additional columns to improve the performance of some queries while retaining the original uniqueness-enforcement characteristic of the index).
  • 9. Improved XML data management performance and usability.
  • 8. Improved SQL portability.
  • 7. Support for temporal (i.e., "versioned") data (something that previously had to be implemented by way of application code).
  • 6. The new, more-granular security roles.
  • 5. More online schema changes.
  • 4. Better catalog concurrency.
  • 3. 5X-10X more concurrent users due to the removal of memory constraints.
  • 2. CPU cost reductions for DB2-accessing application programs.
  • 1. Productivity improvements.
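The index "include" column feature in Roger's number-10 slot could be sketched like this (table and column names are hypothetical):

```sql
-- DB2 10: uniqueness is still enforced on ACCT_ID alone, but BALANCE
-- rides along in the index so some queries get index-only access
-- without a second, redundant index.
CREATE UNIQUE INDEX IX_ACCT
  ON ACCOUNTS (ACCT_ID)
  INCLUDE (BALANCE);
```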
You get a lot of benefits in DB2 10 conversion mode:
  • More CPU-efficient SQL.
  • 64-bit addressing support for more of the EDM pool and for DB2 runtime structures (thread-related virtual storage usage). You'll need to rebind to get this benefit.
  • Improved efficiency of single-row retrieval operations outside of singleton SELECTs, thanks to OPEN/FETCH/CLOSE chaining.
  • Distributed thread reuse for high-performance database access threads (aka DBATs).
  • Improved elapsed times for insert operations, thanks to parallelized index updates (for tables on which multiple indexes have been defined).
  • Support for 1 MB pages.
  • Access path enhancements, including the ability to get index matching for multiple in-list predicates in a query.
  • More query parallelism (good for zIIP engine utilization).
  • More avoidance of view materialization (good for efficiency).
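Several of those conversion-mode benefits (the above-the-bar thread storage, the more CPU-efficient SQL) come by way of a rebind. A sketch, with a hypothetical collection and package name (PLANMGMT, introduced with DB2 9, keeps a copy of the prior access paths in case you need to fall back):

```sql
-- REBIND is a DSN subcommand rather than SQL; PLANMGMT(EXTENDED)
-- retains the previous access paths, recoverable via REBIND ... SWITCH.
REBIND PACKAGE(MYCOLL.MYPKG) PLANMGMT(EXTENDED)
```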
More stuff:
  • Dynamic statement cache hits for statements that are identical except for the values of literals (this requires the use of a new attribute setting).
  • CPU efficiency gains of up to 20% for native SQL procedures (you regenerate the runtime structure via drop and recreate).
  • Hash access to data rows.
  • Index include columns.
  • In-line LOBs (storing of smaller LOB values in base table rows). Roger called these smaller LOBs "SLOBs." LOBs stored in-line in a compressed tablespace will be compressed. In-line storage of LOBs will require a universal tablespace that's in reordered row format (RRF). Said Roger: "RRF is the future."
  • Universal tablespaces can be defined with the MEMBER CLUSTER attribute (good for certain high-intensity insert operations, especially in data sharing environments).
  • "ALTER-then-REORG" to get to a universal tablespace, to change page size, to change DSSIZE (size of a tablespace partition), and to change SEGSIZE. With respect to "ALTER-then-REORG," you'll have the ability to reverse an "oops" ALTER (if you haven't effected the physical change via REORG) with an ALTER TABLESPACE DROP PENDING CHANGES.
  • Online REORG for all catalog and directory tablespaces.
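An in-line LOB definition might look like the following (hypothetical names; check the exact clause syntax in the DB2 10 documentation):

```sql
-- CLOB values up to 500 bytes are stored in the base table row (the
-- "SLOBs" Roger mentioned); larger values go to the auxiliary LOB
-- tablespace as before. Requires a universal tablespace in reordered
-- row format (RRF).
CREATE TABLE PRODUCT
  (PROD_ID   INTEGER NOT NULL,
   PROD_DESC CLOB(1M) INLINE LENGTH 500)
  IN MYDB.MYUTS;
```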
Scalability improvements:
  • Reduced latch contention.
  • A new option that lets data readers avoid having to wait on data inserters.
  • Much more utility concurrency.
  • 64-bit common storage to avoid ECSA constraints.
  • Package binds, data definition language processes, and dynamic SQL can run concurrently.
  • The skeleton package table in the directory will use LOBs. With CLOBs and BLOBs in the DB2 directory, the DSN1CHKR utility won't be needed because there won't be any more links to maintain.
  • SMF records produced by DB2 traces can be compressed: major space savings (maybe 4:1) with a low cost in terms of overhead.
  • "Monster" buffer pools can be used with less overhead.
  • You'll be able to dynamically add active log data sets.
  • You'll be able to grant DBADM authority to an ID for all databases, versus having to do this on a database-by-database basis.
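That all-databases DBADM grant is expected to take a form along these lines (the auth ID is hypothetical, and the clause ordering should be checked against the DB2 10 SQL reference):

```sql
-- DB2 10 system-level DBADM: one grant covers all databases. The
-- WITHOUT clauses keep administrative authority separate from data
-- access and from the ability to grant privileges to others.
GRANT DBADM WITHOUT ACCESSCTRL WITHOUT DATAACCESS
  ON SYSTEM TO DBAID01;
```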
More catalog and directory changes:
  • The catalog and directory will utilize partition-by-growth universal tablespaces (64 GB DSSIZE).
  • There will be more tablespaces (about 60 more).
  • Row-level-locking will be used.
  • The objects will be DB2-managed and SMS-controlled.
It really is a BIG version -- and there's still more to it (I've just provided what I captured in my note-taking during Roger's session). More nuggets to come in my part 3 post about DB2-by-the-bay. Stay tuned.

Monday, May 17, 2010

Nuggets from DB2 by the Bay, Part 1

I had to smile when I saw the thread that Ed Long of Pegasystems started the other day on the DB2-L discussion list. The subject line? "IDUG radio silence." Here it was, IDUG NA week, with the North American Conference of the International DB2 Users Group in full swing in Tampa, Florida, and the usual blogging and tweeting of conference attendees was strangely absent. What's with that? Well, I'll break the silence (thanks for the inspiration, Ed), and I'll start by offering my theory as to why the level of conference-related electronic communication was low: we were BUSY.

Busy. That's the word that comes first to mind when I think of this year's event, and I mean it in a good way. At last year's conference in Denver, the mood was kind of on the down side. Attendance was off, due largely to severe cutbacks in organizations' training and travel budgets -- a widespread response to one bear of an economic downturn. Those of us who were able to make it to last May's get-together swapped stories with common themes: How tough is it on you? How many people has your company cut? How down is your business? A lot of us were in batten-down-the-hatches mode, and it was hard to get the ol' positive attitude going.

What a difference a year makes. The vibe at the Tampa Convention Center was a total turnaround from 2009. Attendance appeared to be up significantly, people were smiling, conversation was upbeat and animated, and there was this overall sense of folks being on the move: heading to this session or that one, flagging someone down to get a question answered, lining up future business, juggling conference activities with work-related priorities -- stuff that happens, I guess, at every conference, but it seemed to me that the energy level was up sharply versus last May. To the usual "How's it going" question asked of acquaintances not seen since last year, an oft-heard response was: "Busy!" To be sure, some folks (and I can relate) are crazy busy, trying to work in some eating and sleeping when the opportunity arises, but no one seemed to be complaining. It felt OK to be burning the candle at both ends after the long dry spell endured last year. Optimism is not in short supply, and I hope these positive trends will be sustained in the months and years to come.

In this and some other entries to come (not sure how many -- probably another couple or so) I'll share with you some nuggets of information from the conference that I hope you'll find to be interesting and useful. I'll start with the Tuesday morning keynote session.

The data tsunami: big challenges, but big opportunities, too. The keynote speaker was Martin Wildberger, IBM's VP of Data Management Software Development. He started out talking about the enormous growth in the amount of data that organizations have to manage -- this on top of an already-enormous base. He showed a video with comments by some of the leading technologists in his group, and one of those comments really stuck with me (words to this effect): "You might think that the blizzard of data coming into an organization would blind you, but in fact, the more data you have, the clearer you see." Sure, gaining insight from all that data doesn't just happen -- you need the right technology and processes to make it happen -- but the idea that an organization can use its voluminous data assets to see things that were heretofore hidden -- things that could drive more revenue or reduce costs -- is compelling. As DB2 people, we work at the foundational level of the information software "stack." There's lots of cool analytics stuff going on at higher levels of that stack, but the cool query and reporting and cubing and mining tools just sit there if the database is unavailable. And, data has to get to decision-makers fast. And, non-traditional data (images, documents, XML) has to be effectively managed right along with the traditional numbers and character strings. Much will be demanded of us, and that's good (it'll keep us busy).

Martin mentioned that IBM's overall information management investment priorities are aimed at helping organizations to:
  • Lower costs
  • Improve performance
  • Reuse skills
  • Reduce risk
  • Reduce time-to-value
  • Innovate
He talked up IBM's partnership with Intel, IBM's drive to make it easier for companies to switch to DB2 from other database management systems (especially Oracle), and the "game-changing" impact of DB2 pureScale technology, which takes high availability in the distributed systems world to a whole new level. Martin also highlighted the Smart Analytics Systems, including the 9600 series, a relatively new offering on the System z platform (this is a completely integrated hardware/software/services package for analytics and BI -- basically, the "appliance" approach -- that has been available previously only on the IBM Power and System x server lines). There was also good news on the cloud front: DB2 is getting a whole lot of use in Amazon's cloud.

DB2 10 for z/OS: a lot to like. John Campbell, an IBM Distinguished Engineer with the DB2 for z/OS development organization, took the stage for a while to provide some DB2 10 for z/OS highlights (this version of DB2 on the mainframe platform is now in beta release):

  • CPU efficiency gains. For programs written in SQL procedure language, or SQLPL (used to develop "native" SQL procedures and -- new with DB2 10 -- SQL user-defined functions), CPU consumption could be reduced by up to 20% versus DB2 9. Programs with embedded SQL could see reduced in-DB2 CPU cost (CPU cost of SQL statement execution) of up to 10% versus DB2 9, just by being rebound in a DB2 10 system. High-volume, concurrent insert processes could see in-DB2 CPU cost reductions of up to 40% in a DB2 10 system versus DB2 9.
  • 64-bit addressing for DB2 runtime structures. John's "favorite DB2 10 feature." With DB2 thread storage going above the 2 GB virtual storage "bar" in a DB2 10 system (after a rebind in DB2 10 Conversion Mode), people will have options that they didn't before (greater use of the RELEASE(DEALLOCATE) bind option, for one thing). DB2 subsystem failures are rare, but when they do happen it's often because of a virtual storage constraint problem. DB2 10 squarely addresses that issue.
  • Temporal data. This refers to the ability to associate "business time" and "system time" values to data records. John pointed out that the concept isn't new. What's new is that the temporal data capabilities are in the DB2 engine, versus having to be implemented in application code.
  • Getting to universal. John pointed out that DB2 10 would provide an "ALTER-then-REORG" path to get from segmented and partitioned tablespaces to universal tablespaces.
  • Access plan stability. This is a capability in DB2 10 that can be used to "lock down" access paths for static AND dynamic SQL.
  • Enhanced dynamic statement caching. In a DB2 10 environment, a dynamic query with literals in the predicates can get a match in the prepared statement cache with a statement that is identical except for the literal values (getting a match previously required the literal values to match, too).
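The system-time flavor of the temporal data support John described can be sketched as follows (table and column names are hypothetical; the exact syntax belongs to the DB2 10 documentation):

```sql
-- A system-period temporal table: DB2 maintains the row-begin and
-- row-end timestamps itself.
CREATE TABLE POLICY
  (POLICY_ID INTEGER NOT NULL,
   COVERAGE  INTEGER,
   SYS_START TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW BEGIN,
   SYS_END   TIMESTAMP(12) NOT NULL GENERATED ALWAYS AS ROW END,
   TRANS_ID  TIMESTAMP(12) GENERATED ALWAYS AS TRANSACTION START ID,
   PERIOD SYSTEM_TIME (SYS_START, SYS_END));

-- Old row versions land in a history table, with no application code:
CREATE TABLE POLICY_HIST LIKE POLICY;
ALTER TABLE POLICY ADD VERSIONING USE HISTORY TABLE POLICY_HIST;

-- "As of" queries reach back through the history transparently:
SELECT COVERAGE FROM POLICY
  FOR SYSTEM_TIME AS OF TIMESTAMP '2010-01-01-00.00.00'
  WHERE POLICY_ID = 1234;
```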
DB2 for LUW performance. John was followed on stage by Berni Schiefer of the DB2 for Linux/UNIX/Windows (LUW) development team. Berni shared some of the latest from the DB2 for LUW performance front:
  • Performance PER CORE is not an official TPC-C metric, but it matters, because the core is the licensing unit for LUW software. It's on a per-core basis that DB2 for LUW performance really shines versus Sun/Oracle and HP systems.
  • SAP benchmarks show better performance versus competing platforms, with FEWER cores.
  • TPC-C benchmark numbers show that DB2 on the IBM POWER7 platform excels in terms of both performance (total processing power) AND price/performance.
  • DB2 is number one in terms of Windows system performance, but the performance story is even better on the POWER platform.
  • Berni pointed out that DB2 is the ONLY DBMS that provides native support for the DECFLOAT data type (based on the IEEE 754r standard for decimal floating point numbers). The POWER platform provides a hardware boost for DECFLOAT operations.
  • DB2 for LUW does an excellent job of exploiting flash drives.
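A quick illustration of the DECFLOAT point (hypothetical table, and a sketch rather than a benchmark claim):

```sql
-- DECFLOAT is decimal floating point per IEEE 754r: a value like 0.1
-- is represented exactly, avoiding the rounding surprises of binary
-- FLOAT/DOUBLE; on POWER, DECFLOAT arithmetic gets a hardware assist.
CREATE TABLE RATES
  (RATE_ID INTEGER NOT NULL,
   RATE    DECFLOAT(34));
INSERT INTO RATES VALUES (1, 0.1);
```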
Back to Martin for a keynote wrap-up. Martin Wildberger came back on stage to deliver a few closing comments, returning to the topic of DB2 pureScale. pureScale is a distributed systems (AIX/POWER platform) implementation of the shared-data architecture used for DB2 for z/OS data sharing on a Parallel Sysplex mainframe cluster -- a technology that has delivered the availability and scalability goods for 15 years. So now DB2 for AIX delivers top-of-class scale-up AND scale-out capabilities.

Martin closed by drawing attention to the IBM/IDUG "Do You DB2?" contest. Write about your experience in using DB2, and you could win a big flat-screen TV. If you're based in North America, check it out (this initial contest does have that geographic restriction).

More nuggets from IDUG in Tampa to come in other posts. Gotta go now. I'm BUSY.