Catterall Consulting: When Things Old Are New Again

Due to my having been very busy with a client engagement, I've let more time than usual go by since the last post to my blog. The aforementioned client job provided me with an interesting experience in the form of a Q&A session between one of the organization's DB2 DBAs and a group of database-focused developers. Executives at this company had decided not long ago that DB2 would be the data-serving platform of choice going forward, and the developers in the room had long worked with a different DBMS used by the firm. Among the questions asked were some aimed at understanding the reasons for the company's strategic shift with regard to database technology. Why DB2 versus the other DBMS that they knew so well? The DB2 DBA did a great job in responding to these questions, and I found his answers to be quite interesting.

Often, when those of us who've worked with DB2 for a long time think about the product and what makes it particularly attractive in today's market, we call to mind recently announced features that score well in terms of the "cool" factor. How would the DBA reply to the "Why DB2?" question? Would he talk about the pureXML technology that makes DB2 the best-of-breed product for managing both XML documents and traditional relational data? Would he point to the Deep Compression feature that can slash disk space requirements for a database at a surprisingly low CPU cost? Maybe multi-dimensional clustering, a capability that can deliver breakthrough performance results for data that aligns naturally with several different dimensions (e.g., geography, time period, and product category)? Or online, in-place REORG, which gets data order back right with no need for "shadow" tables? No, no, no, and nope (the last of these a nod to my Texas roots).

With all that way cool DB2 stuff available to jazz up his response, the DBA chose instead to emphasize a few of DB2's bedrock qualities that some of us take for granted: a fantastic SQL statement optimizer, tremendous vertical scalability, and manageability. In speaking of these key advantages, the DBA made a compelling case for DB2. Hereinafter, I'll summarize his message to his application development colleagues.

First, the optimizer. Pity the IBM teams in Toronto and San Jose who develop this key component of DB2. They are probably accustomed to hearing from DB2 users who are upset over one of those rare occasions when the optimizer actually chooses a demonstrably suboptimal data access path (and quite often THAT ends up being due to inaccurate or incomplete database statistics in the DB2 catalog). People don't talk about the DB2 optimizer when it works as it's supposed to, so they end up hardly talking about it at all because it just about always does the right thing. To the question-answering DBA, however, DB2's world-class optimizer meant that he and his colleagues could spend lots more time with matters such as database design and deployment, and lots less time trying to tweak queries to get good performance. That the SQL optimizer would be touted as one of DB2's greatest strengths should come as no surprise to those who know the history of the product. Dr. Pat Selinger and her team at IBM's Almaden Research Center pioneered the cost-based optimization of SQL statements back in the late 1970s, when few people in the larger IT community had even heard the term "relational database." With a lead like that, it shouldn't be too hard to remain the front-runner, but IBM of course never relaxes when it comes to making the DB2 optimizer better and better. The DBA in the Q&A session actually did spend some time talking about one of the more recent optimizer-related features to have come out of DB2 development: materialized query tables, aka MQTs. MQTs can be used to hold the precomputed results of complex queries, and the DB2 optimizer can choose to use one - sometimes with order-of-magnitude query runtime improvement - without a query-submitting end user even having to know of the MQT's existence. As disc jockeys used to say in days of yore, the hits just keep on coming.
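
For readers who haven't worked with MQTs, here is a rough sketch of the underlying idea of precomputed query results. It is NOT DB2 code: it uses Python's built-in sqlite3 module, which has no MQT feature and no automatic query rerouting, so the summary table is built and queried by hand. The table names (sales, sales_by_region), the sample rows, and the function names are all made up for illustration. What it shows is simply that the aggregate query against the base table and the lookup against the precomputed table return the same answer; in DB2, the optimizer would make that substitution on the user's behalf.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with a base table and a hand-built summary table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 100), ("east", 250), ("west", 75), ("west", 125)],
    )
    # Stand-in for an MQT: the aggregate is computed once and stored.
    # (Hypothetical table name; SQLite will not keep it in sync or use it automatically.)
    conn.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )
    return conn


def totals_from_base(conn):
    """The 'expensive' query: aggregate over every row of the base table."""
    return conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


def totals_from_summary(conn):
    """The 'cheap' query: read the precomputed totals."""
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo()
    print(totals_from_base(conn))
    print(totals_from_summary(conn))
```

With four rows the difference is invisible, of course. The payoff comes when the base table holds millions of rows and the summary holds a handful.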

On now to vertical scalability. DB2 certainly has - in the form of DB2 for z/OS data sharing and the data partitioning feature of DB2 for Linux, UNIX, and Windows (LUW) - multi-server clustering solutions that offer unmatched horizontal scalability, and that's great, but often what an organization wants for a particular database deployment is a solution that can drive higher and higher levels of throughput as the resources of a single server are boosted through the addition of CPUs and/or server memory. Does this sound simple to you, delivering greater throughput as a server's compute power is increased? It's not. In particular, as microprocessor speed and capacity (multi-core, anyone?) continue to break through already amazing levels, the software engineering challenges associated with preventing data processing bottlenecks from hobbling system performance become more and more daunting. Overcoming these challenges has long been a major focus of the DB2 development organization, and it shows. Data server capacity planning becomes much less of a guessing game when you can count on a DBMS to scale with the servers on which it runs.

Finally, manageability. That may not be a scintillating topic of conversation, but it's hugely important with respect to another kind of DB2 scalability that I call human scalability. This term refers to the ability of an IT organization to expand the deployment of a data-serving platform WITHOUT having to boost the ranks of technology support personnel in a proportional manner. To the organization for which my highlighted DBA works, this is huge. They have a few hundred DB2 databases deployed now, with more to come. Some of these databases contain multiple terabytes of data, and lots of others are sized in the hundreds of gigabytes. A solution that can be set up, maintained, and managed in a people-efficient way is a must under these circumstances, and DB2 flat-out delivers the goods. It's always been very strong from a monitoring perspective, delivering rich performance and availability data via CPU-efficient traces. It continues to allow for more and more database changes and maintenance operations to be accomplished online, with no need for a "window" of inaccessibility from the user perspective. And it is becoming, more and more, a self-managing solution, with the self-tuning memory management (STMM) feature of the DB2 "Viper" release being a particularly noteworthy innovation (STMM was made possible by the extension of the threaded model of internal operation - already there for DB2 on Windows servers - to the UNIX and Linux platforms).

The late, great Johnny Cash once sang of an encounter from which he "come away [sic] with a different point of view." Thus I came away from listening to a DBA - someone focused on helping his employing organization accomplish necessary work efficiently and effectively - explain to a group of developers why the organization had hitched its data-serving wagon to a horse called DB2. Sometimes the things that make DB2 a winner in the marketplace are the strengths that have been there all along (and which get better all the time). So, keep up with all the new and cool stuff, but don't forget about the solid rock on which that cool stuff stands.


Monday, June 9, 2008
