Robert's Blog


Thursday, November 20, 2008

As DB2 Evolves, so, too, do DBA Roles

I recently gave a presentation on DB2 stored procedure trends and technology to a room full of application developers and DB2 DBAs, part of a "lunch and learn" program organized by a large company in the retail industry. Seated next to me at my table was a DB2 DBA manager. As I sat down after delivering my presentation, I mentioned to him that recent DB2 changes emphasizing application enablement (e.g., native SQL procedures in DB2 9 for z/OS, global variables in DB2 9.5 for Linux/UNIX/Windows, the IBM Data Studio development tool) and reducing the need for "hands-on" system administration (examples include automated memory and disk storage management enhancements in DB2 9.5 for LUW, and partition-by-growth tablespaces in DB2 9 for z/OS) offered companies an opportunity to rethink the ways in which DB2 DBAs can be engaged to deliver value to the organization. I went on to say that I was particularly interested in the rise of what might be called the application-focused DB2 DBA.

The DB2 DBA manager smiled and told me that his group had indeed recognized and acted on that opportunity by creating a new role, called "Procedural DBA". The title emphasizes stored procedure technology, which - with DB2 for z/OS V9's great-leap-forward support for native SQL procedures, about which I recently blogged - provided the DB2 DBA team with something to "sell" that the retailer's application development teams were very ready to buy. These application developers are - as they should be - focused on delivering functionality needed by the business. They use modern programming languages and techniques, they need access to DB2 databases on both mainframe and distributed systems platforms, and they need to move fast - which means, among other things, that they don't want to have to acquire the detailed DB2 database schema knowledge needed to embed SQL DML statements (SELECT, INSERT, UPDATE, DELETE) in their business-tier programs. For the development teams, being able to get the back-end DB2 database work done via stored procedures, and being able to call those procedures in a "native" way with respect to the programming language being used (e.g., via IBM's JDBC or ADO.NET drivers), is a very attractive proposition.
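To make that proposition concrete, here's a minimal sketch of the kind of interface being "bought" here - one CALL statement in place of schema-specific DML. The schema, procedure, and parameter names are hypothetical, invented for illustration:

    -- The developer supplies a customer number and gets back an order
    -- count, with no need to know the names or structures of the
    -- underlying tables (host variables shown in embedded-SQL style)
    CALL ORDPROC.GET_ORDER_COUNT(:custno, :ordcount);

The same CALL works whether the procedure lives on a mainframe DB2 server or a DB2 for LUW server - part of the appeal for a multi-platform shop like this one.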

That's the demand side of the DB2 stored procedure equation at the big retailer. On the supply side, the DBA manager sees that all the pieces are now in place. With the native SQL procedure support delivered in DB2 for z/OS V9, preparation and deployment of stored procedures written in SQL no longer depends on having a C compiler (the C compiler requirement was similarly removed via DB2 V9 for the Linux, UNIX, and Windows platforms). Not only that, but the SQL programming extensions made available for native SQL procedures on the mainframe platform (such as nested compound statements) make it easier than ever to develop SQL procedures that can be deployed on different DB2 server platforms. Throw in a couple of cherries on top like the zIIP-eligibility of native SQL procedures called via DRDA (for more cost-efficient mainframe computing) and the great SQL procedure development interface of the aforementioned IBM Data Studio tool, and you've got a package that looks great to everyone from developers to systems programmers.

I told the DBA manager that this all sounded pretty cool, and that I'd expect some "traditional" DB2 DBAs to be eager to take the Procedural DBA role for a spin. Again a smile from the manager, and a gesture to a person across the table: "Here's one of our first. You can ask her about it." The newly-minted Procedural DBA did indeed show a lot of enthusiasm in talking about her role. She liked all kinds of things about it, and found especially appealing the fact that it involved working with DB2 on both the mainframe and LUW platforms (DB2 for z/OS and DB2 for LUW have some systems programming and database administration differences, but from an application perspective they are virtually identical). She also welcomed the opportunity to move in a new technical direction and to pick up and hone some new skills. She and her Procedural DBA colleagues are poised to accelerate application implementation (a longtime president of Southwest Airlines once said that a plane makes zero money when it's on the ground, and an application doesn't make or save money until it's in production), because they'll be involved in the development process from concept through data modeling and data architecture all the way to deployment - no heaving stuff over the transom with a "You guys take it from here" (or, on the other side, picking up the pieces of stuff so heaved). The new Procedural DBA knows that she's caught a good wave, and she's looking forward to riding it for a long time.

New technology can benefit an organization. When the organization itself changes (for example, by defining new roles) to take advantage of new technology, that potential benefit can be substantially magnified. I spent time the other day with people from a large retail-industry company who really get this. That makes for a good day indeed.

Friday, November 14, 2008

DB2 9 for z/OS - A Stored Procedure Game-Changer

Stored procedures have long been a mainstay of enterprise-class applications built on a DB2 for Linux/UNIX/Windows foundation. They provide for robust security (among other things, keeping the SQL DML statements server-side helps to restrict what people with ill intent can learn about your database schema, thereby reducing exposure to hack attacks), scalability (stored procedures can significantly reduce network traffic), and flexibility (packaging table-touching SQL statements in stored procedures is a nice way of abstracting DBMS particulars from client-side developers, allowing these folks to focus more of their coding efforts on business logic).

On the mainframe side, the story's been a little different, with stored procedure usage increasing since delivery of the feature with DB2 for z/OS Version 4, but not quite achieving what I'd call "escape velocity" (with the notable exception of some organizations that have embraced mainframe DB2 stored procedures in a major way). I believe that DB2 9 for z/OS, available since March of 2007, will change that situation.

It is not surprising that DB2 stored procedure usage didn't skyrocket from the get-go on the mainframe platform. For one thing, z/OS users already had some robust and reliable solutions for the server-side packaging of DB2 SQL statements, namely, the CICS and IMS/TM transaction management subsystems. Additionally, DB2 for z/OS stored procedures had some functionality holes early on that needed filling before some organizations would commit resources to leveraging the technology. With that in mind, I'd like to review some of the stored procedure enhancements that have come out in various releases of DB2 over the past dozen years:
  • DB2 V4: Stored procedure functionality introduced.
  • DB2 V5: Program calling a stored procedure can FETCH rows from a result set (previously, data could only be returned via output parameters - a problem for varying-size and many-row result sets).
  • DB2 V6: CREATE/ALTER/DROP PROCEDURE statements added to DB2 DDL (before that, DBAs had to manually insert/update/delete rows in the SYSPROCEDURES catalog table).
  • DB2 V7: SQL Procedure Language introduced, enabling programmers (and development-oriented DBAs) to code stored procedure programs in SQL, versus having to embed SQL statements in stored procedure programs written in COBOL or C (to make this work, DB2 SQL was extended to include logic flow-control statements such as GOTO, IF, ITERATE, LEAVE, LOOP, REPEAT, and WHILE - there's a sketch of a SQL procedure just after this list).
  • DB2 V8: Better synergy between DB2 and the z/OS Workload Manager in terms of optimizing the number of tasks in a WLM-managed stored procedure address space, plus the ability to specify an abend limit (a number of execution failures after which a stored procedure is placed in stopped status) at the individual stored procedure level (especially handy in a development environment).
  • DB2 V9: Native SQL procedures, meaning stored procedures that are written in SQL and which execute in the DB2 Database Services address space (aka DBM1).
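To pull a few of those threads together, here's a minimal sketch of what a native SQL procedure might look like in a DB2 9 New Function Mode environment (the schema, table, and column names are hypothetical). Note the SQL PL flow control (the IF), the cursor declared WITH RETURN for result set retrieval, and the absence of any EXTERNAL or FENCED clause - that absence is what makes the procedure native:

    -- Hypothetical native SQL procedure: no EXTERNAL or FENCED clause,
    -- so in a DB2 9 NFM system it executes in the DBM1 address space
    CREATE PROCEDURE ORDPROC.GET_RECENT_ORDERS
      (IN  P_CUSTNO INTEGER,
       IN  P_DAYS   INTEGER,
       OUT P_MSG    VARCHAR(100))
      LANGUAGE SQL
      DYNAMIC RESULT SETS 1
    BEGIN
      -- Cursor declared WITH RETURN, so the caller can FETCH the
      -- result set (possible since DB2 V5)
      DECLARE C1 CURSOR WITH RETURN FOR
        SELECT ORDER_NO, ORDER_DATE, ORDER_AMT
          FROM ORDPROC.ORDERS
          WHERE CUST_NO = P_CUSTNO
            AND ORDER_DATE >= CURRENT DATE - P_DAYS DAYS;

      -- Flow control courtesy of the SQL Procedure Language (DB2 V7)
      IF P_DAYS <= 0 THEN
        SET P_MSG = 'Day count must be positive';
      ELSE
        SET P_MSG = 'OK';
        OPEN C1;
      END IF;
    END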
DB2 V9 native SQL procedure support is what I see as being the game-changer with regard to stored procedure usage on the mainframe platform. Why? Two reasons. First, it provides for a more streamlined stored procedure invocation process, as pointed out by Peggy Zagelow, one of IBM's senior software developers, in a blog entry earlier this year. With external procedures (and SQL procedures of the non-native variety end up executing as external stored procedures coded in C), the underlying program needs a language environment in which to execute. This is provided by a WLM-managed address space, and when the external procedure is called the caller's DB2 thread is suspended while it's switched from the caller's task (SRB or TCB) to the TCB associated with the external procedure program. In some real-world cases, this thread suspension has led to processing delays and increased DBM1 virtual storage consumption. Such problems can be dealt with through adjustments in WLM policy goals and/or DB2 thread limits, but with native SQL procedures they are eliminated entirely. A native SQL procedure exists in the form of a package, which is - as packages always have been - a "runtime structure" generated from the SQL statements to be executed. When a native SQL procedure is called, DB2 just switches from the caller's package to the stored procedure package. No thread-suspension-and-task-switching, and therefore no delay in stored procedure execution. Another benefit of the run-in-DB2 model is the elimination of instruction pathlength associated with crossing back and forth between DBM1 and a stored procedure address space for each SQL statement issued by an external procedure (I don't have numbers, but I expect that native SQL procedures are quite competitive with COBOL external procedures with respect to CPU consumption).

The second game-changing aspect of native SQL procedures is exploitation of zIIP processing resources (referring to the z9 Integrated Information Processor). zIIPs, as you may know already, are specialty engines on a mainframe that can run eligible workloads and which do not factor into mainframe software pricing (as general-purpose engines do). A native SQL procedure is zIIP eligible if it is invoked via a remote call through the Distributed Data Facility component of DB2 (commonly called DDF). Tests have shown that the amount of CPU processing directed to a zIIP can exceed 50% for some stored procedures. These zIIP MIPS are as inexpensive as they get on the mainframe platform, and native SQL procedures offer a great way to use 'em. Why the restriction of remote versus local calls regarding the zIIP eligibility of native SQL procedures? I believe that it reflects IBM's long-term vision of the role of DB2 on System z: a super-reliable, super-scalable, high-ratio-of-capacity-to-footprint data server supporting various application servers (e.g., WebSphere/Java, WebLogic/Java, Windows/.NET, Ruby on Rails, etc.). Basically, IBM with its zIIP initiative is giving you a financial incentive to move towards that architecture.

In addition to those Big Two native SQL procedure pluses, you get some advantages on the programming front versus external SQL procedures. Important in this regard is support for nested compound statements. With the ability to use more than one compound statement in a SQL procedure, you can code compound statements within condition handlers, thereby giving a native SQL procedure much more sophisticated error-handling capabilities than an external SQL procedure can offer.
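Here's a minimal sketch of what that buys you (the table names and procedure logic are hypothetical): the condition handler's body is itself a compound statement, so it can declare its own variables and execute several statements - something the single-statement handlers of an external SQL procedure can't do:

    -- Fragment of a hypothetical native SQL procedure body
    BEGIN
      DECLARE V_FAILED INTEGER DEFAULT 0;

      DECLARE CONTINUE HANDLER FOR SQLEXCEPTION
        BEGIN
          -- A nested compound statement: its own declaration, plus
          -- multiple statements, all inside the handler
          DECLARE V_TS TIMESTAMP;
          SET V_TS = CURRENT TIMESTAMP;
          INSERT INTO APPSCHM.ERROR_LOG (ERR_TIME, ERR_PROC)
            VALUES (V_TS, 'ARCHIVE_OLD_ORDERS');
          SET V_FAILED = 1;
        END;

      -- ...main procedure logic here...
      INSERT INTO APPSCHM.ORDERS_ARCHIVE
        SELECT * FROM APPSCHM.ORDERS
          WHERE ORDER_DATE < CURRENT DATE - 7 YEARS;
    END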

I feel pretty strongly that native SQL procedures are the way of the future as far as DB2 for z/OS is concerned. If you are already using external SQL procedures, plan on migrating these to native SQL procedures in a DB2 V9 New Function Mode environment (not a difficult process). If you are not yet using SQL procedures, I would encourage you to start using them. To help you on your way, take advantage of the recently published IBM "red book" titled DB2 9 Stored Procedures: Through the CALL and Beyond (a complete update of an outstanding red book originally published in 2004 for DB2 V7).
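For what it's worth, here's a sketch of what the external-to-native migration of a simple SQL procedure might look like in a V9 New Function Mode environment (names hypothetical) - the SQL PL body itself typically carries over unchanged:

    -- Drop the external SQL procedure...
    DROP PROCEDURE ORDPROC.GET_ORDER_COUNT;

    -- ...and re-create it without the FENCED clause. In V9 NFM, a
    -- LANGUAGE SQL procedure with no FENCED or EXTERNAL clause is
    -- created as a native SQL procedure.
    CREATE PROCEDURE ORDPROC.GET_ORDER_COUNT
      (IN  P_CUSTNO INTEGER,
       OUT P_COUNT  INTEGER)
      LANGUAGE SQL
    BEGIN
      SELECT COUNT(*)
        INTO P_COUNT
        FROM ORDPROC.ORDERS
        WHERE CUST_NO = P_CUSTNO;
    END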

The mainframe DB2 stored procedure wave is now a big one. Catch it.

Wednesday, November 5, 2008

A Couple of Oldie-But-Goodie DB2 Tablespace Questions

Some DB2 for z/OS questions first asked years ago came to be moot due to advances in product technology and functionality. It used to be, for example, that people wanted to know how to reduce index page lock contention, but that was before type 2 indexes (delivered in DB2 V4) eliminated locking on index pages. Similarly, folks used to look for ways to get more than 64 GB of data into a table before that limit went to 16 TB and then to 128 TB (this for a non-LOB table - the size limit is staggeringly large for a table with lots of LOB columns).

On the other hand, there are some questions that were good 15 or more years ago and are still good today. Two that fall into that category were put to me a few weeks ago:
  1. How many tables should be created in a given tablespace?
  2. Which tables in a database should be partitioned?
The first of these questions is, I think, the more interesting of the two. Consider some of the factors that can influence decisions regarding the ratio of tables to tablespaces:
  • Size matters, and sometimes it's all that matters. If a table is going to occupy more than 64 GB of space on disk, the associated tablespace is going to have to be partitioned, and that means that only the one table can be assigned to the tablespace.
  • Production versus non-production databases. Generally speaking, non-production DB2 environments (i.e., those used for test and development purposes) differ from their production counterparts in that they have more tables (there will often be several different versions of each table, in different schemas used at different stages of an application's development) and fewer rows (tables in a performance test database may be close to production-sized, but other non-production tables may have row counts that are 75% or more below production levels). Both these characteristics of non-production DB2 systems - more tables, and smaller tables - make a larger tables-to-tablespaces ratio attractive from a database administration perspective. That's what segmented tablespaces are for (there's a DDL sketch just after this list).
  • DBD contention, or lack thereof. Every DB2 database (and here I'm using the term "database" in the technical DB2 for z/OS sense, versus the generic notion of a set of logically-related tables) has associated with it a control block (a chunk of system-used information) called a database descriptor, or DBD. Often, we don't think much about DBDs, but some applications are characterized by considerable DDL (data-definition language) activity - related, perhaps, to dynamic view creation for data security purposes - and dynamic (versus static) SQL (particularly common in a data warehouse/business intelligence system). In that kind of environment, one needs to give some thought to the potential for DBD contention associated with concurrently active DDL operations and (sometimes) with dynamic SQL DML (data manipulation language) statements executing at the same time as DDL statements. This might lead you to want more databases (in the DB2 for z/OS technical sense of the word) in your database (generic term). That, in turn, could mean more tablespaces (a given tablespace is associated with one and only one DB2 database, again using the term "database" in the DB2 for z/OS technical sense), and if you have more tablespaces for the same number of tables then of course you'll have fewer tables per tablespace.
  • Utilities. Keep in mind that DB2 utilities tend to operate at the tablespace level. Depending on your situation, that could cause you to want a higher tables-to-tablespaces ratio (nice to back up a bunch of tables by image copying one tablespace) or a lower ratio (you might need the ability to recover an individual table to a prior point in time while leaving the data in other tables in its current state).
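To put a little DDL behind those considerations, here's a minimal sketch (the database, tablespace, table, and column names are all hypothetical) of the segmented, several-tables-per-tablespace approach that often fits a test environment:

    -- One segmented tablespace holding several small test tables
    CREATE DATABASE TESTDB01;

    CREATE TABLESPACE TESTTS01 IN TESTDB01
      SEGSIZE 32;

    CREATE TABLE TEST01.CUSTOMER
      (CUST_NO   INTEGER NOT NULL,
       CUST_NAME VARCHAR(40))
      IN TESTDB01.TESTTS01;

    CREATE TABLE TEST01.ORDERS
      (ORDER_NO INTEGER NOT NULL,
       CUST_NO  INTEGER NOT NULL)
      IN TESTDB01.TESTTS01;

If the DBD contention or utility-granularity concerns noted above argue for a lower tables-to-tablespaces ratio, the same DDL pattern simply gets repeated with one table per tablespace (and perhaps with more databases).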
Now, how about the partitioning question? Some things to keep in mind in this regard:
  • Again, size matters. As noted above, you have to partition a table if it's going to hold more than 64 GB of data. That said, partitioning can make lots of sense for tables that are merely large (say, 500 MB or more), as opposed to being huge. Some utilities can operate at the partition level, and it can be beneficial to have the ability to reorganize data (for example) in a large table a partition at a time, rather than REORGing the entire tablespace. The same goes for backup and recovery.
  • Parallelism. While it's not impossible to get some DB2 for z/OS query-splitting activity when tables are not partitioned, parallelism is, to a very significant degree, driven by partitioning: the more partitions you have, the more parallel query processing you're likely to get (assuming that you have packages bound with DEGREE(ANY) and/or you set the value of the CURRENT DEGREE special register to ANY for dynamic SQL statements). Keep in mind that query parallelism isn't just for data warehouse workloads. It can also substantially reduce run times for read-intensive batch jobs running in operational (i.e., non-BI) systems. Also keep in mind that DB2 query parallelism is a great way to utilize the processing capacity of zIIP engines on a mainframe.
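As a quick sketch of those two parallelism-enablement knobs (the collection and package names are hypothetical):

    -- For static SQL: rebind the package with DEGREE(ANY) so its
    -- queries become candidates for parallel processing
    REBIND PACKAGE(BATCHCOL.NIGHTLYPKG) DEGREE(ANY)

    -- For dynamic SQL: set the special register before issuing queries
    SET CURRENT DEGREE = 'ANY';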
More good news on the partitioning front: the new universal tablespace introduced with DB2 for z/OS V9 gives you all the data size and utility-granularity and query-parallelism benefits of traditional partitioned tablespaces, with the quick mass-delete and good "insert-into-the-middle" performance associated with segmented tablespaces.
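In DDL terms, that might look like the sketch below (names hypothetical). My understanding is that specifying both SEGSIZE and NUMPARTS on a DB2 9 CREATE TABLESPACE yields a range-partitioned universal tablespace, while specifying MAXPARTITIONS yields the partition-by-growth variety:

    -- Range-partitioned universal tablespace (the partitioning key and
    -- limit values go on the subsequent CREATE TABLE ... PARTITION BY)
    CREATE TABLESPACE UTSRANGE IN PRODDB01
      SEGSIZE 32
      NUMPARTS 12;

    -- Partition-by-growth universal tablespace: DB2 adds partitions as
    -- the data grows, up to the MAXPARTITIONS limit
    CREATE TABLESPACE UTSPBG IN PRODDB01
      SEGSIZE 32
      DSSIZE 4 G
      MAXPARTITIONS 64;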

More or fewer tables per tablespace? Partition or don't partition? Still good questions, these - and they are questions that don't have one-size-fits-all answers. Know your options, think about the needs of your organization (and your own priorities from a database administration perspective), and make the decisions that are right for your environment.