The Key to DB2 for z/OS DDF Application Performance
Happy New Year, folks.
In this first of my 2008 posts, I will write about the performance of applications that utilize the DB2 for z/OS Distributed Data Facility, aka DDF - the component of DB2 that accepts data requests issued by remote clients ("remote," in this context, refers to a different logical - usually separate physically, as well - server, as opposed to a considerable distance - the remote client system could be 1000 miles from the DB2 for z/OS server, or it could be 20 feet away). DDF also enables DB2 on a mainframe to act as a remote data requester, but my focus today is on DB2 for z/OS in the role of data provider.
In many ways, performance design and tuning for a DDF-using application is just as it is for a local (to the DB2 for z/OS subsystem) application (an SQL statement that runs for an overly long time when executed via a CICS transaction program will likely run for a long time if it gets to DB2 through the DDF). When it comes to queries with multi-row result sets, however, there is a very important difference. The performance of such queries in a DDF application environment is very much dependent on the use by DB2 of a capability called block fetch. Consider a query, issued from a remote client (or embedded in a local DB2 stored procedure called from a remote client), that generates a result set of 100 rows. If DB2 returns the result set to the remote client using block fetch, the rows will be sent in "chunks" through the network (generally speaking, the "chunking factor" depends on the number of rows that will fit into one 32-kilobyte block). If block fetch is NOT used, the result set rows will be sent to the client one at a time (one row in each network transmission). The difference in response time for a DDF-related query when block fetch is and isn't used can easily be an order of magnitude or more.
Here's the good news: it's pretty easy to get DB2 for z/OS to use block fetch. Basically, it's a matter of enabling DB2 to recognize that the query in question will be used solely for the purpose of retrieving data, as opposed to updating or deleting data (sometimes a query is used to fetch data values that are subsequently used to locate rows that will be updated or deleted by the application process). The most straightforward way to do this is to code the query in the form of a DECLARE CURSOR statement with FOR FETCH ONLY or FOR READ ONLY specified. If that option is not available to you for some reason, there are other ways to get DB2 to use block fetch. For more information about these other activators of block fetch, check the DB2 Administration Guide. In the DB2 for z/OS V8 Administration Guide, the section of interest is 5.13.2.1.3, "Block fetching result sets."
You may wonder why block fetch is such a big deal for DDF-using applications, when you're used to getting excellent response time for queries issued from local-to-DB2 programs that retrieve one result set row at a time from the subsystem. The reason is pretty simple: moving a result set row from the DB2 Database Services Address Space to an allied address space (e.g., a CICS region or a batch address space) on the same z/OS system is a virtually instantaneous event compared to the time required to move that same row from DB2 to a network-connected client system. [Note: this is not to say that block fetching is of no value outside the DDF environment. The multi-row fetch capability introduced with DB2 V8 can significantly reduce CPU time for queries issued from locally-connected applications.]
I'll close out this post with a bit of block-fetch-related information that I find to be interesting. Many organizations use DB2 for z/OS stored procedures to boost the scalability and security of their DDF-using applications (that's a good thing). In these stored procedure programs, you'll often see the clause OPTIMIZE FOR n ROWS (with some literal integer value substituted for "n") used as a DB2 optimizer influencer (absent this clause in a SELECT statement, DB2 assumes that the associated application process will retrieve all of the rows in a result set, and that is not always the case). Did you know that for DDF-using applications, OPTIMIZE FOR n ROWS controls result set row blocking, in addition to influencing the DB2 optimizer? If you didn't know that, and you have OPTIMIZE FOR 1 ROW specified for cursor-based SELECT statements in stored procedures that are called via DDF, don't freak out - DB2 will use a blocking factor of 16 for purposes of result set row transmittal to remote clients when a value of 1, 2, or 3 is specified for "n" in the OPTIMIZE FOR n ROWS clause (DB2 for z/OS V8 users who want to be assured of getting this DDF row blocking behavior may want to check out APAR PK57930).
Here's to a great 2008. I hope that you'll visit my blog regularly this year. I'll try to make it worth your while.
In this first of my 2008 posts, I will write about the performance of applications that utilize the DB2 for z/OS Distributed Data Facility, aka DDF - the component of DB2 that accepts data requests issued by remote clients ("remote," in this context, refers to a different logical - usually separate physically, as well - server, as opposed to a considerable distance - the remote client system could be 1000 miles from the DB2 for z/OS server, or it could be 20 feet away). DDF also enables DB2 on a mainframe to act as a remote data requester, but my focus today is on DB2 for z/OS in the role of data provider.
In many ways, performance design and tuning for a DDF-using application is just as it is for a local (to the DB2 for z/OS subsystem) application (an SQL statement that runs for an overly long time when executed via a CICS transaction program will likely run for a long time if it gets to DB2 through the DDF). When it comes to queries with multi-row result sets, however, there is a very important difference. The performance of such queries in a DDF application environment is very much dependent on the use by DB2 of a capability called block fetch. Consider a query, issued from a remote client (or embedded in a local DB2 stored procedure called from a remote client), that generates a result set of 100 rows. If DB2 returns the result set to the remote client using block fetch, the rows will be sent in "chunks" through the network (generally speaking, the "chunking factor" depends on the number of rows that will fit into one 32-kilobyte block). If block fetch is NOT used, the result set rows will be sent to the client one at a time (one row in each network transmission). The difference in response time for a DDF-related query when block fetch is and isn't used can easily be an order of magnitude or more.
Here's the good news: it's pretty easy to get DB2 for z/OS to use block fetch. Basically, it's a matter of enabling DB2 to recognize that the query in question will be used solely for the purpose of retrieving data, as opposed to updating or deleting data (sometimes a query is used to fetch data values that are subsequently used to locate rows that will be updated or deleted by the application process). The most straightforward way to do this is to code the query in the form of a DECLARE CURSOR statement with FOR FETCH ONLY or FOR READ ONLY specified. If that option is not available to you for some reason, there are other ways to get DB2 to use block fetch. For more information about these other activators of block fetch, check the DB2 Administration Guide. In the DB2 for z/OS V8 Administration Guide, the section of interest is 5.13.2.1.3, "Block fetching result sets."
You may wonder why block fetch is such a big deal for DDF-using applications, when you're used to getting excellent response time for queries issued from local-to-DB2 programs that retrieve one result set row at a time from the subsystem. The reason is pretty simple: moving a result set row from the DB2 Database Services Address Space to an allied address space (e.g., a CICS region or a batch address space) on the same z/OS system is a virtually instantaneous event compared to the time required to move that same row from DB2 to a network-connected client system. [Note: this is not to say that block fetching is of no value outside the DDF environment. The multi-row fetch capability introduced with DB2 V8 can significantly reduce CPU time for queries issued from locally-connected applications.]
I'll close out this post with a bit of block-fetch-related information that I find to be interesting. Many organizations use DB2 for z/OS stored procedures to boost the scalability and security of their DDF-using applications (that's a good thing). In these stored procedure programs, you'll often see the clause OPTIMIZE FOR n ROWS (with some literal integer value substituted for "n") used as a DB2 optimizer influencer (absent this clause in a SELECT statement, DB2 assumes that the associated application process will retrieve all of the rows in a result set, and that is not always the case). Did you know that for DDF-using applications, OPTIMIZE FOR n ROWS controls result set row blocking, in addition to influencing the DB2 optimizer? If you didn't know that, and you have OPTIMIZE FOR 1 ROW specified for cursor-based SELECT statements in stored procedures that are called via DDF, don't freak out - DB2 will use a blocking factor of 16 for purposes of result set row transmittal to remote clients when a value of 1, 2, or 3 is specified for "n" in the OPTIMIZE FOR n ROWS clause (DB2 for z/OS V8 users who want to be assured of getting this DDF row blocking behavior may want to check out APAR PK57930).
Here's to a great 2008. I hope that you'll visit my blog regularly this year. I'll try to make it worth your while.