DB2 for z/OS Prefetch, Part 2 (Monitoring and Tuning)
A few days ago, I posted part 1 of a 3-part entry on prefetch in a DB2 for z/OS environment. That post covered the basics: the different types of prefetch and how they work. In this part 2 entry, I'll cover monitoring and tuning in relation to prefetch. I'll provide information in the form of questions and answers, as I've dealt with a number of such queries over the years. At the beginning of next week, I'll post part 3, which will describe some prefetch-related changes introduced with DB2 for z/OS Version 9. On, now, to the Q's and A's.
Q1: Where does the CPU time associated with prefetch activity get charged?
A1: Allied address spaces (those through which SQL statements get to DB2, examples being a CICS application-owning region, a batch address space, and - for statements sent from remote clients via the DRDA protocol - the DB2 Distributed Data Facility address space) get charged for the CPU time associated with synchronous reads (i.e., single-page, on-demand read I/Os), but for prefetch reads, which are asynchronous, the associated CPU time is charged to the DB2 Database Services Address space (also known as DBM1). In fact, in my experience I've seen that the bulk of DBM1 CPU time is related to prefetch reads and to database write I/Os. When DB2-using organizations take steps that lead to reduced prefetch I/O activity (such as making buffer pools larger - a prefetch request for 32 data or index pages does not lead to an I/O operation if all of the 32 pages are already in memory), they tend to see decreases in DBM1 CPU consumption. And note that we are talking about consumption of general-purpose CPU cycles here. It's true that mainframes have specialized processors that assist with I/O operations, but these reduce - as opposed to eliminate - the general-purpose cycles related to I/Os.
Q2: Where can one get information about prefetch activity?
A2: I tend to rely on two sources: a DB2 monitor product (these are available from IBM and from several 3rd-party software vendors) and the DB2 command -DISPLAY BUFFERPOOL DETAIL. With respect to the DB2 monitor product, I like to look over statistics and accounting detail reports (with "detail" referred to as "long" for some monitors) - the former provide a look at overall activity for a DB2 subsystem, and the latter let you restrict the view to activity related to (for example) a given connection type (e.g., CICS or DRDA), a given connection ID (e.g., a CICS region), or a given so-called correlation ID (this could be a batch job name or a CICS transaction ID). In either case, look at the buffer pool activity section of the report (if this activity is broken out by type, look at the read activity, and note that you'll see, certainly in a statistics report, information broken out by buffer pool). With regard to the CPU cost of prefetch and its impact on DBM1 address space CPU consumption (as mentioned in the answer to question 1 above), check the DB2 address space CPU times in the statistics report. A statistics or accounting report can be set up to capture activity occurring in a given time period. I usually want to see data for a one- to two-hour period of "busy time" on a system, as opposed to a 24-hour period.
While I like the reports generated by DB2 monitor products, you can also get useful real-time information using the online monitor interface. Just look for the information related to buffer pools for overall numbers, and for "thread detail" information if you want to look at I/O activity for a particular application process.
The -DISPLAY BUFFERPOOL DETAIL command is a handy way to quickly generate very useful buffer pool-related information, including prefetch numbers. You can issue the command so as to get information for a single buffer pool, a list of buffer pools, all active buffer pools, or all pools (active and inactive). Because I like to look at an hour's worth of data, I usually issue the command with the INTERVAL option, wait an hour, and issue it again (also with INTERVAL specified); otherwise, you'll get information for the period of time since the pool (or pools) was last activated, and that could be days or weeks ago.
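To make that concrete, here's what the command might look like in practice (BP2 is just a placeholder buffer pool name - substitute your own, or use ACTIVE to cover all active pools):

-DISPLAY BUFFERPOOL(BP2) DETAIL(INTERVAL)

-DISPLAY BUFFERPOOL(ACTIVE) DETAIL(INTERVAL)

Issue the command once to establish the start of your interval, wait your hour, then issue it again and work from the second set of output.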
Whether using monitor reports (or screens) or -DISPLAY BUFFERPOOL DETAIL command output, keep in mind the important distinction between prefetch requests and prefetch I/Os. As previously noted, a prefetch request results in an I/O operation only if one or more of the pages requested are not already in memory. Generally speaking, your focus should be on prefetch I/Os.
Q3: Should one pay attention to prefetch I/Os when it comes to buffer pool sizing?
A3: YES! Often, people engaged in buffer pool analysis zero in on synchronous read activity. Synchronous reads are indeed important, and it's good to reduce that activity if you want to improve the performance of DB2-accessing applications, but prefetch I/Os matter, too, for two important reasons: first (reiterating yet again), they consume CPU cycles. Second, although they are asynchronous and anticipatory (as pointed out in my part 1 prefetch entry) and ideally aimed at getting pages into memory before they are requested by an application process, programs can - and quite often do - get suspended during SQL execution while awaiting the end of an in-progress prefetch read I/O. Here's why: suppose that program ABC has issued a SELECT statement, and suppose that DB2 has to examine page P1 from table T1 in the course of executing the SELECT. If, at the time of DB2's request for page P1 on behalf of program ABC, P1 is scheduled to be brought into memory by way of a prefetch I/O (and that I/O might have been started on behalf of another program), program ABC will wait on that request to complete. This wait time shows up in a DB2 monitor accounting report (or an online monitor display of thread detail data) as "wait for other read I/O", one of the so-called class 3 suspension times (this because the data reported comes from DB2 accounting trace class 3 records).
So, when you're looking for a buffer pool that could potentially be enlarged to good effect, performance-wise, look at the TOTAL rate of read I/Os per second for the various pools (using DB2 monitor or -DISPLAY BUFFERPOOL DETAIL data, as described in the answer to question 2 above). That would be all synchronous read I/Os (both random and sequential) PLUS all prefetch I/Os (that's prefetch I/Os, not prefetch requests, for sequential prefetch and list prefetch and dynamic prefetch). If you make a buffer pool larger and you want to see whether you've done some good, look to see if the TOTAL rate of read I/Os has gone down. If you've caused prefetch read I/O activity to go down a good bit, you might well see a reduction in "wait for other read" time in your DB2 monitor accounting reports, and in DBM1 CPU time in your statistics reports.
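As a purely illustrative calculation (the figures below are made up): suppose that over a one-hour (3,600-second) interval a pool shows 360,000 random synchronous reads, 36,000 sequential synchronous reads, 90,000 sequential prefetch read I/Os, 18,000 list prefetch read I/Os, and 36,000 dynamic prefetch read I/Os. That's 540,000 total read I/Os, or 150 per second. Track that one number before and after a buffer pool size change, and you'll know whether the change delivered.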
Q4: Can prefetch activity ever have a negative impact on performance?
A4: Occasionally, yes. Usually, prefetch is very good for performance, substantially reducing run times for DB2-accessing programs. Sometimes, it can be a drag on performance. I can think of three such scenarios. Scenario 1: a program runs longer than it should, due to dynamic prefetch activity. I explained in my part 1 prefetch post that dynamic prefetch gets turned on as a result of something called sequential detection. That generally means that most pages accessed by an SQL-issuing program are sequential with respect to each successive page access (and that means that the nth page in a table or index accessed by a program is usually within half of a prefetch quantity - typically 16 pages, when the prefetch quantity is 32 pages - of the page previously accessed by the program). Suppose that the typical skip-ahead (or skip-behind, in the case of a reverse index scan in a DB2 V8 or V9 environment) quantity for a program's data or index access is 16 pages. That's just within the sequential detection threshold, so prefetch gets turned on; however, for each 32-page chunk of table or index data brought into memory, only two of the pages will be requested by the program. It might be better for the program to just request the needed pages individually. If you have this situation (look at a DB2 monitor accounting report for the program, and check to see if there is a low ratio of GETPAGEs to dynamic prefetch requests), you might consider increasing the program's commit frequency, as a commit will "wipe the slate clean" with respect to sequential detection page-access tracking if the program was bound with the RELEASE(COMMIT) option (versus the RELEASE(DEALLOCATE) option).
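To put a made-up number on that ratio check: if the accounting data for the program shows 500,000 dynamic prefetch requests against 1,000,000 GETPAGEs, the program is touching only about 2 pages per prefetch request - far fewer than the 32 pages a request can stage - a sign that dynamic prefetch is doing a lot of reading the program doesn't need.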
Scenario 2: a program runs longer than it should because of sequential prefetch. As noted in my part 1 prefetch post, sequential prefetch brings a pretty big chunk of pages into memory right off the bat when an SQL statement with a sequential prefetch-using access path begins execution. That's usually good, but not so much if you really only want, say, 10 rows out of a 10,000-row result set. In that case, you might try adding OPTIMIZE FOR 10 ROWS or FETCH FIRST 10 ROWS ONLY to your SELECT statement. That will tell the optimizer that you don't want to access the whole result set, and sequential prefetch will not be selected at bind time.
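For illustration (the table and column names here are invented), the statement might end up looking like this:

SELECT ORDER_ID, ORDER_DATE
FROM ORDER_HISTORY
ORDER BY ORDER_DATE DESC
FETCH FIRST 10 ROWS ONLY

With that clause in place (or with OPTIMIZE FOR 10 ROWS added instead), the optimizer knows the program wants only a small slice of the result set, and a sequential prefetch-driven access path becomes much less likely.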
Scenario 3: a program runs more slowly due to "abandoned list prefetch." Again, this doesn't happen too often, but there are times when DB2 will choose list prefetch (described in my part 1 prefetch post) for a statement at bind time, and then will abandon list prefetch in favor of a tablespace scan during statement execution. Check the "RID list" section of a DB2 monitor accounting or statistics report (list prefetch usually involves the sorting of row IDs - aka RIDs - obtained from one or more indexes), and look for the number of times that a RID sort failed (some reports will use "terminated" instead of "failed") due to lack of storage. If that number is non-zero, consider enlarging your DB2 RID sort pool (accomplished by changing the value of the MAXRBLK installation parameter in the DB2 ZPARM module). Another figure that you might see in the "RID list" section of a DB2 monitor report pertains to RID sort failures due to "exceeding the RDS limit" (RDS being the Relational Data System, a core component of DB2 for z/OS). Here's what that means: if DB2 is engaged in RID list processing, and it determines that more than 25% of a table's RIDs will qualify for a RID sort, it will terminate that RID list processing operation. Roger Hecq, a longtime DB2 expert who works for UBS, recently made a very good point about this type of failure in responding to a question posted to the DB2-L forum: "If [a] table has grown significantly since the last RUNSTATS and program rebind, then a REORG, RUNSTATS, and rebind will increase the 25% limit, which may help avoid the RID list failure [due to RDS limit exceeded] problem." Excellent point: make sure that DB2 knows how many rows are actually in a table by keeping DB2 catalog data up-to-date via RUNSTATS, keep objects well-organized through regular REORGs, and rebind programs after catalog stats have changed significantly for DB2 database objects on which the program depends.
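Here's a sketch of that maintenance sequence, with placeholder database, table space, and package names (run the REORG and RUNSTATS as batch utility jobs, and the REBIND as a DSN subcommand):

REORG TABLESPACE DB1.TS1
RUNSTATS TABLESPACE DB1.TS1 TABLE(ALL) INDEX(ALL)
REBIND PACKAGE(COLL1.PKG1)

The REORG restores clustering, the RUNSTATS refreshes the catalog statistics (including the row count on which that 25% threshold is based), and the REBIND gives the optimizer a chance to choose access paths using the updated statistics.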
OK, that's a wrap for prefetch-related monitoring and tuning. As I mentioned up front, come back at the beginning of next week for part 3 of my 3-part prefetch entry, which will cover DB2 for z/OS V9 changes that have to do with prefetch.