Robert's Blog

Monday, April 13, 2009

DB2 for z/OS Prefetch, Part 1 (The Basics)

Prefetch is another of those DB2 for z/OS topics that has been around for a long time but which has recently attracted attention anew. In this post, I'll cover some prefetch basics. I'll follow up in a few days with a post on prefetch monitoring and tuning, and a few days after that I'll post a third entry covering some prefetch-related changes introduced with DB2 for z/OS Version 9.

A lot of DB2 people understand the prefetch concept quite well. Prefetch requests have two particularly important characteristics. First, they typically involve the reading of multiple data or index pages into a buffer pool from disk with a single I/O operation. Second, they are anticipatory in nature, meaning that DB2 prefetches pages into memory on behalf of an application process (or a utility) before that process explicitly requests those pages, based on the assumption that the process will request the pages (or at least a substantial percentage of them). In this sense, prefetch reads are asynchronous with respect to an application process, and they ideally will not cause the process to suspend SQL statement execution while waiting for requested pages to be brought into memory from the disk subsystem (though waits for prefetch reads are possible, as I'll explain in part 2 of my 3-part entry on prefetch - to be posted in a few days). Elimination (or at least significant reduction) of in-DB2 application wait time for read I/Os is what prefetch is all about, and when it works as desired (and it almost always does - I'll cover a few exceptions in my "prefetch, part 2" post) it can dramatically reduce the run time of a DB2-accessing program.

There are three types of DB2 for z/OS prefetch:
  • Sequential - Selected at SQL statement bind time (i.e., when the statement's data access path is optimized) if the DB2 optimizer expects that 1) at least a certain number of pages in a target table or index will be accessed in the course of executing the statement (I believe that the threshold is 8 pages), and 2) those pages will be accessed in a physically sequential manner. At statement execution time, as soon as the target table or index is accessed, two (if I recall correctly) prefetch quantities of sequential pages will be read into the appropriate buffer pool (the prefetch quantity is usually 32 pages for an object defined with a 4K page size). Subsequent to these initial two prefetch read operations, additional requests for a prefetch quantity of pages will be issued each time a "trigger" page is accessed by the executing SQL statement (a trigger page is one that is a multiple of the prefetch quantity relative to the first page accessed in the table or index; thus, DB2 tries to stay at least one prefetch quantity of pages "ahead" of the application process).
  • List - Row IDs (RIDs) for qualifying rows (per a query's predicates) are obtained from one or more indexes, and the corresponding data pages - which need not be sequential - are read into memory via multi-page I/O operations (a RID indicates the physical location of a row in a DB2 table). Note that in most cases, DB2 will sort the RIDs obtained from the index or indexes prior to initiating the multi-page read requests.
  • Dynamic - DB2 determines at query run time that the pattern of page access for a target table or a utilized index is "sequential enough" to warrant activation of prefetch processing. If the page access pattern subsequently strays from "sequential enough," prefetch processing will be turned off for the query (it will be turned on again if "sequential enough" access resumes).
Of these prefetch types, dynamic is the one that is most interesting. It's activated and deactivated according to a mechanism - known as sequential detection - that works as follows: DB2 tracks pages as they are accessed in the execution of a database-accessing program (the classic example is a singleton SELECT in a do-loop - at bind time DB2 doesn't know that the statement will be executed repeatedly, and that pages in the target table or index might be accessed in a sequential fashion). When the second page in the target table or index is accessed, DB2 checks to determine whether or not it's within half of the prefetch quantity (i.e., 16 pages, if the prefetch quantity is 32 pages) forward of the first page (or backward, if we're talking about the backward index scan capability introduced with DB2 for z/OS Version 8). If it is, that second page is noted as being sequential, access-wise, relative to the first. When the third page is accessed, DB2 checks to see that it's within half a prefetch quantity forward (or backward) of the second page. If it is, the third pages is noted as being sequential with respect to the second page. When 5 out of the last 8 pages accessed are sequential in this sense, DB2 turns on prefetch. It turns prefetch off if the number of sequential pages drops below 5 of the last 8.

OK, so much for the prefetch level-set. Come back in two or three days, and I'll have posted a "part 2" entry that will answer some questions pertaining to prefetch monitoring and tuning.


Post a Comment

Subscribe to Post Comments [Atom]

<< Home