On DB2 and Table-Pinning
I recently fielded, via e-mail, an entertaining question about "pinning" a DB2 table in memory (i.e., having the whole table resident in virtual storage on the DB2 server). I use the term "entertaining" not to poke fun at the question (or the questioner), but to convey that I enjoyed answering the question.
Among some people who are familiar with certain non-IBM DBMSs but are relatively new to DB2, there can be a bit of confusion about table-pinning in a DB2 environment. This may be due to the fact that table-pinning is covered in the documentation of some other DBMSs but is not mentioned in any of the DB2 manuals. Thus it is that one might conclude that DB2 does not support the pinning of tables in memory. In fact, DB2 has ALWAYS supported table-pinning - the difference is in nomenclature and in the process by which a DB2 table gets pinned.
In some non-DB2 DBMS environments, a table that is to be pinned in memory must be explicitly marked as such. With DB2, this is not the case. Why? Because some DBMSs do not allow a database administrator or systems programmer to control the size of a buffer pool, nor do they provide support for multiple buffer pools. Instead, there is a single buffer pool (aka a page cache) and the DBMS dynamically determines the amount of memory to be allocated to the pool (based on the available memory on the server and the nature of the database-accessing workload). Given that situation, a table that is not marked as to-be-pinned is unlikely to be entirely resident in memory unless it is quite small and very frequently accessed.
Now, don't get me wrong - DBMS management of buffer pool sizing can be quite useful in some cases, and indeed DB2 Version 9 provides a system-managed buffer pool sizing option. Pinning a table in memory, however, is about the database adminsitrator exerting control, and DB2 DBAs have been able to do this since Version 1 Release 1 back in the early 1980s. How do I pin a DB2 table in memory? Easy - I just assign it to a buffer pool that has enough buffers to hold all the table's pages. Here's a very simple example: I have a DB2 table, called XYZ, that occupies 1000 4K pages (the page size doesn't matter - could be 4K pages, 8K, 16K, whatever). I want to pin this table in memory. I set up buffer pool BP8 (the actual buffer pool name doesn't matter) with 1000 4K buffers. I assign table XYZ (actually, the tablespace in which XYZ is located) to BP8 and, voila, it's pinned in memory. What if I also want to pin table ABC, with 500 4K pages, in memory? No problem. I can set up BP9 with 500 buffers and assign table ABC's tablespace to that buffer pool, or I can add 500 buffers to BP8 and have both XYZ and ABC pinned in that pool.
Of course, you could go overboard with table-pinning. I, for one, would not want to pin a 1-terabyte table in memory. On the other hand, I wouldn't confine my pinning plans to itty-bitty tables. In today's 64-bit addressing world, you really can think about pinning a pretty big DB2 table - say, one that has a hundred thousand or more pages. I might actually do that if said table were really critical to the performance of an important application.
How big is too big, when you're talking about table-pinning? The answer to that question doesn't involve rocket science. First of all, I would start out NOT pinning any tables in memory, and going the pinning route if I need to boost the performance of a certain program or programs by eliminating read I/O wait time to the fullest extent possible. If I do decide to pin, the amount of table-pinning that I do (in terms of the number of pages belonging to pinned tables) will depend in large part on the amount of memory that is available on the server. If my server has 50 gigabytes of memory, should I think about pinning 50 gigabytes' worth of tables in memory? Of course not - you don't ever try to use ALL of a server's memory for DB2. You also, however, don't want to under-utilize the server's memory resource (your company paid for the memory, so you want to get maximum bang for that buck). When I'm trying to determine whether or not I'm using too much of a server's memory for DB2 buffers, I take a look at the rate of demand paging to auxiliary storage (that's a mainframe term, but you Linux/UNIX/Windows people will know what I mean). In other words, I want to see how often a referenced page (and I'm talking about ALL pages in memory, not just those holding DB2 data) has to be read in from auxiliary storage on disk (also known as page data sets in mainframe-ese) because it got paged out of virtual storage by the operating system (a system monitor product can provide this information). If the rate of demand paging to auxiliary storage is less than one per second, my assessment would be that you're under-utilizing server memory, and a good way to rectify that situation would be to make more of that memory available to DB2 in the form of additional buffers, and one possible use of the extra buffer pool space would be more table-pinning. If the rate of demand paging to disk is in the single digits of pages per socond, I'd still be fine with giving more memory to DB2. If the demand paging rate is in the double digits per second, I might want to pass on bulking up the DB2 buffer pools and doing more table-pinning.
Can you pin DB2 indexes in memory, as well as (or instead of) tables? Of course you can. As with tables, it's a matter of assigning an index to a buffer pool that has enough buffers to hold all of the index's pages.
Something I mentioned earlier bears repeating: it's a good idea to start out NOT pinning any DB2 objects in memory. If you do that (i.e., you don't pin anything), and application performance is good, there's no need to pin. Even if you need to cut down on disk read I/Os in order to improve application response time and/or throughput, try doing that by enlarging the buffer pool before you try pinning anything. If a larger buffer pool doesn't do the trick, you can think about pinning certain DB2 objects in memory.
So, don't be put off by differences in nomenclature and database administration in a DB2 environment versus other DBMSs. I put stuff in the trunk of my car, while my friends in the UK speak of stowing items in the boot, but we mean the same thing. One has always been able to pin DB2 objects in memory, even though the DB2 doc doesn't use that particular term. Go ahead and pin, my friends - but pin wisely.
Among some people who are familiar with certain non-IBM DBMSs but are relatively new to DB2, there can be a bit of confusion about table-pinning in a DB2 environment. This may be due to the fact that table-pinning is covered in the documentation of some other DBMSs but is not mentioned in any of the DB2 manuals. Thus it is that one might conclude that DB2 does not support the pinning of tables in memory. In fact, DB2 has ALWAYS supported table-pinning - the difference is in nomenclature and in the process by which a DB2 table gets pinned.
In some non-DB2 DBMS environments, a table that is to be pinned in memory must be explicitly marked as such. With DB2, this is not the case. Why? Because some DBMSs do not allow a database administrator or systems programmer to control the size of a buffer pool, nor do they provide support for multiple buffer pools. Instead, there is a single buffer pool (aka a page cache) and the DBMS dynamically determines the amount of memory to be allocated to the pool (based on the available memory on the server and the nature of the database-accessing workload). Given that situation, a table that is not marked as to-be-pinned is unlikely to be entirely resident in memory unless it is quite small and very frequently accessed.
Now, don't get me wrong - DBMS management of buffer pool sizing can be quite useful in some cases, and indeed DB2 Version 9 provides a system-managed buffer pool sizing option. Pinning a table in memory, however, is about the database adminsitrator exerting control, and DB2 DBAs have been able to do this since Version 1 Release 1 back in the early 1980s. How do I pin a DB2 table in memory? Easy - I just assign it to a buffer pool that has enough buffers to hold all the table's pages. Here's a very simple example: I have a DB2 table, called XYZ, that occupies 1000 4K pages (the page size doesn't matter - could be 4K pages, 8K, 16K, whatever). I want to pin this table in memory. I set up buffer pool BP8 (the actual buffer pool name doesn't matter) with 1000 4K buffers. I assign table XYZ (actually, the tablespace in which XYZ is located) to BP8 and, voila, it's pinned in memory. What if I also want to pin table ABC, with 500 4K pages, in memory? No problem. I can set up BP9 with 500 buffers and assign table ABC's tablespace to that buffer pool, or I can add 500 buffers to BP8 and have both XYZ and ABC pinned in that pool.
Of course, you could go overboard with table-pinning. I, for one, would not want to pin a 1-terabyte table in memory. On the other hand, I wouldn't confine my pinning plans to itty-bitty tables. In today's 64-bit addressing world, you really can think about pinning a pretty big DB2 table - say, one that has a hundred thousand or more pages. I might actually do that if said table were really critical to the performance of an important application.
How big is too big, when you're talking about table-pinning? The answer to that question doesn't involve rocket science. First of all, I would start out NOT pinning any tables in memory, and going the pinning route if I need to boost the performance of a certain program or programs by eliminating read I/O wait time to the fullest extent possible. If I do decide to pin, the amount of table-pinning that I do (in terms of the number of pages belonging to pinned tables) will depend in large part on the amount of memory that is available on the server. If my server has 50 gigabytes of memory, should I think about pinning 50 gigabytes' worth of tables in memory? Of course not - you don't ever try to use ALL of a server's memory for DB2. You also, however, don't want to under-utilize the server's memory resource (your company paid for the memory, so you want to get maximum bang for that buck). When I'm trying to determine whether or not I'm using too much of a server's memory for DB2 buffers, I take a look at the rate of demand paging to auxiliary storage (that's a mainframe term, but you Linux/UNIX/Windows people will know what I mean). In other words, I want to see how often a referenced page (and I'm talking about ALL pages in memory, not just those holding DB2 data) has to be read in from auxiliary storage on disk (also known as page data sets in mainframe-ese) because it got paged out of virtual storage by the operating system (a system monitor product can provide this information). If the rate of demand paging to auxiliary storage is less than one per second, my assessment would be that you're under-utilizing server memory, and a good way to rectify that situation would be to make more of that memory available to DB2 in the form of additional buffers, and one possible use of the extra buffer pool space would be more table-pinning. If the rate of demand paging to disk is in the single digits of pages per socond, I'd still be fine with giving more memory to DB2. If the demand paging rate is in the double digits per second, I might want to pass on bulking up the DB2 buffer pools and doing more table-pinning.
Can you pin DB2 indexes in memory, as well as (or instead of) tables? Of course you can. As with tables, it's a matter of assigning an index to a buffer pool that has enough buffers to hold all of the index's pages.
Something I mentioned earlier bears repeating: it's a good idea to start out NOT pinning any DB2 objects in memory. If you do that (i.e., you don't pin anything), and application performance is good, there's no need to pin. Even if you need to cut down on disk read I/Os in order to improve application response time and/or throughput, try doing that by enlarging the buffer pool before you try pinning anything. If a larger buffer pool doesn't do the trick, you can think about pinning certain DB2 objects in memory.
So, don't be put off by differences in nomenclature and database administration in a DB2 environment versus other DBMSs. I put stuff in the trunk of my car, while my friends in the UK speak of stowing items in the boot, but we mean the same thing. One has always been able to pin DB2 objects in memory, even though the DB2 doc doesn't use that particular term. Go ahead and pin, my friends - but pin wisely.
1 Comments:
Very insightful, thank you. I get frustrated by people going crazy with premature optimizations; trying to over-complicate things. (I suspect their real motive is that they want to add some new technologies to their resumes!)
Post a Comment
Subscribe to Post Comments [Atom]
<< Home