A DB2 Thread of a Different Type
I was in Sydney, Australia last week, representing the International DB2 Users Group Board of Directors at the 2008 IDUG Australasia Conference (a really outstanding event this year). One of the sessions I attended was a half-day seminar on the internals of DB2 V9.5 for Linux/UNIX/Windows (LUW). The instructor was Chris Eaton, a Senior Product Manager at IBM's Toronto Lab (home of DB2 for LUW) and a high-profile member of the worldwide DB2 community. I was particularly keen on learning more about the new threaded architecture of DB2 9.5 on the Linux and UNIX platforms (DB2 has had a threaded architecture on the Windows platform since day one), and I was pleased to see that this topic was at the top of Chris's agenda.
First off, I'll tell any DB2 for Linux and UNIX people that the move from the previous process model to the threaded model is a big plus, for several reasons. Chief among these - and this stems from the fact that all of the DB2 functional components now share the same address space - is that new memory allocations within the DB2 address space are immediately visible to all DB2 threads. What's the big deal with that? The big deal is that this makes it possible to greatly simplify the DB2 memory management model. A key benefit is a major advance in autonomic memory management by DB2 on the LUW server platforms. In other words, it's now possible for DB2 to manage its own memory resources on a server. DB2 in fact does this very well, and the result, for users who choose the automatic memory management option for a DB2 instance, is a freeing up of DBA time for more value-added work such as database design and application data access architecture.
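To make the shared-address-space point a bit more concrete, here is a small Python sketch. It is only an analogy and has nothing to do with DB2's actual code: the names `shared`, `thread_sees_allocation` and `process_allocation_visible_to_parent` are made up for illustration, and the use of `os.fork` assumes a Linux or UNIX system. It shows that an allocation made by a thread is immediately visible to the rest of the process, while an allocation made by a separate child process is not visible to its parent unless the two arrange some explicit form of shared memory.

```python
import os
import threading

# Hypothetical stand-in for memory owned by one address space.
shared = {}


def _allocate(key):
    """Simulate a new memory allocation by adding a buffer to `shared`."""
    shared[key] = bytearray(1024)


def thread_sees_allocation():
    """Allocate in a worker thread, then check visibility from the caller.

    Threads live in the same address space, so the allocation shows up
    right away with no extra plumbing.
    """
    shared.clear()
    worker = threading.Thread(target=_allocate, args=("from_thread",))
    worker.start()
    worker.join()
    return "from_thread" in shared


def process_allocation_visible_to_parent():
    """Allocate in a forked child process, then check from the parent.

    The child works on its own copy of the address space, so the parent
    never sees the new buffer (assumes a Linux/UNIX system with os.fork).
    """
    shared.clear()
    pid = os.fork()
    if pid == 0:
        # Child process: allocate, then exit without returning to the caller.
        _allocate("from_child")
        os._exit(0)
    os.waitpid(pid, 0)
    return "from_child" in shared


if __name__ == "__main__":
    print("thread allocation visible: ", thread_sees_allocation())
    print("process allocation visible:", process_allocation_visible_to_parent())
```

In the process case, the two sides would have to set up shared memory explicitly to see each other's allocations. That is the kind of bookkeeping that goes away when everything runs as threads in one address space.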
This all sounded great to me, but I was a little confused by the use of the term "thread." As a DB2 professional I "grew up" in the world of DB2 for z/OS, and in that world "thread" has always referred to a connection between an application process in a z/OS address space (which could be a CICS or an IMS region, or a TSO address space, or a batch initiator address space, or DB2's own Distributed Data Facility address space) and the DB2 database services address space (the one in which SQL statements are executed). This connection is, under the covers, essentially a chain of control blocks.
On the DB2 for LUW side, "thread" has a different meaning that is akin to a task control block (TCB) in a z/OS system: it's basically a dispatchable piece of work. In fact, the new option on the DB2 for LUW system command db2pd that returns information about threads (including their proper names) is -edus. This is short for "engine dispatchable units." Enter this command and you'll see the threads used by page prefetchers, page cleaners, the deadlock detector, the log writer and log reader, the agent pool, and so on.
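If you'd rather capture the EDU listing from a script than type the command by hand, a minimal Python wrapper might look like the sketch below. The only DB2-specific piece is the `db2pd -edus` invocation itself; the function names `edu_listing_command` and `list_edus` are hypothetical, and the sketch assumes that `db2pd` is on the PATH of a user who is allowed to run it. It returns the raw text and makes no attempt to parse the output.

```python
import shutil
import subprocess


def edu_listing_command():
    """Return the command line that asks db2pd for engine dispatchable units."""
    return ["db2pd", "-edus"]


def list_edus():
    """Run `db2pd -edus` and return its raw text output.

    Returns None when db2pd is not on the PATH (for example, on a machine
    without DB2 installed), so the caller can tell the two cases apart.
    """
    if shutil.which("db2pd") is None:
        return None
    result = subprocess.run(
        edu_listing_command(),
        capture_output=True,
        text=True,
        check=False,  # leave it to the caller to inspect any error output
    )
    return result.stdout


if __name__ == "__main__":
    listing = list_edus()
    print(listing if listing is not None else "db2pd not found on PATH")
```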
You may wonder why the DB2 for LUW development folks decided to use the word "threaded" to describe this new architecture on the Linux and UNIX platforms, when "thread" had long been used with a different meaning on the z/OS platform. The answer is pretty simple: the term "thread," used to designate a dispatchable piece of work, had long been in use on the Linux/UNIX/Windows platforms, as had the term "process" that described the DB2 subcomponents in the old architecture on Linux and UNIX servers (the process model was used initially on those platforms because, until quite recently, it made more sense there than the threaded model).
So, don't get hung up on the system-level terminology differences. Know that internally, DB2 for Linux and UNIX (as previously mentioned, DB2 has always used a threaded model on the Windows platform) now looks quite a bit more like DB2 on the mainframe platform. When you are a cross-platform person (and more and more DB2 people are), that's a good thing.
    

2 Comments:
Nice topic. The terminology is always confusing when you use both mainframe and unix/windows computers. The best analogy, in my view, is a unix-thread vs. a CICS-task.
A unix-process is multi-threaded and a CICS instance is multi-tasking.
The process-model of a mainframe is so much more mature... Back in the 80's IBM knew and built OS/2 and another company did not understand and built Windows. The rest is history/legacy.
Thanks for the comment. I do have fond memories of OS/2. I ran it on my work PC for much of the 1990s.