Robert's Blog


Tuesday, May 20, 2008

Talking DB2 and SOA in Big D

Howdy, folks. I'm posting today from Dallas, Texas, site of the 2008 IDUG North American Conference (IDUG is the International DB2 Users Group). I've had the opportunity here to participate in some extensive discussions about Service-Oriented Architecture (SOA), one of my favorite subjects (on Sunday I taught a day-long seminar titled "SOA in the Real DB2 World," and yesterday I moderated a Special Interest Group session on "Mainframes, COBOL, and SOA"). Following are capsules of some of the more interesting (to me) exchanges that proceeded from the aforementioned discussions. I'll probably elaborate on several of these in future posts.
  • SOA is way more than Web services. The topic of Web services often dominates conversations about SOA, to the extent that some people might have the impression that use of the former suffices as an implementation of the latter. Such, of course, is not the case. Web services - essentially, application services that are described by a standardized form of XML called Web Services Description Language, aka WSDL - are indeed important, because they can be understood by service-consuming programs and can be invoked in ways that are not dependent upon technical details of service-providing program implementation (i.e., a person coding a service-consuming program can use a Web services call to utilize the functionality of a service-providing program, and he doesn't have to know about things like the hardware platform and operating system on which and under which the service-providing program runs). Abstraction of the "plumbing" of an application system, as provided via Web services, is a key characteristic of an SOA (good for programmer productivity and for application change isolation), but so is service reuse, and the exposure of application functional components as Web services does not necessarily lead to a situation in which those components will be reused to speed up application functionality extension efforts. If functional components are not being reused - if wheels are being reinvented to extend application functionality - then you don't have an SOA, regardless of whether or not you're utilizing Web services.
  • An Enterprise Service Bus (ESB) is not just about inter-application information exchanges. Quite often, an ESB is a foundational part of an organization's SOA, and an ESB is frequently justified as a way to facilitate communication between different application systems (often running on different server platforms and implemented using different programming languages and database management systems) for purposes of information exchange (in other words, an ESB makes it easier for data and functionality to be shared across an organization's application systems). There's nothing wrong with this justification (it's a good one), but it suggests that an ESB is just about inter-system data sharing. In fact, another key benefit of an ESB has to do with service reuse (which, as mentioned in the preceding bullet item, is not at all guaranteed through the use of Web services). You see, if application service components are going to be reused, people have to be aware of them, and it really helps people with respect to their awareness of existing application services if those services are registered somewhere. How can you get developers of service components to register same? One way is to make an Enterprise Service Bus THE interface to an organization's application services, and to restrict the services available via the ESB to those cataloged in a registry associated with the ESB (as one discussion participant put it, "if an application service isn't in our ESB registry, it doesn't exist"). An ESB can therefore be of significant value as a means of establishing discipline in an application development environment (e.g., services WILL be accessed via this interface, and they WILL be registered in this repository), and that kind of discipline is important because SOA success often depends as much on process as it does on technology.
  • Code reuse is good, but so is code elimination. If you're going to write code, it's good to take steps that will maximize opportunities to reuse that code, and SOA is certainly about that. Here's another potential benefit of an SOA: code elimination. I'm thinking particularly about rules engines, which are often implemented as part of an Enterprise Service Bus (ESB) solution. The concept here is pretty simple: the ESB is the entry point through which requesters gain access to an organization's portfolio of application services. But how will requests be properly routed to application systems accessible via the ESB? In the case of a relatively complex transaction that will require access to the services of multiple different application systems, and in a certain sequence, to boot, how will the workflow be properly orchestrated? Enter the rules engine, and associated ESB functionality that could be formally referred to as a process engine but which users might call "the traffic cop" or "the control tower." Organizations can use this technology to insert into the process flow rules that specify the way in which various transactions will be handled upon interfacing with the ESB. The ESB can act upon these rules to make sure that work gets done the right way and on the right application systems, and the beauty of it shows when a rule has to be changed (perhaps because of an application system modification or because an extra processing step has been added for a complex transaction): the rules engine is updated accordingly, and that's that. No user code has to be written or altered in order to effect the processing change. You simply inform an intelligent piece of software that the workflow orchestration rules have been updated, and the process engine does the rest. 
This non-reliance on user-written code to alter transaction processing flows has both an agility advantage (it can be done more quickly versus the user-coded way) and a quality-of-service advantage (user-written code tends to increase the odds of a programming error getting introduced into the production environment). The point is not, of course, to take work away from programmers - it's to get talented programmers working where they deliver the greatest value to the organization: in the development of code modules that deliver functionality to serve the needs of customers and clients. Let process and rules engines take on the work of connecting service requests with service-providing components as needed.
  • Absolutely, COBOL is a good language to use in an SOA. Two different people, in two different discussions, expressed concerns related to the use of COBOL in an SOA. One mentioned that because he almost always heard of object-oriented (OO) programming languages (e.g., Java, C#) in talk about SOA, he'd pretty much concluded that SOA was incompatible with procedural languages such as COBOL. This is, of course, not at all the case. COBOL - a very CPU-efficient language that is familiar to a large number of application developers - can be used very effectively in an SOA. I'm particularly keen about the use of COBOL for programs that are part of an application's data-access layer (logically distinct and loosely coupled presentation, business, and data layers are characteristic of an SOA). On mainframes running DB2, you might be talking about CICS or IMS transactions - or DB2 stored procedures - written in COBOL and containing embedded SQL statements (note that CICS and IMS transactions and DB2 stored procedures can easily be exposed as Web services - the latter via IBM's Data Studio product). And hey, keep in mind that today's COBOL is not your dad's (or your mom's) COBOL. If you haven't checked out the enhanced XML and Unicode support (and other SOA-friendly goodies) in IBM's Enterprise COBOL for z/OS V4.1, you might want to do so. The other person with the COBOL concern was worried about young people coming out of college without COBOL skills. Would his organization be able to bring in new COBOL programmers as people retired or otherwise left the business, and if not, should they start cutting back on their use of COBOL? I liked the answer provided by one of the other people in the room: "We hire college grads and train them in-house to become proficient with COBOL. In three months they're ready to go." Message: relax.
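To make the registry and rules-engine ideas from the items above a bit more concrete, here's a minimal Python sketch. It is not how any particular ESB product works. Every name in it (REGISTRY, RULES, handle, and the three toy services) is made up for illustration. The point is only that the workflow lives in data (the rules) rather than in user-written routing code, and that a service not in the registry can't be reached.

```python
from typing import Callable, Dict, List

# Hypothetical service registry: the only services the "bus" will route to.
REGISTRY: Dict[str, Callable[[dict], dict]] = {}


def register(name: str):
    """Decorator that catalogs a service under a name in the registry."""
    def wrap(func):
        REGISTRY[name] = func
        return func
    return wrap


# Three toy services (illustrative only).
@register("credit_check")
def credit_check(order: dict) -> dict:
    return {**order, "credit_ok": order["amount"] <= 1000}


@register("reserve_stock")
def reserve_stock(order: dict) -> dict:
    return {**order, "reserved": True}


@register("fraud_screen")
def fraud_screen(order: dict) -> dict:
    return {**order, "screened": True}


# Orchestration rules held as data: transaction type -> ordered service names.
RULES: Dict[str, List[str]] = {
    "new_order": ["credit_check", "reserve_stock"],
}


def handle(transaction_type: str, payload: dict) -> dict:
    """Route a request through the services named by the current rules."""
    for service_name in RULES[transaction_type]:
        if service_name not in REGISTRY:
            # "If it isn't in the registry, it doesn't exist."
            raise LookupError(f"service not registered: {service_name}")
        payload = REGISTRY[service_name](payload)
    return payload


before = handle("new_order", {"amount": 250})

# An extra processing step is added by updating the rule, not the services.
RULES["new_order"] = ["fraud_screen", "credit_check", "reserve_stock"]
after = handle("new_order", {"amount": 250})
```

Note that the second call to handle picks up the new step without any change to the service functions or to handle itself. That's the "code elimination" benefit in miniature.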
That's it for now. As previously mentioned, I'll probably expand on at least some of these items in future posts. Thanks for visiting my blog. Y'all come back.

Tuesday, May 13, 2008

DB2 Code Pages: Don't Get Lost in Translation

For a long time, many of us in the DB2 community didn't pay a whole lot of attention to code pages (referring to the standard representations of numbers and letters and other characters within computer systems). There didn't seem to be much of a reason to think about this particular topic. Data went into the database and came out of the database just fine, so what was the big deal about coded character set translation? Oh, sure, everyone knew that an ASCII-based system could exchange data with an EBCDIC-encoding server, and translation had to happen in such cases, but it worked all right. In any case, there were more pressing concerns like database design, performance tuning, and availability planning.

Nowadays, coded character set translation matters A LOT to many DB2-using organizations. The one-word reason: internationalization. Whereas many companies once did business solely within a single country, to an increasing degree those same companies now serve clients located all over the world. This trend (which is accelerating) has caused coded character translation to become a matter of significant importance. Getting it wrong might - if you're lucky - simply annoy a customer who sees that his name is displayed incorrectly on an invoice. If you're less fortunate, mistranslation could result directly in lost business if an incoming order is bounced from your system because some required match on a text field didn't happen as it was supposed to.

Internationalization is a far bigger concern than ASCII-EBCDIC translation ever was (EBCDIC being the character encoding scheme historically used on mainframe servers). Why? Because in days of purely domestic markets, ASCII-EBCDIC translation occurred within a given language, such as from French EBCDIC (code page 297) to French ASCII (code page 819 for Linux), or from Czech EBCDIC (code page 870) to Czech ASCII (code page 912). This kind of translation worked fine because with 8 bits available for character representation (EBCDIC was always an 8-bit encoding scheme, and 8-bit ASCII code pages are widely used, though the original ASCII specification was for a 7-bit encoding scheme), all of the "special" characters particular to a given language (most languages, that is - see below) could be represented (and I mean "special" from the U.S. English perspective, referring to such characters as ô, an "o" topped with a circumflex).
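Here's a small Python sketch of that kind of same-language EBCDIC-to-ASCII translation. One loud assumption: as far as I know, Python's standard codecs don't include code page 297, so cp500 (an "international" EBCDIC code page that Python does ship) stands in for the French EBCDIC page, and latin_1 plays the part of code page 819. The sample text is made up.

```python
# Assumption: cp500 stands in for a national EBCDIC code page, and
# latin_1 is ISO-8859-1 (code page 819).
text = "H\u00f4tel"  # "Hôtel", with the circumflexed o

ebcdic_bytes = text.encode("cp500")        # bytes as an EBCDIC system holds them
ascii_side_bytes = text.encode("latin_1")  # bytes as an ISO-8859-1 system holds them

# Translation: decode with the source code page, encode with the target one.
translated = ebcdic_bytes.decode("cp500").encode("latin_1")

print(ebcdic_bytes.hex(), "->", translated.hex())
```

Every character takes exactly one byte on both sides. The byte values differ, but because both code pages contain ô, nothing is lost in the round trip.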

[8 bits are not enough to represent all of the characters used in the written forms of languages such as Chinese and Japanese; consequently, double-byte ASCII and EBCDIC code sets were developed to address this limitation.]

It is in going from language A to language B that one is more likely to encounter coded character translation problems when ASCII and/or EBCDIC encoding is in use. For example, suppose you're using the ISO-8859-1 (aka "Latin 1") ASCII code set (which corresponds to the 819 ASCII code page), and you're exchanging data with an organization that uses the ISO-8859-2 code set (code page 912, Central and Eastern Europe). Code point 241 in your code page represents ñ (an "n" with a tilde on top), but code point 241 in the code page used by the other organization represents ń (an accented "n"). Oh, and there is no representation of ń in your code page, nor is there a representation of ñ in the code page used by the other organization. Things get more fun if you go between languages with significantly dissimilar alphabets, such as between Greek (ISO-8859-7 ASCII code set) and Turkish (ISO-8859-9).
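The code point 241 collision is easy to see for yourself. A quick Python sketch follows (the helper name can_encode is mine, purely for illustration). It decodes the same single byte under the two code sets, then checks whether each character can be represented in the other code set.

```python
raw = bytes([241])  # one byte: code point 241 (hex F1)

as_latin1 = raw.decode("iso8859_1")  # ISO-8859-1, code page 819
as_latin2 = raw.decode("iso8859_2")  # ISO-8859-2, code page 912

print(as_latin1, as_latin2)


def can_encode(ch: str, codec: str) -> bool:
    """Return True if the character has a code point in the given code set."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(as_latin2, "iso8859_1"))  # is there an accented n in Latin 1?
print(can_encode(as_latin1, "iso8859_2"))  # is there a tilde-n in Latin 2?
```

Same byte, two different characters, and neither character has a home in the other organization's code page.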

All of this would present a really intimidating problem, were it not for the Unicode character encoding scheme. With Unicode, up to four bytes (32 bits) are available for encoding a character, and the standard defines a code space of more than a million code points (1,114,112 of them) to which characters can be mapped. The great thing about Unicode is that there isn't a Greek Unicode and a Spanish Unicode and a Japanese Unicode - there's just Unicode. Similarly, there isn't one Unicode for mainframe systems and another for Linux, UNIX, and Windows servers - there's just Unicode. If data is sent from one Unicode-using system to another Unicode-using system, no translation takes place (if the data-requesting system uses an ASCII or EBCDIC code page, data coming from a Unicode-based server will be translated as needed). Remember the ñ I mentioned previously (the "tilde-n")? In Unicode it's represented by 00F1. And the ń (the "accented n")? That's 0144 as represented in Unicode. When there's room for everything, everything can have its own code point (in contrast to the situation in which, for example, different characters map to ASCII code point 241, depending on which ASCII code page is in use).
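If you'd like to check those two code points yourself, here's a short Python sketch (the code_point helper is just a convenience I made up). It prints each character's Unicode code point along with its UTF-8 and UTF-16 byte representations.

```python
def code_point(ch: str) -> str:
    """Return a character's Unicode code point as four (or more) hex digits."""
    return f"{ord(ch):04X}"


for ch in ("\u00f1", "\u0144"):  # tilde-n and accented n
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")
    print(ch, code_point(ch), utf8.hex(), utf16.hex())
```

Each character has exactly one code point, no matter which system or language is involved. How many bytes it takes to store that code point depends on the Unicode encoding form in use (UTF-8 or UTF-16, for example).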

Both DB2 for z/OS and DB2 for Linux, UNIX, and Windows (LUW) can store data in Unicode format (in fact, this is the default character encoding scheme for DB2 for LUW), and this capability is more and more important as more and more organizations go global in their business dealings. A couple of things to keep in mind:
  • If you are accustomed to working with ASCII code pages, you may have used the DB2 built-in scalar function CHR to get from DB2 the character that maps to a particular ASCII code point (e.g., CHR(241) to get ñ IF you are using the ISO-8859-1 ASCII coded character set). If you store data in a DB2 database in Unicode format, why use CHR to get a particular character from DB2? Ask DB2, instead, to give you the character that maps to a specific Unicode code point (e.g., SELECT U&'\00F1' FROM table-name to get ñ).
  • Changing the Coded Character Set Identifier (CCSID) for an existing DB2 system (e.g., from ASCII or EBCDIC to Unicode) is a non-trivial matter. If you are considering making such a change in your DB2 environment, consult the DB2 for z/OS Installation Guide or the DB2 for LUW Internationalization Guide.
So go Unicode, and don't sweat the ñ (or the ń or the ä or the Ř or the ů or the Σ or the...).

Wednesday, May 7, 2008

Your Other Job

It's been a while since my last post on a non-technical subject. I'm inclined to write such an entry today, and it's my blog so I'm going to. [It's fun now and again to do something because you can - that is, because you're in a position to do so. I remember when U.S. President George H. W. Bush (father of the current President) invoked executive privilege when explaining his food likes and dislikes: "I'm the President and I don't have to eat broccoli."]

Earlier today I was thinking of a conversation I had about 10 years ago with one of IBM's top database technologists. This guy was (and is) a Big Cheese in the DB2 development organization - a bona fide database rock star. I was part of the DB2 National Technical Support team at the IBM Dallas Systems Center, and Joe DB2 and I were hanging out in a hotel lounge during a conference. I was saying that I never had been comfortable doing sales-type stuff, and that I was happy to be in a technical role. The Big DB2 Cheese gave me a no-nonsense look and froze me with four little words: "We're all in sales."

I was immediately struck by the truth in that comment, and by the power of such a simple concept.

"We're all in sales." Here was a guy with absolutely bullet-proof tech cred, and he was (figuratively) standing with the sales guys! He did that because he understood the importance of the sales mind-set, and he was kind enough to pass that understanding on to me. It really doesn't matter what you do for a living, or the organization for which you do what you do. You're in sales. If you think otherwise, you're fooling yourself. I remember a big-time professional basketball player saying a few years ago that he wasn't a role model (translation: what I do on and off the court is my business, not anyone else's). Of course he was wrong. Being a star in the National Basketball Association made him a role model to young fans. The relevant question was, would he be a good role model or a not-so-good role model?

It's the same way with this sales business. The question is, do you have a positive impact on your organization's sales, or are you a drag on sales (and EVERY organization - public-sector or private, for-profit or not-for-profit - has customers; hence, EVERY organization is engaged in sales)? In recent years, I've taken the "we're all in sales" axiom and kicked it up a notch. As I see it, not only are you in sales, but most EVERYTHING YOU DO in your work life has a sales impact, and WHAT you do and HOW you do it will determine whether the effect on your organization's sales will be positive or negative.

Over a number of years and in a wide variety of places, I've heard groups of DBAs engage in developer-bashing. Similarly, I've heard plenty of developers speak of DBAs in very unflattering terms. This might seem on the surface to be good fun - a bit of water-cooler talk that promotes team bonding. In truth, however, for the employing organization it's a sales impediment - indirectly, perhaps, but an impediment just the same. See, that kind of mind-set (them=clueless, we=smart) leads a group to focus on taking care of itself. You work hard not to achieve success at the organizational level, but to avoid the appearance of having messed something up yourself (the other guys are the ones who mess things up). If a group that you look down upon is in some trouble, that's their problem. Fiefdoms emerge, people look for ways to gain leverage over other departments, and the human element of IT - the most important element of IT - loses a lot of its potential to deliver value to the larger organization (i.e., to contribute to SALES success).

Suppose you work in an IT group characterized by the kind of us-and-them sniping I've described. If you buy into the idea that "we're all in sales," you'll act to change the dynamics of the situation. That isn't easy, especially if you're trying to do it pretty much on your own. You can start by refusing to join in when people are deriding co-workers. If you're really feeling gutsy, go beyond a refusal to speak poorly of others and actually speak up for the people being ridiculed. At first, you could find yourself on the outs with respect to your teammates, who might look upon you as being self-righteous. If you stand your ground, eventually you could find people moving to the ground on which you're standing, perhaps because they were never really comfortable with the old "elbows-out" work environment but feared upsetting things. You could go further in breaking down sales-impeding barriers by actually reaching out to the groups that have been the targets of past derision. A DBA could actually ask to sit in on some early-stage design meetings for a new application (or a developer could ask to get involved with DBAs on a performance tuning or database redesign project). Again, the early going could be tough, as you might catch no small number of verbal spears in your backside upon first venturing into "enemy territory." Hang in there, though, and you could find that a) you're actually able to add value to what the other team is doing, and b) people on that other team tell you how much they've wished that someone would do exactly what you're doing: reach across boundaries and work for the common good.

Wherever you are on an org chart, you could end up being a change agent who successfully promotes a "we're all in sales" mind-set on the part of your colleagues. Sometimes this kind of positive organizational transformation is led from the top, but other times it gets started on the shop floor. Take a look around. Do you like the way your department functions with respect to teamwork that aims always to drive success for the overall organization? If not, take another look - this time in the mirror. Who knows? You might be the person you've been waiting for to get things going in a more productive direction.

Thursday, May 1, 2008

DB2 for z/OS Accounting: And a One, And a Two...

When people analyze DB2 for z/OS monitor accounting reports or online DB2 monitor displays of thread detail data, they often pay particular attention (with good reason) to the elapsed and CPU time figures contained therein. Some of these same folks are a little confused as to the meaning of the numbers they are examining. They see columns of figures under headings that read (depending on the monitor product) "Class 1" and "Class 2" or "In-Application" and "In-DB2," and wonder what to make of them. In this post I'll provide an explanation of DB2 monitor elapsed and CPU time fields. I hope that you'll find this information to be useful.

First, bear in mind the source of the numbers provided in a DB2 monitor accounting report or an online display of thread detail data: records generated via DB2 for z/OS accounting trace classes 1 and 2 - thus the frequent use in monitor products of "Class 1" and "Class 2" as headings for the columns of in-application and in-DB2 (respectively) elapsed and CPU times. [Note that many mainframe DB2-using organizations have DB2 accounting trace classes 1 and 2 (and, typically, 3, as well) running at all times. The associated CPU overhead is generally less than 5%.]

Second, think of the Class 1 and Class 2 times in a DB2 thread context. Generally speaking, an application process gets a thread when it issues its first SQL statement (for an off-mainframe requester coming through the DB2 Distributed Data Facility - aka DDF - that would be a CONNECT statement). When the application process gets the thread (whether it's a local thread for a program in an allied address space or a database access thread - aka DBAT - for a remote DRDA requester), the class 1 and class 2 timers start running, conceptually speaking (the class 2 timer actually gets going a tiny fraction of a second after the class 1 timer, but this really is a very small differential). The class 1 elapsed time is basically from get-thread (could be a new thread, could be a reused thread, depending on the circumstances) until end-of-transaction (or end-of-batch-job).
[I have to point out here, for the sake of completeness, that some persistent threads (such as CICS-DB2 protected entry threads) can stick around after the transaction using the thread completes. The effect of persistent threads on class 1 elapsed time is a little complex, but don't get hung up on that. DBATs can be persistent threads, but if you're using inactive DDF threads (this is a ZPARM thing, referring to the set of DB2 installation parameter value specifications) then the class 1 timer pretty much stops at end-of-transaction.]

OK, so class 1 elapsed is from get-thread to end-of-transaction. Class 2 elapsed is a subset of class 1 elapsed, and is basically that portion of class 1 elapsed time spent in the execution of SQL statements (referred to as "in-DB2" because when your program issues an SQL statement the associated z/OS task switches addressability and executes code in the DB2 Database Services Address Space). If your program calls a stored procedure, part of the class 1 and class 2 times will be associated with the execution of that stored procedure program because the stored procedure - though executing under its own z/OS task - uses the thread of the calling program when issuing SQL statements. The class 1 elapsed time attributed to stored procedure execution is, in essence, the elapsed time of the stored procedure program. The class 2 elapsed time attributed to the stored procedure is that portion of the stored procedure's class 1 time spent in the execution of SQL statements issued by the stored procedure. Similarly, class 1 CPU time is total CPU time for the application process from get-thread to end-of-transaction, and class 2 CPU time is the CPU time consumed in the execution of SQL statements. The assignment of class 1 and class 2 CPU time to a stored procedure called by the application process is like the just-described assignment of class 1 and class 2 elapsed time to a stored procedure.
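Since class 2 time is a subset of class 1 time, a little subtraction tells you how much of a transaction's time was spent outside of DB2. Here's a minimal Python sketch of that arithmetic. The class name, field names, and numbers are all hypothetical (they are not the labels your monitor uses), and the times are in milliseconds to keep the arithmetic exact.

```python
from dataclasses import dataclass


@dataclass
class AccountingTimes:
    """Hypothetical average times per transaction, in milliseconds."""
    class1_elapsed_ms: int  # get-thread to end-of-transaction
    class2_elapsed_ms: int  # portion of class 1 spent executing SQL (in-DB2)
    class1_cpu_ms: int      # total CPU from get-thread to end-of-transaction
    class2_cpu_ms: int      # CPU consumed executing SQL statements

    def elapsed_outside_db2_ms(self) -> int:
        return self.class1_elapsed_ms - self.class2_elapsed_ms

    def cpu_outside_db2_ms(self) -> int:
        return self.class1_cpu_ms - self.class2_cpu_ms

    def in_db2_share_of_elapsed(self) -> float:
        return self.class2_elapsed_ms / self.class1_elapsed_ms


# Made-up figures for one transaction type.
sample = AccountingTimes(
    class1_elapsed_ms=250,
    class2_elapsed_ms=100,
    class1_cpu_ms=30,
    class2_cpu_ms=20,
)

print(sample.elapsed_outside_db2_ms())   # elapsed time spent outside DB2
print(sample.cpu_outside_db2_ms())       # CPU time spent outside DB2
print(sample.in_db2_share_of_elapsed())  # fraction of elapsed time in DB2
```

With these made-up numbers, most of the elapsed time is outside of DB2, which would point you toward the application side rather than toward SQL tuning.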

As time - elapsed or CPU, in-application or in-DB2 - is broken out for any stored procedures called by your program, so is it broken out for any database triggers fired by your program (a trigger might update a column in a row of table X if your program inserts a row into table Y) and for any user-defined functions (UDFs) invoked by your program (DB2 for z/OS has a rich array of built-in functions such as CHARACTER_LENGTH, DAYOFWEEK, FLOOR, and ROUND, but you might need to define a function of your own to meet a special need). The portions of a program's elapsed and CPU times that are NOT associated with called stored procedures or fired triggers or invoked UDFs are referred to as "nonnested."

Class 3 time, by the way, is essentially in-DB2 time that is not CPU time. In other words, the program is in-DB2 because it has issued an SQL statement, but it is not using CPU time because it's waiting on something (maybe waiting to get a lock on a page or row, or waiting for a page to be read into a DB2 buffer pool from the disk subsystem). DB2 (thanks to accounting trace class 3) has about 20 "buckets" into which it can place in-DB2 time that is not CPU time. Sometimes, there is in-DB2 time that DB2 cannot explain (i.e., it's not CPU time and it doesn't fall into any of the class 3 wait-time "buckets"). I wrote a post on this topic (what's called "not-accounted-for time") in January of this year.
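The relationship just described boils down to simple arithmetic: take in-DB2 elapsed time, subtract in-DB2 CPU time, subtract the class 3 wait times, and what's left is the not-accounted-for time. Here's a minimal Python sketch. The function name, the bucket names, and the numbers are all hypothetical, and only three of the roughly 20 wait-time buckets are shown.

```python
from typing import Dict


def not_accounted_for_ms(
    class2_elapsed_ms: int,
    class2_cpu_ms: int,
    class3_waits_ms: Dict[str, int],
) -> int:
    """In-DB2 time that is neither CPU time nor in a class 3 wait bucket."""
    total_wait_ms = sum(class3_waits_ms.values())
    return class2_elapsed_ms - class2_cpu_ms - total_wait_ms


# Made-up class 3 wait figures, in milliseconds (bucket names illustrative).
waits = {
    "lock_wait": 5,
    "sync_read_io": 40,
    "other_read_io": 10,
}

leftover = not_accounted_for_ms(
    class2_elapsed_ms=100,
    class2_cpu_ms=20,
    class3_waits_ms=waits,
)
print(leftover)
```

A small leftover is normal. A large one means that DB2 was doing a lot of waiting it couldn't categorize, and that's worth a closer look.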

Now, get out there and speak confidently of class 1 and class 2 (and even class 3) times. Seriously, the better your understanding of the information contained in a DB2 monitor accounting report or online display of thread detail data, the more useful that information will be to you.