Robert's Blog


Wednesday, October 28, 2009

IBM IOD 2009 - Day 3

Following is some good stuff that I picked up during the course of day 3 of IBM's 2009 Information on Demand conference:

IBM Information Management software executives had some interesting things to say - IBM got some of us bloggers together with some software execs for a Q&A session. A few highlights:
  • Interest in DB2 pureScale, the recently announced shared-data cluster for the DB2/AIX/Power platform, is strong. Demo sessions at the conference this week were full-up.
  • It used to be that organizations asked IBM about products. These days, companies are increasingly likely to ask about capabilities. IBM is responding by packaging software (and sometimes hardware) products into integrated offerings designed to fulfill these capability requirements.
  • New products at the upper end of IBM's information transformation software stack are driving requirements at the foundational level of the stack (where you'll find the database engines such as DB2), and even into IBM's hardware platforms (such as the Power Systems server line).
  • Regarding software-as-a-service (SaaS) and cloud computing, IBM sees a "broadening of capabilities" with respect to software delivery and pricing models.
  • The IBM folks in the room were pretty keyed up about the Company's new Smart Archive offerings, which can - among other things - drive cost savings by using discovery and analytics capabilities to determine which information (structured and unstructured data) an organization has to retain and archive.
  • Jeff Jonas, one of IBM's top scientists, talked about the huge increase in the amount of data streaming into many companies' systems (much of it signal data emitted by sensors of various kinds). People may assume that their organization cannot manage this in-surge of information, but Jeff noted that the more data you get into your system, the faster things can go ("It's like a jigsaw puzzle: the more pieces you put together, the more quickly you can correctly place other pieces").
  • Jeff also spoke of "enterprise amnesia": a firm has so much information to deal with that it loses track of some of it. Consequently, a large retailer will sometimes hire a person who had previously been fired for stealing from that same company.
Let's hear it for audience participation - I enjoyed delivering my presentation on DB2 for z/OS data warehouse performance. As usual, I got some great questions and comments from session attendees. After I mentioned that I'm usually comfortable with having more indexes on tables in a data warehouse versus an OLTP data-serving environment (I wrote of this in a blog entry posted last year), I was asked if that statement applied to data warehouses that are updated in near-real time relative to source data changes (something that more organizations are doing these days). My response: in a continuously updated data warehouse (versus a data warehouse updated via an overnight extract/transform/load process), I'd probably be more conservative when it comes to indexing tables.

After I'd covered DB2 query parallelism, a session attendee suggested that in a CPU-constrained mainframe DB2 data warehouse system, adding one or more zIIP engines and turning on query parallelism (something that probably wouldn't be activated in a system with little in the way of CPU head room) could provide a double benefit: more cycles to enable beneficial utilization of DB2's query parallelism capability, and a workload (parallelized queries) that could drive utilization of the cost-effective zIIPs. Spot on - couldn't have said it better myself (I wrote about query parallelism and zIIP engines in a comment that I added to a blog entry that I posted last year).
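
For anyone wanting to try the combination that attendee described, query parallelism in DB2 for z/OS is enabled at the package level via the DEGREE bind option, or for dynamic SQL via the CURRENT DEGREE special register. A minimal sketch (the collection and package names are made up):

    -- Static SQL: rebind the data warehouse package with parallelism enabled
    REBIND PACKAGE(DWHCOLL.DWHPGM1) DEGREE(ANY)

    -- Dynamic SQL: enable parallelism for the current connection
    SET CURRENT DEGREE = 'ANY';

The parallelized portions of query execution are what become eligible for zIIP redirection - that's where the double benefit comes from.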

Bernie Spang is a man on a mission - IBM's Director of Strategy and Marketing for InfoSphere and Information Management software wants companies to have trusted information. Too often, people confuse "trusted" with "secure." "Secure" is important, but "trusted," in this context, refers to data that is reliable, complete, and correct - the kind of data on which you could confidently base important decisions. Bernie is out to make IBM's InfoSphere portfolio the go-to solution for organizations wanting to get to a trusted-information environment. There's a lot there: data architecting, discovery, master data management, and data governance are just a few of the capabilities that can be delivered by way of various InfoSphere offerings. It's all about getting a handle on the state of your data assets, rationalizing inconsistencies and discrepancies, and providing an interface that leads to agreed-upon "true" values (and this has plenty to do with integrating formerly siloed data stores). If you want to get your data house in order, there's a way to get that done.

Chris Eaton wants mainframe DB2 people to be at ease with DB2 for LUW lingo - Chris, one of the technical leaders in the DB2 for Linux, UNIX, and Windows development organization at IBM's Toronto Lab, knows that there are some DB2 for LUW concepts and terminologies that are a little confusing to mainframe DB2 folks, and he wants to clear things up. SQL data manipulation language statements are virtually identical across DB2 platforms, but there are some differences in the DBA and systems programming views of things on the mainframe and LUW platforms (largely a reflection of significantly different operating system and file system architectures and interfaces). In a session on DB2 for LUW for mainframe DB2 people, Chris explained plenty. Some examples:
  • A copy of DB2 running on an LUW server is an instance. A copy of DB2 running on a mainframe server is a subsystem.
  • A DB2 for z/OS subsystem has its own catalog. A DB2 for LUW database, several of which can be associated with a DB2 instance, has its own catalog (and its own transaction log - something else that's identified with a subsystem in a mainframe DB2 environment).
  • So-called installation parameter values are associated with a DB2 subsystem on a mainframe (most of these values are specified in a module known as ZPARM). The bulk of DB2 for LUW installation parameter values are specified at the database level.
  • A DB2 for z/OS thread is analogous to a DB2 for LUW agent, and a DB2 for LUW thread is analogous to a mainframe DB2 TCB or SRB (i.e., a dispatchable piece of work in the system).
  • A mainframe DB2 data set would be called a file in a DB2 for LUW environment, and a mainframe address space would be referred to as memory on an LUW server.
  • The DB2 for LUW lock list is what mainframe people would call the IRLM component of DB2.
  • The DB2 for LUW command FORCE APPLICATION is analogous to the -CANCEL THREAD command in a DB2 for z/OS environment (see the command sketch just after this list).
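
To make that last analogy concrete, here is a minimal sketch of the two commands side by side (the application handle and thread token are made up - you'd get real values from LIST APPLICATIONS and -DISPLAY THREAD, respectively):

    -- DB2 for LUW: identify and force off an application connection
    LIST APPLICATIONS
    FORCE APPLICATION (1234)

    -- DB2 for z/OS: identify and cancel the analogous thread
    -DISPLAY THREAD(*)
    -CANCEL THREAD(4567)
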
Chris also passed on some hints and tips:
  • Self-tuning memory management (the ability for DB2 to automatically monitor and adjust amounts of memory used for things such as page buffering, package caching, and sorting) works very well on the LUW platform, and Chris recommends use of this feature.
  • Chris favors DMS (database-managed space) over SMS (system-managed space) storage in a DB2 for LUW system, and the use of automatic-storage databases over DMS for most objects in a DB2 database.
  • Chris is big on the use of administrative views as a means of easily obtaining DB2 for LUW performance and system information using SQL (a sample query follows this list).
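
As an illustration of that last point, below is the kind of query Chris was talking about. This is a sketch only - the SYSIBMADM view and column names are from memory, so verify them against the documentation for your DB2 for LUW level:

    -- Buffer pool hit ratios, obtained with plain SQL (no snapshot API calls)
    SELECT SUBSTR(BP_NAME, 1, 20) AS BUFFERPOOL,
           TOTAL_HIT_RATIO_PERCENT
    FROM SYSIBMADM.BP_HITRATIO
    ORDER BY TOTAL_HIT_RATIO_PERCENT
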
Tomorrow is the last day of the conference. More blogging to come.

Tuesday, October 27, 2009

IBM IOD 2009 - Day 2

Another day done at IBM's 2009 Information on Demand Conference - another day of learning more about DB2, and about technologies used at higher levels of the information transformation software stack. Some take-aways from today's sessions follow.

A good DB2 9 for z/OS migration story - Maria McCoy of the UK Land Registry delivered a very good presentation on her organization's DB2 9 for z/OS migration experience. The Land Registry has one of the world's largest operational (versus decision support) databases, holding almost 40 TB of data. On top of that, the agency recently launched its first public e-business application, a consequence being that downtime is even less well tolerated than before.

The Land Registry runs DB2 in data sharing mode on a parallel sysplex mainframe cluster. The number of DB2 subsystems across all of the Land Registry's environments (test, development, and production) is about 30.

The DB2 9 migration effort went off well, largely because the Land Registry stays pretty current on system maintenance, with quarterly upgrades of the DB2 service level (Maria confirmed what others have said, indicating that DB2 9 is very stable at the F906 maintenance level and beyond).

For the Land Registry, the primary DB2 9 migration drivers included:
  • XML support
  • Spatial data support (spatial awareness had historically been achieved by way of user-written code)
  • Extensions to online schema changes
  • Further exploitation of 64-bit addressing
  • Improved utility CPU efficiency
  • Indexes on column expressions
  • Real-time statistics (especially the capability of identifying indexes that have gone a long time without being used for data access - see the query sketch after this list)
  • Larger index page sizes (offering potentially reduced GETPAGE activity due to a reduction in the number of index levels)
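
On the real-time statistics point: in DB2 9 the real-time statistics tables are part of the catalog, and the LASTUSED column of SYSIBM.SYSINDEXSPACESTATS is what flags indexes that haven't been used for data access in a long time. Here's a sketch of the kind of query involved (column names are from memory - verify them against your catalog):

    -- Indexes not used for data access in the past 180 days
    SELECT DBNAME, CREATOR, NAME, LASTUSED
    FROM SYSIBM.SYSINDEXSPACESTATS
    WHERE LASTUSED < CURRENT DATE - 180 DAYS
    ORDER BY LASTUSED
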
An important part of the Land Registry's DB2 9 migration planning effort involved identification of third-party tools used with DB2. The agency identified 42 such products, among these being monitors, middleware, compilers, utilities, file management systems, and legacy software.

A dedicated test system proved to be very valuable. The LoadRunner tool was used to drive online transaction test scripts.

Following the migration to DB2 9, the Land Registry converted all existing simple tablespaces to segmented tablespaces (a good idea, as simple tablespaces can no longer be created in a DB2 9 environment). Maria and her colleagues thought that there were no simple tablespaces in their DB2 databases, but it turned out that 41 such tablespaces did exist.

Among the DB2 9 new features put to good use by the Land Registry are the following:
  • Indexes on column expressions (this delivered a HUGE decrease in CPU time for a batch job containing a query with a predicate that applies a substring function to a column - see the SQL sketch after this list)
  • Clone tables (a table-data-change outage that formerly ran to 5 hours due to time needed to load new data and to inspect the newly loaded data for correctness went to 2 seconds)
  • Rename column
  • Rename index
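
For readers who haven't yet played with the first two items on that list, below is a sketch of what they look like in SQL form (table, index, and column names are made up; note, too, that as I recall a clone table requires the base table to reside in a universal tablespace):

    -- Index on a column expression: lets DB2 use an index for a predicate
    -- such as WHERE SUBSTR(CUST_NAME, 1, 3) = 'ABC'
    CREATE INDEX IX_CUST_NAME3
      ON CUSTOMER (SUBSTR(CUST_NAME, 1, 3))

    -- Clone table: load and verify new data in the clone, then swap it
    -- with the base table - the swap is what takes seconds
    ALTER TABLE CUSTOMER ADD CLONE CUSTOMER_NEW
    -- ... LOAD the clone and inspect the newly loaded data ...
    EXCHANGE DATA BETWEEN TABLE CUSTOMER AND CUSTOMER_NEW
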
The DB2 9 migration project went from beginning to end in about 12 months. The Land Registry ran with DB2 9 in Conversion Mode for about 2 months in each of their DB2 environments prior to moving to Enable New Function Mode and then to New Function Mode.

The current Information Management software scene - Arvind Krishna, General Manager of IBM's Information Management software business, spoke during a keynote presentation of the challenges faced by organizations dealing with explosive information growth (an estimated 15 petabytes of new data are generated daily - more than 5 exabytes per year). He went on to talk about the benefits of "workload-optimized systems" being brought to market now by IBM - systems composed of fully integrated hardware and software offerings that are optimized for specific workloads. An example of a workload-optimized system is IBM's Smart Analytics System, which provides hardware and a comprehensive software stack (with data management, warehousing, and analytics software) in one package that can be quickly and effectively deployed.

Ross Mauri, General Manager of IBM's Power Systems business (formerly called System p), provided information on the current state of the Power line (currently utilizing generation 6 of IBM's RISC-based microprocessor family, with generation 7 now in beta test). Ross said that "Power is everywhere," not only in IBM's Power servers but also in supercomputers, cars, all three of the major electronic game consoles, and the Mars Rover ("we have 100% market share on Mars"). From around 17% market share a few years ago, Power Systems now has more than 40% of the market for RISC processor-based servers. Particular strengths of the server line include efficiency ("work per watt," as Ross put it), virtualization, management, and resiliency.

Arvind Krishna closed out the keynote session with remarks that spotlighted IBM's close partnership with SAP (the companies have joint development teams and tens of thousands of mutual customers).

DB2 9 for z/OS native SQL procedures are looking very good - Philip Czachorowski of Fidelity Investments presented information related to his company's early experiences with the native SQL procedures feature of DB2 9 for z/OS (I've blogged a number of times on this technology, beginning with an entry posted late last year). Philip talked about DDL extensions that help with the migration of native SQL procedures from development to test to production environments (statements such as ALTER PROCEDURE ADD VERSION and ALTER PROCEDURE ACTIVATE VERSION), and the new SET CURRENT ROUTINE VERSION statement that can facilitate the testing of a new native SQL procedure (Philip also stressed the importance of having a good naming convention for SQL procedure version identifiers, so you'll know what you're executing when running tests).
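
To give a flavor of the versioning support Philip described, here's a rough sketch (the procedure name, parameters, and version identifiers are made up, and the exact option syntax should be checked against the DB2 9 SQL Reference):

    -- Add a new version of an existing native SQL procedure
    ALTER PROCEDURE PRODSCHM.UPDATE_BALANCE
      ADD VERSION V2 (IN P_ACCT CHAR(10), IN P_AMT DECIMAL(9,2))
      BEGIN
        UPDATE ACCOUNT SET BALANCE = BALANCE + P_AMT
          WHERE ACCT_ID = P_ACCT;
      END

    -- Test the new version from a test client without affecting other callers
    SET CURRENT ROUTINE VERSION = 'V2';
    CALL PRODSCHM.UPDATE_BALANCE('0000012345', 100.00);

    -- When satisfied, make V2 the active version for everyone
    ALTER PROCEDURE PRODSCHM.UPDATE_BALANCE ACTIVATE VERSION V2;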

Performance data presented during the session was most interesting. Philip showed monitor data for one case in which total class 1 CPU time (from a DB2 monitor accounting report) for a native SQL procedure was only 4% greater than that of a comparable stored procedure written in COBOL.

Near the end of his presentation, Philip mentioned that the DB2_LINE_NUMBER clause of the GET DIAGNOSTICS statement could be very helpful in terms of resolving native SQL procedure code problems.
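
Here's roughly what that looks like inside a native SQL procedure's error handler - a sketch, with a variable name of my own invention, and my recollection is that DB2_LINE_NUMBER is retrieved as a condition information item (verify against the SQL Reference):

    DECLARE V_LINE INTEGER DEFAULT 0;

    DECLARE EXIT HANDLER FOR SQLEXCEPTION
      BEGIN
        -- Which line of the SQL procedure raised the error?
        GET DIAGNOSTICS CONDITION 1 V_LINE = DB2_LINE_NUMBER;
        -- log V_LINE, resignal, and so on
      END;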

Stream analytics is way cool - Just before dinner, those of us participating in the IOD Blogger Program had an opportunity to spend an hour with IBMers who are working on the System S "stream analytics" technology on which IBM's InfoSphere Streams offering is based. This is cool stuff: stream analytics software, running under Linux on commodity hardware, can be used to analyze vast amounts of incoming data - often signal data produced by various sensors - to identify events or episodes as they occur, thereby enabling a very rapid response capability. The data could be structured or unstructured, and might consist of hydrophone-captured sounds (picking up, perhaps, the clicking of dolphins), radio astronomy signals, manufacturing data, vehicular traffic activity, weather data, telephone communications, or human-health indicators. Picking up on that last category, a specialist in neonatology who has worked with the IBM System S team spoke of her work involving the monitoring of premature infants' vital signs. An electrocardiogram can generate 500 data signals per second; other vital-sign streams (e.g., blood flow data) can be analyzed as well; and all of this is multiplied across the several infants in one area being monitored concurrently (important, as an infection in one child could quickly spread to others). System S stream analytics technology is demonstrating the potential to save lives by taking anomaly detection time from 24 hours (using traditional monitoring methods) to seconds.

The IBM researchers then demonstrated the use of System S stream analytics software to analyze automobile traffic patterns in Stockholm, Sweden (500,000 pieces of GPS data per second).

The scalability of the System S technology is remarkable, the programming interface is surprisingly straightforward (people familiar with object-oriented programming languages tend to become proficient in a couple of weeks), and the GUI is pretty intuitive. Who knows how broadly applicable it might end up being (early adopters are largely in the government and health-care industries, but oil companies are also showing interest)? Watch this space, folks.

That's it for now. Tomorrow morning I'll deliver a presentation on DB2 for z/OS data warehouse performance, and tomorrow evening I'll try to post another blog entry.

Monday, October 26, 2009

IBM IOD 2009 - Day 1

Greetings from Las Vegas. Day one of IBM's 2009 Information on Demand conference was a good one. In this post I'll share with you some of the more interesting items of information I picked up in today's sessions. I'll post at the end of days 2, 3, and 4, as well.

The Big Theme: "Information-Led Transformation" - Ambuj Goyal, General Manager for Business Analytics and Process Optimization in IBM's Software Group, kicked off the Grand Opening session with an overview of the Company's Information Management software strategy. He pointed out that IBM has spent $12 billion on its information-on-demand software stack over the past 4 years: $8 billion on acquisitions (such as Cognos and SPSS) and $4 billion on internal development and related activity. That's some serious money, and it reflects the confidence of IBM's executives that we are on the front end of a major change in the way that organizations manage and leverage their data assets. Ambuj professed that information-led transformation would be even bigger in scope and impact than the enterprise resource planning software wave that got started about 20 years ago.

Companies, said Ambuj, are transitioning from information-focused projects to the information-based enterprise - an operational model characterized by the use of rationalized and trusted data to make timely, effective, and predictive (versus reactive) decisions. Frank Kern, a Senior Vice President in IBM's Global Business Services division, joined Ambuj onstage and continued to underscore the importance of organizations developing a predictive decision-making capability. He described a new service line, Business Analytics and Optimization, that will be delivered by a 4000-strong team of consultants. He also talked about the irony of executives reporting a lack of information needed to make good decisions, even as their organizations are awash in data as never before.

During a panel discussion, several IT executives from IBM customer companies shared their experiences related to the use of advanced analytics software:
  • Shirley Lady of Blue Cross and Blue Shield, a health insurance company, said that "what if" analysis is more important to her organization than ever before, given the major market changes that could result from health care reform in the United States.
  • Nihad Aytaman, of clothing retailer Elie Tahari, talked about the importance of quick (as well as effective) decision making to his company's efforts to successfully "chase the business" in the very fluid world of fashion.
  • Debbie Oshman of Chevron, a global energy company, mentioned that her organization was pursuing information-driven enterprise transformation, after having used information well in a project-by-project way. Process optimization and risk mitigation were described as being two key analytics-driven initiatives underway at Chevron.
Following the panel discussion, Arvind Krishna, IBM's General Manager for Information Management software, talked about new developments in his part of the business: DB2 pureScale (about which I recently blogged), smart archiving, smart analytics, a master information hub, InfoSphere Streams, Cognos content analytics, and two recent acquisitions: SPSS and ILOG.

An interesting press conference - I joined journalists, analysts, and fellow bloggers for a press conference featuring several senior IBM executives. Announced during the conference were new analytics applications, enhanced stream computing technology, and new Master Information Hub software (you can view the press release on IBM's Web site).

Steve Mills, Senior Vice President and IBM Software Group Executive, talked about a new information-related transformation in light of transformations past:
  • The PC transformation of the 1980s that enabled personal delivery of information.
  • The World Wide Web transformation of the 1990s that made incredible levels of connectivity a reality.
  • The process-focused transformation of the past decade, which led to improvements in efficiency and effectiveness.
Going on now, said Mills, is an information-led (not just information-focused) transformation, through which organizations are seeking to understand not only processes, but the environment in which they operate. The urgency of this transformation is prompted by two questions: 1) can your organization move fast enough, and 2) can it move smart enough? Helping to make the transformation possible are historically low costs for units of compute power; human interface improvements, such as dashboards, that enable people to quickly absorb and act on information; and the ability to physically place information capture and analysis technology where it could not be placed before. Mills said that 35,000 IBM people are involved in building the Company's "portfolio of capability" regarding advanced analytics - a portfolio that includes software technology and the know-how to put that technology to work for organizations in all kinds of industries.

The press conference concluded with a question-and-answer session. In responding to questions asked by session attendees:
  • Arvind Krishna said that IBM's Information Management software business had grown at a 14% annual rate over the past three years - in a market that grew at a 6% rate.
  • It was mentioned that over 50 OEM vendors are delivering analytics capabilities via cloud computing systems using IBM's Cognos Express offering.
  • Steve Mills indicated that data governance is increasingly seen by organizations as being a mission-critical competence.
  • Ambuj Goyal said that even as IBM works to be a one-stop-shop provider of the information transformation software stack (software that manages, archives, cleanses, catalogs, integrates, and analyzes data), the Company designs its products to use open standards that make it easier for organizations to use a mix of IBM and third-party products in a stack.
  • It was explained that IBM is delivering software that can be used to analyze unstructured data on the Web (e.g., what customers are saying about your company's products), with an emphasis on combining that information with information generated using in-house data.
DB2 "X" is coming along just fine - The next release of DB2 for z/OS is mostly coded, with activity now focused mainly on testing. Jeff Josten, a Distinguished Engineer on the DB2 for z/OS development team at IBM's Silicon Valley Lab, provided a preview of this coming attraction. A few highlights (CAVEAT: this information is truly of a preview nature - it should not be considered as final until the product is generally available):
  • A further exploitation of 64-bit addressing should dramatically increase the number of threads that can be concurrently active in a DB2 subsystem.
  • DB2 X is expected to reduce the CPU consumption of a typical DB2 workload by 5-10% as compared to a DB2 Version 9 environment.
  • Native SQL stored procedures might get a performance boost of up to 10-20% versus a Version 9 environment.
  • LOBs (large objects) that can fit onto a page will be in-lined in a base table versus being physically stored in a separate LOB tablespace.
  • Dynamic statement caching will be more effective for SQL statements that contain literal values versus host variables.
  • There will be a conversion path available for changing simple, segmented, and "classic" partitioned tablespaces to universal tablespaces.
  • RUNSTATS will provide an "auto stats" option.
  • Temporal data support will enable DB2 X to be significantly more useful in the management of data that has "effective" dates (e.g., a change to an insurance policy will become effective on such-and-such a date) and/or which is updated or deleted at some time following initial insert into a table (DB2 will maintain a history of such data changes).
  • Building of a tablespace compression dictionary will not require a utility execution.
  • DB2 X will enable data-masking to be specified at the column level.
  • Private protocol will go away (DRDA is much better anyway), and so will the ability to bind a DBRM directly into a plan (these should be bound into packages anyway).
Rick Bowers has his priorities in order - IBM's Director of DB2 for z/OS development stated repeatedly during his "trends and directions" presentation that "it's all about the customer." If you're a DB2 user, Rick's in your corner. A few of his comments during the session:
  • Enhancing the capabilities of DB2 for z/OS in data warehouse environments is a big priority for Rick's team.
  • Migration of DB2 for z/OS-using organizations to DB2 9 is proceeding apace.
  • 100% of the top 100 DB2 for z/OS-using organizations are using DB2 Version 8 or beyond, as are 99+% of the top 200.
Got mashups? They're easier than ever now - IBM gave us blog-folk a preview of two new products that will be formally announced: Version 2 of the IBM Mashup Center, and Cognos 8 Mashup Service. The latter makes it very easy to use Cognos-generated report data in a mashup application, and the former very much simplifies creation, cataloging, discovery, and reuse of mashups (mashups provide a quick and convenient means of combining data from two or more sources, either external or internal to an organization - example sources could be a sales performance report and a CRM system). With these new products (and they don't have to be used together), if you have existing data sources (internal and/or external) you can combine data into useful new representations in very little time and at very little cost. The GUI of the Mashup Center is very intuitive (program development skill is not a prerequisite for productive use of the product), and the product's flexibility is impressive: sources can include MQ queues and RSS feeds (among other things - including, of course, Cognos 8 reports via the Cognos 8 Mashup Service), and you can implement security controls that will govern the use of various mashups. Cool!

That's the wrap-up of my day-one experience at IOD. I'll post day two information tomorrow.

Tuesday, October 13, 2009

Wow - DB2 Data Sharing Comes to the AIX/Power Platform

When my youngest child - now 8 years old - was younger still, I would read her a story at bedtime. One of her favorites was "Lilly's Purple Plastic Purse," by Kevin Henkes. Lilly was enthralled by her teacher, Mr. Slinger, and expressed her admiration for him pithily: "'Wow,' said Lilly. That was just about all she could say. 'Wow.'"

That pretty much sums up my reaction to IBM's recent announcement of DB2 pureScale, which essentially brings mainframe DB2 data sharing technology to IBM's Power Systems platform running the AIX operating system: "Wow."

I got involved with DB2 for z/OS data sharing in the mid-1990s, while DB2 Version 4 (in which the feature was delivered) was still in the beta-test phase (I was in IBM's DB2 National Technical Support group at the time). I remember being pretty excited about shared-data architecture (in which multiple DB2 systems share concurrent read/write access to a database stored on shared disk volumes) and the potential for the solution to meet formerly unattainable objectives in terms of workload growth and database uptime. Sure enough, potential became reality, and DB2 for z/OS data sharing on the IBM Parallel Sysplex mainframe cluster became (and still is) the gold standard for enterprise data-serving scalability and availability. It was a huge jump forward, capability-wise, for DB2 on the mainframe platform.

Now here we are in 2009, and DB2 for AIX has taken that big leap forward. pureScale doesn't just meet the shared-data competition in the UNIX marketplace - it changes the game. It will deliver levels of scalability and availability that simply were not possible before. How? Simple: it utilizes the same centralized shared-memory approach to global lock management and data coherency that has worked wonders for organizations that run DB2 for z/OS in data sharing mode. Here's the deal: if you're going to give multiple data servers read/write access to one database, you have a couple of choices when it comes to keeping the different data servers from trashing the consistency of said database. You can have each node directly communicate with all the other nodes regarding data rows that it's changing and data that it has cached locally, or you can go with a centralized approach, in which a data server node posts global lock and global buffer pool information to structures residing in devices that provide a shared-memory resource to the group (and here the term "global" refers to lock and buffer pool information that a node has to make known to other nodes so as to preserve data integrity). The problem with the former solution (one node directly communicates global lock and page cache information to others) is that it doesn't scale well - go beyond 4 nodes or so, and the increase in overhead largely negates the processing capacity of an added node.

DB2 for z/OS data sharing, as people familiar with the technology know, was implemented with the centralized approach. The structures are known as the lock structure, the group buffer pools, and the shared communications area (the latter used to keep member nodes apprised of database objects in an exception state), and the shared-memory devices are called coupling facilities (originally external devices that are increasingly implemented as logical partitions within mainframe servers). The lock structure functions, in part, as a "bulletin board," to which nodes can post global lock information - and at which nodes can access global lock information - in microseconds (the lock structure also stores information about currently held data-changing locks, which helps to speed recovery in the event of a node failure). Members of a DB2 for z/OS data sharing group use the group buffer pools in the coupling facilities to "register interest" in database pages cached locally, so that they can be informed when a locally cached page has been changed by another member (changed pages are written to group buffer pools as part of commit processing, and they can be accessed there by other members in MUCH less time than a retrieval from disk - even from disk controller cache - would require).

This centralized global lock and global page cache mechanism has scaled up very effectively, and I mean in the real world, not just in a demo setting: at the company for which I worked when I was on the user side of the DB2 community, we had a 9-way mainframe DB2 data sharing group that handled a huge workload with very little overhead. I know of a 15-member DB2 for z/OS data sharing group at a large bank, and there could be systems out there with more nodes than that. Centralized management of global lock and page cache information has also paid dividends in the area of availability: the impact of a node failure is minimized, and restart of a failed DB2 member is accelerated. At my former workplace, a member of a production DB2 for z/OS data sharing group terminated abnormally in the middle of the day. It was automatically - and quickly - restarted, and the failure event did not impact our clients (the application workload continued to run on the surviving nodes while the failed member was restarted).

With pureScale, DB2 for AIX users can realize these same shared-data scalability and availability advantages. DB2 pureScale basically provides the functionality that coupling facilities do in a mainframe DB2 data sharing group, housing the global lock, group buffer pool, and shared communications area structures in a super-high-performance shared memory resource. Member systems running AIX and DB2 9.8 (the DB2 release that enables participation in a multi-node shared-data system) connect to the pureScale servers, and the increase in overall processing power as nodes are added is almost linear. In other words, the overhead of concurrent read/write access to the shared database increases only slightly as nodes are added to the group (IBM has demonstrated pureScale configurations with scores of nodes). On the availability front, there's automatic and fast (seconds) restart of a DB2 member in the event of a failure, fast (seconds) release of locks on rows that were being changed by a member DB2 at the time of a failure, and automatic routing of incoming transactions to other members during restart of a failed DB2 member.

Allow me to restate the point for emphasis: the DB2 for z/OS data sharing/parallel sysplex architecture has proven itself for nearly 15 years in tremendously demanding conditions, in terms of throughput and availability requirements, at sites all over the world. Developers at IBM's labs in Toronto (DB2 for Linux/UNIX/Windows) and Austin (Power Systems) have worked for years to bring that architecture to the DB2/AIX/Power platform, leveraging the advanced technology originally brought to the market by their colleagues in San Jose (DB2 for z/OS) and Poughkeepsie (System z). DB2 pureScale is the result of those efforts. As DB2 for z/OS data sharing took the already-high scalability and availability standards of the mainframe DB2 platform and raised them still higher, so pureScale will do for DB2 on AIX/Power, the platform that already sets the standard for UNIX system reliability.

There are other great parallels between DB2 for z/OS data sharing and DB2 pureScale: both are application-transparent, both provide system-managed workload balancing, and both allow for very granular increases in system processing capacity.

There's plenty more to report about pureScale, and I'll try to provide additional information in future posts. For now, I'll highlight a few items that I hope will be of interest to you:
  • The DB2 for LUW data partitioning feature (DPF) isn't going anywhere. Particularly for data warehouse/business intelligence systems, the shared-nothing clustering architecture implemented via DPF is the best scale-out solution.
  • Transaction log record sequencing is there. As in a DB2 for z/OS data sharing group, each member in a DB2 pureScale configuration logs changes made by SQL statements executed on that member. If a table that has been changed by SQL statements executing on multiple members has to be recovered, a log record sequence numbering mechanism and read access for any given member to all other members' log files ensures that the roll-forward operation (following restoration of a backup) will apply re-do records in the correct order.
  • pureScale licensing is very flexible. If you need to temporarily add processing capacity to a DB2 pureScale configuration to handle a workload surge, you pay for the additional peak capacity only when you use it.
  • Talk to your tool vendors about pureScale. IBM has been working for some time with several vendors of DB2 for LUW tools, to help them get ready for pureScale.
As I mentioned, there's more information to come. Stay tuned. This is big.

Wednesday, October 7, 2009

Thoughts on DB2 for z/OS BACKUP SYSTEM and RESTORE SYSTEM

Recently I worked with an organization that is planning an implementation of SAP's ERP application, with the associated database to be managed by DB2 9 for z/OS. This impending SAP installation was a major impetus for getting DB2 9 in-house, thanks largely to the significant enhancements delivered in that release for the BACKUP SYSTEM and RESTORE SYSTEM utilities. In this entry, I'll provide a brief overview of BACKUP SYSTEM and RESTORE SYSTEM, describe new features of these utilities in a DB2 9 environment, and pass on some related information of the "news you can use" variety.

BACKUP SYSTEM and RESTORE SYSTEM are prime examples of user-driven advances with respect to DB2 functionality. Near the end of the 1990s, several large companies using SAP with DB2 for z/OS met with IBM and SAP to press for a solution to a recovery-preparedness challenge. For these organizations, the traditional means of DB2 data backup - the COPY utility - was not satisfactory, owing to the fact that an SAP-DB2 database could contain tens of thousands of objects. System-wide, disk volume-level backups could be efficiently created using the FlashCopy technology of IBM disk subsystems (other disk storage vendors such as EMC and HDS offer a similar capability), but this approach had two significant drawbacks: 1) it was an outside-of-DB2 process, and 2) recovery of a database (using a system-wide volume-level backup) to a consistent state depended on the existence of system-wide quiesce points established via the DB2 commands -SET LOG SUSPEND and -SET LOG RESUME (the former of these commands was quite disruptive in a high-volume OLTP application environment). The SAP- and DB2-using companies wanted a system-wide backup solution that would take advantage of FlashCopy (or equivalent) technology, be executable through DB2, and allow for recovery with consistency to a user-specified point in time.

IBM's response to this request was delivered with DB2 for z/OS Version 8, in the form of the aforementioned BACKUP SYSTEM and RESTORE SYSTEM utilities. One of the big advantages of this new solution was the elimination of the formerly required system-wide quiesce points for recovery of a database to a consistent state: with BACKUP SYSTEM and RESTORE SYSTEM, the database could be recovered to any user-specified prior point in time (prior to currency - more on that momentarily), as DB2 would use information in the recovery log to back out any data-changing units of work that were in-flight at the designated point in time (underscoring the importance of this being a DB2-managed backup and recovery process). DB2 9 delivered some important enhancements to the functionality of these utilities, as described below:
  • Object-level recovery from a system-level backup - With DB2 Version 8, it was only possible to perform a system-level recovery using a system-level backup. In a DB2 9 environment, the RECOVER utility can be used to recover an individual object (e.g., a tablespace) or a set of objects using a system-level backup made with the BACKUP SYSTEM utility.
  • Recover to currency using RESTORE SYSTEM - Before DB2 9, recovery using RESTORE SYSTEM had to be to a point in time prior to the end of the log. Now, a recovery to currency can be performed by specifying SYSPITR=FFFFFFFFFFFF on the control statement of the DSNJU003 utility (change log inventory) that is executed prior to running RESTORE SYSTEM (see the sketch of utility statements after this list).
  • Support for incremental FlashCopy - Initially, FlashCopy - as invoked through the BACKUP SYSTEM utility - will create a full copy of the source volumes on the designated target volumes. Logically speaking, this copy operation is almost instantaneous. The physical copy operation takes more time to complete. If data is written to a source volume while the physical copy on the target is being made, the storage subsystem will check to see if the to-be-changed source-volume track has been copied to the target. If it hasn't, the source track will be copied to the target volume before being changed by the pending write operation. After the initial full copy has been completed (in the physical sense), subsequent copies can be incremental, with only the tracks changed since the last backup being copied from source to target. Thus, the workload on the I/O subsystem is reduced.
  • Support for backup to tape - With DB2 Version 8, getting a disk copy generated via BACKUP SYSTEM to tape was a manual process. With DB2 9, BACKUP SYSTEM can be used to copy a source backup to tape as it's being written to the target volumes. Alternatively, a backup can be written to tape some time after the physical copy to the target volumes has completed.
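
Pulling the last three items together, here is a sketch of the utility control statements involved (the DUMPCLASS name is made up, and as always the exact syntax should be checked against the utilities documentation for your DB2 level):

    -- Take a system-level backup and establish the incremental FlashCopy
    -- relationship (subsequent backups copy changed tracks only)
    BACKUP SYSTEM FULL ESTABLISH FCINCREMENTAL

    -- Take a system-level backup and also dump it to tape
    BACKUP SYSTEM FULL DUMP DUMPCLASS(DB2DUMP)

    -- Recovery to currency: the change log inventory (DSNJU003) control
    -- statement, followed by execution of the RESTORE SYSTEM utility
    CRESTART CREATE,SYSPITR=FFFFFFFFFFFF
    RESTORE SYSTEM
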
Now, the BACKUP SYSTEM and RESTORE SYSTEM utilities are very nice pieces of DB2 functionality, but there are some things you should think about before using them. First of all, even though DB2 9 enables object-level recovery from a system-level backup, you SHOULD NOT expect BACKUP SYSTEM to eliminate the need to use the COPY utility - this for two reasons:
  1. You'll need to use COPY to establish a new "recovery base" for an object following a LOAD REPLACE (or a LOAD RESUME with LOG NO) or an offline REORG with LOG NO (an inline image copy is required if you run an online REORG job) - see the sketch after this list.
  2. Object-level recovery from a system-level backup is currently not possible for a data set that has been moved since the system-level backup was created (as would be the case for most REORG and LOAD REPLACE operations). This restriction will be lifted in the near future, but it's there today.
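
For the first of those reasons, the sketch below shows what I mean by establishing a new recovery base (the database, tablespace, and DD names are made up, and options not relevant to the point - such as the mapping table needed for SHRLEVEL CHANGE - are omitted):

    -- After a LOAD REPLACE LOG NO (or offline REORG LOG NO), take a full
    -- image copy so the object is recoverable going forward
    COPY TABLESPACE DWHDB01.TSORDERS
         COPYDDN(SYSCOPY) FULL YES SHRLEVEL REFERENCE

    -- For online REORG, an inline image copy does the same job
    REORG TABLESPACE DWHDB01.TSORDERS
          SHRLEVEL CHANGE COPYDDN(SYSCOPY)
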
Second, the use of BACKUP SYSTEM and RESTORE SYSTEM requires that the active log and BSDS data sets have an ICF catalog that is separate from the one used for the DB2 catalog and directory and user/application data sets. In other words, it's not sufficient that these two categories of DB2 data sets (active log/BSDS and catalog/directory/application) use different aliases that point to the same ICF catalog. The ICF catalogs themselves have to be different (and the two categories of DB2 data sets have to be in different SMS storage groups). If your BSDS and active log data sets are currently in the same ICF catalog as the DB2 catalog and directory and application objects (i.e., application tablespaces and indexes), you'll need to separate them before using BACKUP SYSTEM and RESTORE SYSTEM. Moving the BSDS and active log data sets to a separate ICF catalog basically involves doing the following:
  • Define the new ICF catalog and the new high-level qualifier for the BSDS/active logs.
  • Define the new BSDS and the active log data sets in it.
  • WHILE DB2 IS DOWN, copy the active log and BSDS data sets to the new data sets (the ones in the separate ICF catalog). The best way to do this is probably to use DFSMSdss to copy the data sets with the RENAMEU parm specified to change the high-level qualifier (a sketch of these steps follows the list).
  • Then use the DB2 change log inventory utility to fix up the BSDS with the correct log ranges for the current active log (the non-reusable one).
  • For the rest of the log data sets, you can use the change log inventory utility to delete the old ones (with the old high-level qualifier) from the BSDS and add the new ones (using the NEWLOG statement) without specifying log ranges. This saves a bunch of time and effort, and you needn't worry about DB2 being able to find log data sets with records in a certain range - as long as the other logs have been archived, DB2 can find the ranges in the archive log.
  • Note: you do NOT have to use the change log inventory utility with the NEWCAT statement to change the VCAT name in the BSDS, because that VCAT name is the one used for the catalog and directory, and the one you're changing is for the BSDS and active log data sets.
  • Then start DB2.
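
Here is a sketch of what some of those steps look like in control-statement form (all catalog, alias, and data set names are made up - treat this as an outline, not a finished job):

    -- IDCAMS: define the new ICF catalog and the new high-level qualifier
    DEFINE USERCATALOG (NAME(CATALOG.DB2LOGS) ICFCATALOG VOLUME(LOGV01))
    DEFINE ALIAS (NAME(DSNPLOG) RELATE(CATALOG.DB2LOGS))

    -- DFSMSdss (while DB2 is down): copy the BSDS and active log data sets,
    -- renaming them to the new high-level qualifier
    COPY DATASET(INCLUDE(DSNPROD.BSDS01, DSNPROD.BSDS02,            -
                         DSNPROD.LOGCOPY1.**, DSNPROD.LOGCOPY2.**)) -
         RENAMEU((DSNPROD.**,DSNPLOG.**))

    -- DSNJU003: remove the old entries from the BSDS and add the new ones
    DELETE DSNAME=DSNPROD.LOGCOPY1.DS01
    NEWLOG DSNAME=DSNPLOG.LOGCOPY1.DS01,COPY1
    -- (repeat per data set; specify STARTRBA/ENDRBA for the current
    -- active log, as noted above)
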
Finally, a note on the frequency of BACKUP SYSTEM execution. Organizations that are using the BACKUP SYSTEM utility today are generally running it twice daily for their production systems. IBM recommends keeping at least two system-level backups on disk, but not all organizations can afford to allocate the amount of disk resources required for this. You might end up keeping the most recent backup on disk, with previous backups going to tape.

I encourage you to check out BACKUP SYSTEM and RESTORE SYSTEM. If you have a whole lot of objects in your DB2 database, these utilities could make your backup and recovery processes a whole lot simpler.