Robert's Blog


Thursday, September 6, 2007

Making SOA Perform

SOA, or Service-Oriented Architecture, is a hot concept these days, and it's more than a concept - a number of organizations have SOA-based applications in production, and more have such applications in various stages of development. You may know what SOA is all about:
  • Agility - The code that comprises an SOA-based application is modularized, i.e., it is contained within blocks that can be reused to build other applications. Programming productivity is further enhanced through abstraction, which hides from the developer of a service-consuming program all the technical particulars behind a service-providing program (e.g., programming language, server platform and operating system, application server type, DBMS, database schema, etc.) - there's a brief sketch of this right after the list. SOA enables an organization's IT group to respond quickly to changing market conditions.
  • Quality - Diagrams of complex applications can look like hairballs, what with all the point-to-point connections between programs and between programs and database objects (I can't take credit for the hairball analogy - I first heard it used by Paul Benati, an IT executive at CheckFree Corporation). An SOA greatly reduces the number of connection points within an application, and makes extensive use of standards for describing application-provided services (WSDL, or Web Services Description Language, is a form of XML used for this purpose) and for invoking them (accomplished by way of SOAP, or Simple Object Access Protocol - another form of XML). The simplified structure characteristic of an SOA makes an application less brittle, i.e., less prone to breaking as a result of code modifications.
  • Operational flexibility - Use of abstraction and standard interfaces in an SOA means that IT systems people can use the platforms and operating systems they want for different parts of the application infrastructure. Programming languages and DBMSs can also be mixed and matched, as can application servers.
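To make the abstraction point a bit more concrete, here is a minimal Java sketch of a service consumer built with JAX-WS. The interface name, namespace, and WSDL URL are made-up placeholders; the point is that the consumer works only against the WSDL-described contract, while the SOAP plumbing and the provider's implementation details stay out of sight.

    import java.net.URL;
    import javax.jws.WebService;
    import javax.xml.namespace.QName;
    import javax.xml.ws.Service;

    // Hypothetical service-endpoint interface derived from the provider's WSDL.
    @WebService
    interface OrderStatus {
        String getStatus(String orderNumber);
    }

    public class OrderStatusClient {
        public static void main(String[] args) throws Exception {
            // Placeholder WSDL location and service name.
            URL wsdl = new URL("http://services.example.com/orders?wsdl");
            QName serviceName = new QName("http://example.com/orders", "OrderStatusService");
            Service service = Service.create(wsdl, serviceName);

            // The dynamically generated proxy handles all of the SOAP messaging;
            // the consumer never sees the provider's language, platform, or DBMS.
            OrderStatus port = service.getPort(OrderStatus.class);
            System.out.println(port.getStatus("12345"));
        }
    }
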
What's not to like about SOA? Well, performance might be a concern, given that abstraction is achieved through extra code, which means more in the way of machine instruction pathlength. The reduction in application component interfaces (out with the point-to-point hairball, in with more of a hub-and-spoke service access infrastructure) also tends to increase pathlength. Thus the question: can an SOA-based application perform well? My answer to this question is "Yes," and I believe that two keys to success are the use of asynchronous processing for data-changing operations, and caching for data retrieval operations.

First, asynchronous processing. In the context of this post, I'm talking about unhooking the client end of an application from back-end database changes. Suppose, for example, that a user submits an order through a retailer's Web-based purchasing application. This action will cause some changes to be made in the database on which the application is built. If the application is SOA-based, it may be that the elapsed time required to effect these data changes will be somewhat greater than would be the case for an old-style monolithic application. Does this mean that the user will have to wait longer after clicking on "Submit" before the hourglass on his screen goes away? Maybe not. What if the order entry application indicated end-of-transaction as soon as the end user's order information was placed on a message queue (managed, perhaps, by IBM's WebSphere MQ - formerly known as MQSeries and often referred to simply as MQ)? That could be pretty fast. The queue can be configured as a recoverable resource, so the order-input data can be retained even if the whole application system crashes. The user can go on his merry way while a back-end process gets the order information message off the queue and makes the appropriate database changes.
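
Here's a rough idea of what the front end of that flow might look like in Java, using the JMS API (which WebSphere MQ supports). The JNDI names and message format are assumptions for the sake of illustration; what matters is that the method returns as soon as the persistent message is on the queue, well before any database work happens.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.DeliveryMode;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.naming.InitialContext;

    public class OrderSubmitter {
        // JNDI names are placeholders for whatever the MQ/JMS administrator configures.
        public void submitOrder(String orderXml) throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/OrderQCF");
            Queue orderQueue = (Queue) ctx.lookup("jms/OrderQueue");

            Connection conn = cf.createConnection();
            try {
                Session session = conn.createSession(false, Session.AUTO_ACKNOWLEDGE);
                MessageProducer producer = session.createProducer(orderQueue);
                // PERSISTENT delivery is what makes the queue a recoverable resource:
                // the order survives a crash even though the database hasn't been touched yet.
                producer.setDeliveryMode(DeliveryMode.PERSISTENT);
                producer.send(session.createTextMessage(orderXml));
                // At this point the application can tell the user "Order received" -
                // the database update happens later, asynchronously.
            } finally {
                conn.close();
            }
        }
    }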

Is there a catch? Yes, but I think it's a little one. Because the update of the order entry database is asynchronous with respect to the end user's clicking on "Submit," if the user were to immediately click on a "View order history" link on the order acknowledgment page, it is conceivable that the resulting display would not include the just-submitted order - this because the database (from which "order history" information is retrieved) has not yet been updated with the order information sitting on the MQ queue. I believe that the risk of this happening is quite small, because it's likely that a few seconds will pass before the user a) thinks about viewing his order history with this retailer, b) locates - and moves his cursor to - the "View order history" link, and c) clicks on the link. A few seconds should be more than enough time for the back end of the order entry application to get the order information message off the MQ queue and to make the corresponding database changes.

Aside from the performance benefit (from the user's perspective) that can accrue through the use of asynchronous processing for database update operations, consider the accompanying boost that MQ can deliver in the area of application availability (again, from the user's perspective). If for some reason the back-end database system is not available for update processing (perhaps due to a server or OS failure, or a program logic error - or because of a database maintenance operation), the front-end queue can still accept messages and the application can still indicate "Order received" to a user who clicks the "Submit" button. Messages stack up on the queue, and when the database is again available the message backlog is processed.
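
The back-end side of the arrangement can be as simple as a loop that drains the queue and applies the changes to the database of record. Below is a bare-bones JMS sketch (again, the names are placeholders); a production implementation would more likely use a message-driven bean or an XA-coordinated transaction, or make the database update idempotent, to guard against redelivered messages.

    import javax.jms.Connection;
    import javax.jms.ConnectionFactory;
    import javax.jms.Message;
    import javax.jms.MessageConsumer;
    import javax.jms.Queue;
    import javax.jms.Session;
    import javax.jms.TextMessage;
    import javax.naming.InitialContext;

    public class OrderQueueDrainer {
        public void drainQueue() throws Exception {
            InitialContext ctx = new InitialContext();
            ConnectionFactory cf = (ConnectionFactory) ctx.lookup("jms/OrderQCF");
            Queue orderQueue = (Queue) ctx.lookup("jms/OrderQueue");

            Connection conn = cf.createConnection();
            try {
                // Transacted session: a message leaves the queue only when commit()
                // is called, i.e., only after the database work has been done. If the
                // database is down, messages simply stack up and wait.
                Session session = conn.createSession(true, Session.SESSION_TRANSACTED);
                MessageConsumer consumer = session.createConsumer(orderQueue);
                conn.start();

                Message msg;
                while ((msg = consumer.receive(5000)) != null) {
                    String orderXml = ((TextMessage) msg).getText();
                    applyOrderToDatabase(orderXml); // hypothetical database-update helper
                    session.commit();               // now the message is off the queue
                }
            } finally {
                conn.close();
            }
        }

        private void applyOrderToDatabase(String orderXml) {
            // Placeholder for the insert/update logic against the order entry
            // database of record.
        }
    }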

OK, on to caching for data-retrieval operations. Again, the idea is to use technology and application architecture to have your SOA cake (agility/quality/flexibility) and eat it, too (user-satisfying performance). The concept of data caching is relatively simple: you store a copy of oft-retrieved data in cache servers closer to the edge of an application's infrastructure, so that data retrieval requests do not have to flow all the way to the back-end database (the database of record) to be served. The not-so-simple part of data caching is propagating back-end database updates to the cache server(s) in a timely and accurate manner. This can be done, and different organizations take different approaches with respect to data cache implementation. Some go the "roll your own" route, using in-house-developed code to keep mid-tier data stores in near-sync status relative to the back-end database of record; other organizations utilize data caching solutions available from various vendors (examples include TigerLogic FastSOA from Raining Data and DataXtend CE from Progress Software). Whether implementation has been a matter of build or buy, organizations have seen dramatic improvements in data retrieval performance through the use of data caching. This speed benefit is particularly advantageous in an SOA environment, as it can more than offset the drag on transaction response times caused by the distinct functional layering, abstraction, and standards-based interfaces to application subsystems and to associated data that are hallmarks of an SOA.
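
For the read side, the basic read-through pattern can be sketched in a few lines of Java. This is deliberately simplistic - the class names are invented, and it glosses over exactly the hard part mentioned above (expiration, size limits, and timely propagation of back-end updates), which is where the home-grown and vendor solutions earn their keep.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Bare-bones read-through cache sketch; CatalogDatabase is a hypothetical
    // stand-in for access to the back-end database of record.
    public class ProductCache {
        private final Map<String, String> cache = new ConcurrentHashMap<String, String>();
        private final CatalogDatabase db;

        public ProductCache(CatalogDatabase db) {
            this.db = db;
        }

        public String getProductDescription(String productId) {
            String description = cache.get(productId);
            if (description == null) {
                // Cache miss: go all the way back to the database of record...
                description = db.fetchDescription(productId);
                cache.put(productId, description); // ...and remember the answer
            }
            return description; // cache hit: no trip to the back end
        }

        // Called when the corresponding back-end data changes, so that stale
        // values are not served from the cache.
        public void invalidate(String productId) {
            cache.remove(productId);
        }
    }

    interface CatalogDatabase {
        String fetchDescription(String productId);
    }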

So, embrace SOA for all the right reasons, and take advantage of message queuing and data caching technologies to give users what they want: speed.

Happy architecting!
