In-Depth

Web services architectures: Easier said than done

Web services can provide an open and interoperable development framework, but the technology requires an architecture unlike anything built before.

Every few years a new technology bursts upon the scene with great fanfare, promising to cure world hunger and fill the hole in the ozone layer. Artificial intelligence, object-oriented programming, data warehousing and data mining, the Internet, Java and XML have each been touted as a savior. Initially, vendors introduce what they claim is the "next big thing," seeking to generate as much excitement as possible, and the publicity engines dutifully join in. Early adopters flock to the new technology with great expectations, only to become disillusioned as faith collides with reality. A backlash inevitably ensues, and soon every mention of the would-be panacea evokes denigration. A lucky few technologies eventually have their real potential and limitations understood and settle into a suitable niche; many others slide into oblivion.

With all that said, this is clearly the era of Web services hype. As a marketing phenomenon, the Web services hype is riding on a wave of XML hype, which in turn is helping to promote Microsoft's .NET hype. From inter-enterprise integration to software as a service, to whatever else suits your fancy, you can find someone promoting Web services as the ideal solution. As you might have guessed, I believe the truth lies somewhere between the hype and the backlash. While it is not a panacea, Web services technology does provide powerful solutions to a class of problems in which multiple cooperating agents work together to solve business problems.

Web services defined
If you ask several technologists how they define Web services, you are likely to get more answers than there are respondents. But you will find some common elements in their answers, such as a heavy reliance on XML and the use of standard TCP/IP-based protocols as the transport medium to enable distributed computing agents to collaborate with each other. Another key attribute that is often left out is that each Web service has a unique and standard address. Together, these features allow any consumer of a Web service to access it without having to worry about communication protocols, message formats and the like. The innovation in Web services is not in providing a framework for building distributed applications (indeed we have many worthy precedents for that), but in providing an open and highly interoperable framework. Like the Web and Internet e-mail, Web services enable providers and consumers of services to be completely independent of one another's implementation technology and platforms. The major weakness of earlier frameworks such as DCE, CORBA, DCOM and RMI was that each could operate only in its own environment.

Of course, this flexibility comes with a price. The important characteristic of the earlier frameworks that differentiates them from Web services is the specification of a proprietary transport protocol and a proprietary data transfer format. Web services ride on a pre-existing and ubiquitous transport protocol, and use the equally ubiquitous XML as the data transfer format. By using proprietary protocols and data transfer formats, a distributed application framework is able to customize and tune itself to achieve high performance and build in a higher level of services and richer semantics. Features like security and minimizing network communications can be designed into the protocol itself.

Web services, however, do not have control over these elements and can only assume a lowest common denominator of features from the allowable open protocols. This implies that the overall structure will be less well-tuned and that several essential services have to be provided as additional higher-level protocols.

At the risk of repeating a well-worn cliche, the good thing about standards is that there are so many to choose from. Web services technology is no exception. There are three main contenders for the core protocol of Web services: SOAP, the most familiar; its simpler progenitor, XML-RPC; and REST (a term that describes the architecture of the Web). REST uses HTTP as the transport protocol and provides a limited set of verbs (the standard set that comes with HTTP, such as GET, POST, PUT) that operate on an arbitrary set of nouns. XML-RPC also uses HTTP as its transport protocol and specifies a procedure call-value return standard using XML as the data transfer format. SOAP does not specify any particular transport protocol, although HTTP is the most common, with SMTP poised to become another popular option. It also allows for great flexibility in the types of messages that can be communicated, which are not restricted to a strictly procedure-call value return model. SOAP is also accompanied by an alphabet soup of auxiliary protocols (WSDL, UDDI, WS-Security, SAML, ebXML and WSCI, among others) that provide additional levels of functionality.
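
To make the "procedure call-value return" model concrete, here is a minimal sketch that uses Python's standard xmlrpc.client module to serialize a call and a reply into the XML-RPC wire format without touching the network. The method name hotel.findRooms and its arguments are hypothetical, chosen only for illustration.

```python
import xmlrpc.client

# Hypothetical method name and arguments, used only to illustrate the encoding.
request_xml = xmlrpc.client.dumps(("Boston", 2), methodname="hotel.findRooms")

# A server would decode the same XML back into parameters and a method name.
params, method = xmlrpc.client.loads(request_xml)

# A return value is wrapped in a methodResponse document in the same way.
response_xml = xmlrpc.client.dumps((["Room 101", "Room 204"],), methodresponse=True)
result, _ = xmlrpc.client.loads(response_xml)

print(request_xml)
```

Printing request_xml shows how much markup surrounds two small arguments, which is the verbosity cost discussed later in this article.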

A detailed description of these protocols and a comparison between them is a topic for another article. This article focuses mostly on SOAP, because it is the mind-share leader.

Top 13 architectural guidelines for designing Web services
The technologies that comprise Web services entail a paradigm shift. When architecting distributed apps based on them, a new set of design guidelines needs to be followed. Web services leverage existing transport protocols and data transfer standards. For the most part, these standards were not originally designed to support general distributed applications. This gives rise to some key characteristics of Web services-based apps:

* Loose coupling between partners;
* High overhead per interaction;
* Standardization of interaction protocol, but not content; and
* Fundamental services, such as security, directory and messaging, are outside the scope of SOAP, and must be specified separately through additional related standards.

The distributed and multivendor nature of Web services-based applications requires a foundation based on a well-thought-out architecture. However, the principles underlying architecting Web services-based applications are significantly different from typical n-tier applications. The following salient points present some practical guidelines for building applications based on this new paradigm.

1. Select the appropriate Web services protocol. While the focus of this article is SOAP, SOAP is not the ideal choice in all cases. For example, the W3C provides an XSLT transformation service. You simply construct a URL that addresses the transformation service site, include the URLs of the source XML document and the XSL style sheet in the URL, and the transformation service returns the transformed document (try it at www.w3.org/2000/06/webdata/xslt?xslfile=<URL of XSL file>&xmlfile=<URL of XML file>&transform=Submit).

This is an example of the REST protocol in action. For such a simple service, you do not need the complexity of SOAP and the associated infrastructure. For more complex types of interactions, however, you will need the support provided by XML-RPC or even SOAP. The important point is to match the Web services protocol to the needs of the application.
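
As a rough sketch of how little client code such a service needs, the following builds the request URL from the two document addresses. The endpoint path and the parameter names xslfile, xmlfile and transform come from the URL quoted above; the http:// scheme, the function name and the example.com addresses are assumptions, and the service may no longer be available, so the sketch only constructs the URL and does not call it.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Endpoint taken from the article; the http:// scheme is an assumption.
XSLT_SERVICE = "http://www.w3.org/2000/06/webdata/xslt"

def build_transform_url(xsl_url: str, xml_url: str) -> str:
    """Return a GET URL asking the service to apply the style sheet to the document."""
    query = urlencode({"xslfile": xsl_url, "xmlfile": xml_url, "transform": "Submit"})
    return f"{XSLT_SERVICE}?{query}"

# Hypothetical document locations.
url = build_transform_url("http://example.com/style.xsl", "http://example.com/data.xml")

# Decoding the query string recovers the two addresses unchanged.
decoded = parse_qs(urlsplit(url).query)
print(url)
```

The whole "protocol" is an HTTP GET on a noun (the transformation resource) with its inputs encoded in the address, which is why no SOAP envelope or toolkit is required.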

2. Keep the server stateless. Despite its name, SOAP is an RPC protocol and is not object oriented. There is no notion of the identity of the agent providing the service within the protocol itself. If you want the ability to send a sequence of messages to the same object on the server, with each message possibly altering the state of the server object, you have to define a higher-level protocol and the appropriate facility on each end to track object identity. This can increase the application's complexity as well as the potential for error. There is also a performance- and flexibility-related issue -- a stateless server is inherently easier to scale and make fault-tolerant by adding redundancy.

This is not to say that applications that involve multiple interactions between agents should not be developed using Web services, but that doing so requires careful analysis. Not all applications need to scale to millions of interactions per day, but some may need to implement more complex interaction scenarios. For such applications, maintaining the state of the conversation is required. There are a number of specifications in various stages of development that aim to address the different aspects of interaction scenarios spanning multiple interactions, such as ebXML, Web Services Choreography Interface (WSCI), Business Process Modeling Language (BPML), Web Services Flow Language (WSFL) and XLANG, but the jury is still out on which of these will become the standard, or if we are doomed to have to cope with multiple standards in this space.
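
One way to picture a stateless service is a handler whose reply depends only on the message it receives. The sketch below is an illustration under assumed names (QuoteRequest, handle_quote and the price table are all hypothetical), not a SOAP implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QuoteRequest:
    """Everything the server needs travels in the message itself."""
    customer_id: str
    items: tuple  # (sku, quantity) pairs
    currency: str

# Hypothetical price table standing in for a back-end lookup.
PRICES = {"A100": 25.0, "B200": 40.0}

def handle_quote(request: QuoteRequest) -> dict:
    """Stateless handler: no session lookup, so any replica can serve any request."""
    total = sum(PRICES[sku] * qty for sku, qty in request.items)
    return {"customer_id": request.customer_id, "total": total,
            "currency": request.currency}

req = QuoteRequest("c-42", (("A100", 2), ("B200", 1)), "USD")

# Two independent "server instances" give the same answer because neither holds state.
reply_from_replica_1 = handle_quote(req)
reply_from_replica_2 = handle_quote(req)
```

Because nothing is remembered between calls, adding a second server or restarting a failed one requires no coordination. A conversational design would instead have to store and look up per-client state on every request.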

3. Keep an eye on messaging overhead. In general, network communication is the largest bottleneck in a distributed application. This is even more true with SOAP because of HTTP's stateless nature and XML's verbosity. The preferred approach is to design the communication between the consumer and the producer of your services so that all related information is transmitted in one message, rather than in many fine-grained messages.

For example, suppose you are designing a Web services-based solution that allows a travel agency to collect and display room availability from a range of hotels. The client provides some search criteria and you show them a list of the matching properties. The user can opt to get more details on an individual property before making a selection. Once the client makes a selection, a reservation is made. But should the application be designed so that the first call returns all details about the matching properties, or should they be retrieved only when the user demands to see the additional details? The wrong answer here can cost you a lot in performance.

If a search returns a large number of properties, then returning the details on all of them could be a drain on performance, especially since the user is only likely to ask for details on a small subset of them. In this case, a two-message protocol is likely to be better. But suppose that for each property there are multiple levels of detail that you display to the user. At the first level, you display only the description of the available matching rooms. At the next level, you also display the facilities available at the hotel. A user might also want to see the recreational facilities and sightseeing venues in the vicinity. It is preferable to retrieve these details in one shot even if you display them in tiers and only on demand. In each case, a careful analysis of the messaging overhead relative to the data transferred should be performed.
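
A back-of-the-envelope model can make this analysis concrete. The sketch below assumes a fixed cost per message plus a transfer time proportional to payload size; the function name and every number in it (0.2 seconds of overhead, 100,000 bytes per second, message sizes, hotel counts) are hypothetical and should be replaced with measurements from your own environment.

```python
def transfer_time(messages: int, payload_bytes: int,
                  overhead_s: float = 0.2, bytes_per_s: float = 100_000.0) -> float:
    """Estimated seconds: fixed cost per message plus time to move the payload."""
    return messages * overhead_s + payload_bytes / bytes_per_s

# Hypothetical search: 200 matching hotels, 500-byte summaries, 20,000 bytes of
# detail per hotel, and a user who opens the details of 3 hotels.
hotels, summary, detail, viewed = 200, 500, 20_000, 3

# Design A: one message carrying every detail for every hotel.
eager = transfer_time(1, hotels * (summary + detail))

# Design B: one summary message, then one detail message per hotel the user opens.
lazy = transfer_time(1 + viewed, hotels * summary + viewed * detail)

# Hypothetical detail tiers for one hotel: three tiers of 2,000 bytes each.
tiers, tier_size = 3, 2_000
tiers_one_shot = transfer_time(1, tiers * tier_size)
tiers_separate = transfer_time(tiers, tiers * tier_size)

print(f"eager={eager:.2f}s lazy={lazy:.2f}s")
print(f"tiers in one message={tiers_one_shot:.2f}s, one per tier={tiers_separate:.2f}s")
```

With these made-up numbers the two-message design wins the search case by a wide margin (roughly 2.4 seconds against roughly 41), while for the small detail tiers a single message is cheaper because per-message overhead dominates the payload.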

4. Choose the appropriate implementation layer. The Web services provider can live in many different architectural layers of the application, so there are several factors to consider. Are you providing an interface to existing legacy back-end apps? Are you trying to leverage an existing presentation layer app? (In the J2EE terminology, you can design your Web services to expose the JSP/servlet layer, the session bean layer or the entity bean layer. In more generic terms, this corresponds to Web services in the presentation layer, the business logic layer or the persistence layer.)

In general, the deeper the architectural layer you expose through your Web service, the more you sacrifice scalability, the greater the security requirements are, and the deeper the dependency between your Web service and the consumers becomes. At the same time, you also provide consumers with greater control over how they use your application's services. This is appropriate if the consumers are trusted partners and the risks of the greater exposure are outweighed by the benefits.

5. Choose the right toolkit. Application server vendor-provided toolkits are easier to integrate with the corresponding application server infrastructure, but they also tend to lock you in to that particular vendor. JAX pack has promise, but with many missing pieces it is not yet a mature specification. Apache SOAP makes the development of SOAP-based services easier through a vendor-neutral API. The upcoming Apache Axis API is a re-architecture of Apache SOAP and a dramatic improvement. It promises to make developing Web services easier and to hide much of the complexity of the underlying protocols. Among proprietary toolkits, BEA WebLogic Workshop and Microsoft's VS .NET raise the ease of building Web services to a new level.

The different toolkits also provide various abstractions of Web services and even implement different subsets of the SOAP specification. It is therefore important to ensure that the toolkit you choose implements the features of the SOAP specification you need at the proper level of abstraction.

6. Do not generate the SOAP interface from implementations (VS .NET syndrome). VS .NET greatly simplifies the creation of Web services. One simply adds a Web method declaration to a method and VS .NET generates all of the code necessary to expose it as a Web service. Yet this methodology defeats the whole purpose of Web services, which is to hide the implementation of a service completely behind an XML-based interface. VS .NET generates the interface from the implementation. While it is tempting to do so, avoid creating Web services as an afterthought in VS .NET.

7. Design for interoperability. In theory, Web services are about interoperability. But in reality, there are several reasons why interoperability might be an issue. Typically, you will use a SOAP toolkit to create and consume Web services. Your chosen toolkit might only implement a subset of the specification, making it incompatible with consumers or providers created with another toolkit. Additionally, the SOAP specification leaves some details optional and up to the implementation. Different toolkits can legitimately choose to ignore these features. The different toolkit implementers might interpret the specification differently if the language is ambiguous. Finally, the toolkit might simply have bugs in its implementation. All of this can lead to interoperability nightmares.

Of course, life is simple if all producers and consumers can be created using the same version of the toolkit. You might not be so lucky. In any event, it is always prudent to ensure that your services work well in a heterogeneous environment. The best approach is to test, test and test your services against various toolkits to ensure they work correctly. You can also participate in the SOAPBuilders online forum (http://groups.yahoo.com/group/soapbuilders) to discuss these issues and get help from other developers.

8. Plan on implementing security manually. SOAP does not address security, except for allowing SSL (HTTPS) as a transmission protocol. There are a number of additional solutions being proposed and developed, but no universally accepted standards have emerged. SSL allows for digital certificate-based authentication and encrypted end-to-end communications. This requires a public key infrastructure (PKI) to be in place. Other means of authentication need to be implemented manually, at least in the short term.

9. Decide between asynchronous and synchronous services. Most RPC mechanisms are designed to handle synchronous interactions. However, it is often necessary to engage in asynchronous interactions. The communication might not need a response (for example, a notification event), the processing of the request might take minutes or even days (for example, process and ship an order), or the response might need to be sent to a third party (for example, a workflow app). Message-based services allow for a very loosely coupled and robust design when real-time response is not required.

Note that the SOAP protocol, as commonly used over HTTP, is by nature synchronous and every request will generate a response. In a message-based architecture, the response is merely an acknowledgement that the message has been received and a response will be forthcoming (or not, as the case may be). Note also that the eventual response does not necessarily need to be returned to the original requester; it can go to a third party instead. Either the request itself can carry the address of the response recipient(s), or the sender and receiver might have established the recipient ahead of time. The response can also be sent to multiple recipients (for example, an order fulfillment service and an audit trail service). This can be used to build fairly complex workflow type scenarios. But do not get carried away -- this can become a debugger's nightmare.
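
The acknowledge-now, answer-later pattern can be sketched in-process with a queue. Everything here is a hypothetical stand-in: submit plays the role of the synchronous call that returns only an acknowledgement, process_pending plays the role of the delayed work, and the mailboxes dictionary stands in for real recipient endpoints.

```python
from collections import deque

pending = deque()   # work accepted but not yet processed
mailboxes = {}      # hypothetical recipient address -> list of delivered messages

def submit(order: dict, reply_to: list) -> dict:
    """Accept a request and return only an acknowledgement."""
    pending.append((order, reply_to))
    return {"status": "accepted", "order_id": order["order_id"]}

def process_pending() -> None:
    """Later, possibly much later, do the work and notify every named recipient."""
    while pending:
        order, reply_to = pending.popleft()
        result = {"order_id": order["order_id"], "status": "shipped"}
        for address in reply_to:
            mailboxes.setdefault(address, []).append(result)

# The request carries the addresses of the response recipients.
ack = submit({"order_id": 7, "sku": "A100"}, reply_to=["fulfillment", "audit"])
process_pending()
```

The caller gets its acknowledgement immediately, and the real result reaches the fulfillment and audit recipients without the original requester being involved at all.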

10. Incorporate error handling. SOAP allows the server to return an indication of processing failure through Fault elements in the response. While not as transparent as proper out-of-band exception handling, it is still a useful mechanism and should be used to build robust applications. A sound approach is to translate SOAP Fault responses to Java exceptions on the client end, and to catch exceptions and return Fault responses on the server end.
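
The client half of this translation might look like the following sketch, shown in Python rather than Java to keep it short. It assumes a SOAP 1.1 envelope, whose Fault element carries unqualified faultcode and faultstring children; the SoapFault class, the body_or_raise helper and the sample fault text are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace

class SoapFault(Exception):
    """Client-side exception carrying the fault code and message."""
    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message

def body_or_raise(envelope_xml: str) -> ET.Element:
    """Return the Body element, or raise SoapFault if it contains a Fault."""
    body = ET.fromstring(envelope_xml).find(f"{{{SOAP_ENV}}}Body")
    if body is None:
        raise ValueError("not a SOAP 1.1 envelope: no Body element")
    fault = body.find(f"{{{SOAP_ENV}}}Fault")
    if fault is not None:
        raise SoapFault(fault.findtext("faultcode", ""),
                        fault.findtext("faultstring", ""))
    return body

# Hand-written sample response; the fault text is made up for illustration.
sample = f"""<env:Envelope xmlns:env="{SOAP_ENV}">
  <env:Body>
    <env:Fault>
      <faultcode>env:Client</faultcode>
      <faultstring>Unknown hotel id</faultstring>
    </env:Fault>
  </env:Body>
</env:Envelope>"""

try:
    body_or_raise(sample)
    caught = None
except SoapFault as exc:
    caught = exc
```

Application code can then use ordinary try/except handling and never inspect the XML for error markers itself. Most SOAP toolkits perform a similar mapping for you, so check what yours provides before writing it by hand.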

11. Compose services together to create more complex interactions. A single Web service can invoke other services to generate parts of the response, which it then consolidates into a single response for its client. While powerful, this technique has the potential to slow down the responses. If you are composing services provided by several different providers, debugging can be especially challenging.

There are several different topologies that can be built here. In a pipes and filters paradigm, each service does some processing and passes the request on to the next service for further work. As the services in the chain return, they combine the response from their callee with their own response until the original service consolidates all information and returns it to the caller.

In a dispatch-based architecture, the original service acts as a broker and, based on the request, dispatches to one of several real services whose response it returns. Finally, a coordinator-based service calls upon one or more services to produce parts of the response and combines the responses into a single response for the caller. These models can also be combined. To the client, it all appears as a single service.
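
The three topologies can be sketched with plain functions standing in for remote services. This is an in-process illustration only: the names pipeline, dispatcher and coordinator, the "kind" routing key and the three toy services are all hypothetical, and the pipeline is flattened into a loop rather than a chain of nested remote calls.

```python
from typing import Callable, Dict, List

Service = Callable[[dict], dict]

def pipeline(stages: List[Service]) -> Service:
    """Pipes and filters: each stage sees the request plus earlier stages' output."""
    def call(request: dict) -> dict:
        merged: dict = {}
        for stage in stages:
            merged.update(stage({**request, **merged}))
        return merged
    return call

def dispatcher(routes: Dict[str, Service]) -> Service:
    """Broker: pick exactly one real service based on the request and return its reply."""
    def call(request: dict) -> dict:
        return routes[request["kind"]](request)
    return call

def coordinator(parts: List[Service]) -> Service:
    """Coordinator: ask several services for parts of the answer and combine them."""
    def call(request: dict) -> dict:
        combined: dict = {}
        for part in parts:
            combined.update(part(request))
        return combined
    return call

# Hypothetical stand-ins for remote services.
def rooms(req: dict) -> dict:
    return {"rooms": ["101", "204"]}

def facilities(req: dict) -> dict:
    return {"facilities": ["pool"]}

def price(req: dict) -> dict:
    return {"price": 120 * len(req.get("rooms", []))}

# The models combine: a dispatcher in front of a coordinator and a pipeline.
hotel_info = coordinator([rooms, facilities])
priced = pipeline([rooms, price])
front_door = dispatcher({"info": hotel_info, "quote": priced})

info_reply = front_door({"kind": "info"})
quote_reply = front_door({"kind": "quote"})
```

The caller only ever invokes front_door, which is the sense in which the composition appears to the client as a single service. In a real deployment each inner call is a network round trip, so the overhead warning in guideline 3 applies at every hop.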

12. Design in testability. By now, you are probably excited and ready to start building Web services that will conquer the world. However, it is important to remember that testing a Web services-based app is a major challenge. Suppose you decided to build an application that invokes Web services provided by several partners. How do you test it? You cannot test against live services (as they might place real orders). Volume testing for scalability becomes even more challenging. It does not work to test against dummy services either, since then you are not testing the actual service.

For this reason, it is best to create a testing environment alongside the production environment. Your partners can then test against the test environment, which does everything exactly like the real service, but with simulated side-effects. Insist that partners provide such an environment to you as well. Given the lack of tools available to help you debug a distributed application based on Web services architecture, it is a good idea to incorporate extensive logging functionality within your services to help with the troubleshooting.
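
One possible shape for such a test environment is a service whose validation and logging are identical in both environments while the side effect is swapped out. The class names, the order fields and the TEST- confirmation format below are hypothetical, and the live backend is deliberately left unimplemented.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("orders")

class LiveBackend:
    """Placeholder for the real system; deliberately not implemented in this sketch."""
    def place_order(self, order: dict) -> str:
        raise NotImplementedError("would contact the real order system")

class SimulatedBackend:
    """Test-environment backend: records orders instead of placing them."""
    def __init__(self):
        self.recorded = []

    def place_order(self, order: dict) -> str:
        self.recorded.append(order)
        return f"TEST-{len(self.recorded)}"

class OrderService:
    """Same validation and logging in both environments; only the side effect differs."""
    def __init__(self, backend):
        self.backend = backend

    def submit(self, order: dict) -> dict:
        log.info("request received: %s", order)
        if order.get("quantity", 0) <= 0:
            log.warning("rejected: quantity must be positive")
            return {"ok": False, "error": "quantity must be positive"}
        confirmation = self.backend.place_order(order)
        log.info("order placed: %s", confirmation)
        return {"ok": True, "confirmation": confirmation}

test_backend = SimulatedBackend()
test_service = OrderService(test_backend)
good = test_service.submit({"sku": "A100", "quantity": 2})
bad = test_service.submit({"sku": "A100", "quantity": 0})
```

Partners exercising test_service go through the real request handling and see realistic replies, yet no order is ever placed. The log lines give both sides a trail to follow when an interaction fails.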

13. Choose your application judiciously. It should be clear by now that the technology is still in its infancy, and it is prudent to test the waters before plunging in head first. The most important use of the technology in the short term is as a framework for the integration of internal enterprise applications, potentially on multiple platforms. This approach is prudent for several reasons:

* It gives you control over the entire Web services app. You do not have to deal with unknown problems cropping up because of bugs in other people's services.
* Security issues are easier to deal with in an intranet environment.
* The benefits of integrating judiciously chosen internal apps can be great in terms of operational efficiencies; the resulting visibility can be very helpful in creating support to fund further projects.

While they do not introduce new concepts, Web services do introduce new possibilities by exploiting standard and vendor-neutral technologies. However, the characteristics of the base technologies require a new way of thinking about applications. Building Web services-based applications places a premium on a solid architectural foundation. Hopefully, these tips will help you to better exploit the technology in your environment.

About the Author

Mayank Prakash is CTO at Accelera Software, a Newton, Mass.-based consulting firm.