Q&A with Kevin Dick: Where is XML headed?

ADT's Jack Vaughan recently went to the source to uncover the path of enterprise XML in years to come. Kevin Dick leads The Middletier Group, is the long-time chair of the XML Web Services One Conference, and is the author of "XML: A Manager's Guide," which is due soon in a new edition.

Q: What has surprised you in the development of XML?
A: Back in 1999, there was the beginning of a split between XML for content and XML for data. Since XML is essentially a profile of SGML, it definitely came out of the content camp, but it was picked up to help solve a number of interoperability problems by people in the data camp.

On the data side, one of the things that people believed in 1999 that turned out to be correct was that XML would be a platform for defining standard data formats that parties could easily use to exchange information. If you look at that today, there's lots of interoperable XML. One of the things that caught me a little bit by surprise was the use of XML as a protocol and methods-wrapping technology at the lower layer. We [also] saw the explosion of SOAP, which caught me a little bit by surprise. I knew it was possible, but I didn't think it was going to be the commercial success it was.

Q: What splits have appeared among XML data handlers?
A: There's something of a split between the programming data camp and the database data camp -- the people who are worried about moving XML data in and out of application programs, and those who are worried about moving XML data in and out of databases. That's one of the reasons why I think XML schema turned out to be so complex, because you had to accommodate the requirements of the type systems and the programming models of both application programs and database programs.

There's also, as you know, sort of a split between the 'low-level guys' and the 'high-level guys' -- those who are worried about business agreements between two parties, and the people who are worried about data agreements. There's a separation between business analysts and programmers. And, as it turns out, in order to get a nice interoperable format between two companies, you have to have these people working together. It doesn't do you much good to have these well-defined business concepts that don't have the appropriate supporting data structures in your interoperability format. And it certainly doesn't help you to have nice interoperable data structures that don't have the appropriate relations to the business concepts you want to talk about.

There's definitely a complete difference in perspective. You sort of see this outgrowth in XML messaging; there's a split, and people have a lot of different terms for capturing the same concept. Some people call it fine-grained vs. coarse-grained messaging; I call it distributed object vs. EDI-type messaging. There is SOAP where you're invoking methods on objects vs. SOAP where you're sending big blocks of business data back and forth.

Unfortunately, those two perspectives have been grossly confused in the design of the XML messaging infrastructure. So they're trying to accommodate both of those perspectives. That's a laudable goal, because it's nice to have the same infrastructure do two jobs -- you only need to learn one thing. But this is why there's a lot of confusion about whether SOAP is replacing CORBA or EDI.

Q: We'd think that, if you were trying to replace CORBA with SOAP, the more things you add on, the less lightweight it would be.
A: This is, in fact, precisely what's happening. What we have right now is Web services that are completely isomorphic to the CORBA infrastructure. We have a wire protocol, we have an interface definition language, and we have a directory query language and a discovery language. Those are, respectively, SOAP for the wire protocol, WSDL for the interface definition language and UDDI for discovery and directory.

Q: Is that bad or good, or just one of those things?
A: It's just one of those things. No matter how hard you try to have a grand design, things always evolve. People started out with SOAP as a very lightweight model -- one of the reasons why SOAP is so good for so many things is because it doesn't actually do very much. It's a structure for doing other things. To dive down to the technical depths for a second, you have the body and the header separation. SOAP in and of itself doesn't define any header tags, or do any security or transactions or any of those things. All it does is allow you to send a message with parameters and a target to another computer. But it has this extensible header where you can sort of proprietarily extend SOAP with transactions, security, quality of service and all of those good things. And so that's made it useful for lots of different problems. But you have to pile on a bunch of extra stuff to actually make it feasible in a large-scale architecture.
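The header and body separation described above can be sketched in a few lines. This is only an illustration, assuming the SOAP 1.1 envelope namespace; the `Credentials` header block, the `CreateCustomer` call and both `urn:example:` namespaces are hypothetical names invented here, not part of SOAP or of any product mentioned in the interview.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SEC_NS = "urn:example:security"  # hypothetical extension namespace (assumption)
APP_NS = "urn:example:orders"    # hypothetical application namespace (assumption)


def build_envelope(token: str, customer_name: str) -> ET.Element:
    """Build an envelope with one extension header and one body call."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")

    # SOAP defines the Header container but not what goes inside it.
    # Security, transactions and the like are added as extension blocks.
    header = ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    ET.SubElement(header, f"{{{SEC_NS}}}Credentials").text = token

    # The Body carries the target operation and its parameters.
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    call = ET.SubElement(body, f"{{{APP_NS}}}CreateCustomer")
    ET.SubElement(call, f"{{{APP_NS}}}Name").text = customer_name
    return envelope


envelope = build_envelope("secret-token", "Acme Corp")
print(ET.tostring(envelope, encoding="unicode"))
```

Everything under `Header` here is application-defined, which is the extensibility being described: the envelope stays the same while the extensions pile up.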

In my experience consulting for companies that are building XML-based architectures, the choice between the two types of messaging -- the distributed object flavor or the EDI flavor -- is a tremendous decision that people have to make, and not all of them realize they have to make it. There's actually quite an impassioned discussion in the distributed computing community over which of these styles is feasible. There are many people who believe that in the large-scale architectures, the distributed object model is inherently unworkable and that we should design all of our systems with more of an EDI bulk message flavor.

Q: That makes room for conflict.
A: Yes, it is very disruptive when you have the thought leaders saying we should use this messaging model, and the people who have the skills in the projects are just sort of naturally inclined toward the distributed object model. From the management perspective, it can be quite a difficult challenge to overcome that tension. And when you see vendors not making clean separations between these two styles of messaging, so that you have a messaging toolkit from a major application server or distributed computing infrastructure vendor that allows you to do either one and doesn't actually make you come out and say which one you are doing, you can end up with the worst of both worlds.

When you're designing and prototyping/building the initial version of a system, the impact of these distinctions is unclear; they make the biggest difference when you're under high load and your scalability requirements start to become crucial. You don't actually see the impact of a decision you have to make very early on in the development cycle until very late in the deployment cycle.

This is the argument that is generally made for using the EDI message-type architecture over distributed objects -- it's far more scalable. I'm not saying it actually is, I'm saying this is generally the argument that's made: That it's asynchronous vs. synchronous, for one, and that you generally have much less interaction between systems in a messaging architecture. You say 'Here's a big block of data, go do something with it' as opposed to 'Create a new customer, give that customer an address, give that customer a name, now start adding line items to their order' and so on.

The distributed model is generally a lot more chatty. And it's a lot more convenient because as a programmer you can take smaller steps in figuring out what the other guy is doing. But then when you start running [this on large-scale systems] -- I'm not sure we have enough large deployments to say this has been confirmed -- but the theory is that it will break down under the high load.
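The "chatty" vs. "big block of data" contrast can be made concrete by counting messages. The sketch below is a toy, with a hypothetical `CountingChannel` standing in for the network and invented operation names; it shows only how the number of interactions differs, and says nothing about whether either style actually scales better.

```python
class CountingChannel:
    """Stand-in for a network link; it only counts the messages sent."""

    def __init__(self):
        self.round_trips = 0

    def send(self, operation, payload):
        self.round_trips += 1
        return {"operation": operation, "payload": payload}


def place_order_fine_grained(channel, order):
    """Distributed-object style: one remote call per small step."""
    channel.send("createCustomer", None)
    channel.send("setName", order["name"])
    channel.send("setAddress", order["address"])
    for item in order["items"]:
        channel.send("addLineItem", item)


def place_order_coarse_grained(channel, order):
    """EDI style: one message carrying the whole block of business data."""
    channel.send("submitOrder", order)


order = {
    "name": "Acme Corp",
    "address": "1 Main St",
    "items": ["widget", "gadget", "gizmo"],
}

chatty = CountingChannel()
place_order_fine_grained(chatty, order)

bulk = CountingChannel()
place_order_coarse_grained(bulk, order)

print(chatty.round_trips, bulk.round_trips)  # prints "6 1"
```

The fine-grained count grows with the number of line items, while the coarse-grained count stays at one message per order.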

Q: I know something of interest to you is orchestration. What is it?
A: The problem that orchestration is attempting to solve is that we can define this nice extensible wire protocol and this very flexible interface definition framework, but how do I, as a programmer, take advantage of all of the possible services that are out there? Instead of programming something that's going to hook it all together, I'm simply going to describe the relationship among the different services and have a complicated [piece of] software interpret that description and make everything work. It's similar to a rules engine in that it's a runtime interpreter. The runtime engine is a piece of software that sits on a server and interprets that description. That's actually the piece of software that takes all the appropriate action.
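As a rough illustration of "describe the relationship and let a runtime interpret it," the sketch below reads a small XML description and calls services in the order it lists. The `<flow>` and `<step>` vocabulary, the three service functions and the `SERVICES` registry are all hypothetical; they are not taken from any orchestration standard or product.

```python
import xml.etree.ElementTree as ET

# Hypothetical flow description: the data that drives the runtime.
FLOW = """<flow>
  <step service="check_credit" />
  <step service="reserve_stock" />
  <step service="ship" />
</flow>"""


# Stand-ins for remote services; real ones would be invoked over the wire.
def check_credit(order):
    return {**order, "credit_ok": True}


def reserve_stock(order):
    return {**order, "reserved": True}


def ship(order):
    return {**order, "shipped": True}


SERVICES = {
    "check_credit": check_credit,
    "reserve_stock": reserve_stock,
    "ship": ship,
}


def run_flow(flow_xml, order):
    """Interpret the description: call each named service in document order."""
    for step in ET.fromstring(flow_xml).findall("step"):
        order = SERVICES[step.get("service")](order)
    return order


result = run_flow(FLOW, {"id": 42})
print(result)
```

Changing the behavior means editing the XML description, not the `run_flow` interpreter, which is the point of the rules-engine comparison.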

Q: How good have vendors been at adhering to standards in XML?
A: There's good news and bad news here with regard to vendors and standards. The good news is that almost all of the vendors have been surprisingly good citizens compared to behavior with regard to previous standards efforts like CORBA. Even people like Microsoft, who seem to have a reputation for wanting to define their own standards, have been really good citizens in working together on some of the basic standards. I think that within the next year or so you are going to have a lot of highly productive packages of software and development tools from many vendors that are pretty much interoperable. Which is a great step forward. And a lot of people have put their disagreements aside or at least agreed to a fair fight in the standards bodies and to abide by the decisions of the standards bodies. So that's the good news. We are in a much better situation than we have been in the past.

Now, the bad news is that this has created a temptation for vendors to create more standards, whether we necessarily need them or not. And this isn't a malicious or underhanded thing, it is just that vendors naturally want to advance the 'state of the art' because innovation is their value-add to the whole information technology equation. They can hire people, 'innovate' and then they want to reap some of the economic benefits of that innovation. But for the people who actually have to deploy these things in enterprises, they would like each innovation to clear some minimal bar of advancement over the last generation of technology. Any new technology is disruptive to development and deployment; and as the Internet infrastructure of both hardware and software gets older, making changes to it is more and more likely to cause problems.

Q: What do you see ahead for native XML data stores as opposed to relying on relational methods?
A: This is yet another wonderful point of confusion for enterprises to deal with. This was actually one of the parts of the second edition of my book that ended up being substantially different. I basically categorize XML data into three categories. The first is interchange data. You and I want to send data back and forth to each other and I have this mapping layer that takes various parts of my relational database and then sends it to you, and your mapping layer maps it into your relational database. Well, obviously, the whole goal of interchange is to get data from somebody else's format into a format you already understand. In the case of interchange data, you are going to store XML in a relational database or a hierarchical database, whatever your back-end system uses.
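The two mapping layers described for interchange data might look something like the sketch below, which uses in-memory SQLite databases as stand-ins for each party's relational store. The table names, column names and the `<customers>` document format are assumptions made up for the example.

```python
import sqlite3
import xml.etree.ElementTree as ET


def export_customers(conn):
    """Sender's mapping layer: relational rows -> interchange XML."""
    root = ET.Element("customers")
    for cust_id, name in conn.execute("SELECT id, name FROM customer ORDER BY id"):
        node = ET.SubElement(root, "customer", id=str(cust_id))
        ET.SubElement(node, "name").text = name
    return ET.tostring(root, encoding="unicode")


def import_customers(conn, document):
    """Receiver's mapping layer: interchange XML -> its own schema."""
    for node in ET.fromstring(document).findall("customer"):
        conn.execute(
            "INSERT INTO client (client_id, full_name) VALUES (?, ?)",
            (int(node.get("id")), node.findtext("name")),
        )


# Each party keeps its own schema; only the XML format is shared.
sender = sqlite3.connect(":memory:")
sender.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
sender.executemany(
    "INSERT INTO customer VALUES (?, ?)", [(1, "Acme Corp"), (2, "Globex")]
)

receiver = sqlite3.connect(":memory:")
receiver.execute("CREATE TABLE client (client_id INTEGER PRIMARY KEY, full_name TEXT)")

document = export_customers(sender)
import_customers(receiver, document)
rows = receiver.execute(
    "SELECT client_id, full_name FROM client ORDER BY client_id"
).fetchall()
print(rows)
```

The XML exists only in transit here; once imported, the data lives in the receiver's existing back-end format, which matches the interchange category as described.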

The second category of data is content. We should treat a lot of the data as the content, right? This is information for data sheets and Web sites, and all of that is going to go into a content management system. That is what content management systems are for.

Now there is a final category of XML data -- what I would call operational XML data -- where you are actually using XML data in its native form to drive the behavior or execution of a system. Orchestration is a wonderful example of this. If I have a definition in XML of how this thing should behave, I've got to be able to get to that piece of data very fast. XSLT, to a certain extent, meets this definition as well. Or if I have a purchase order for an amount, and it is going to start kicking off authorization procedures and maybe some manufacturing, I'd better make sure I have that document as-is, so there is no confusion over what was sent.

So in cases where XML is the operational data, this is where you have a real choice. In the other cases, the best rule is to never choose anything other than a well-established infrastructure for doing those things. As far as things go for storing [XML] operational data, you've actually got a lot of choices. You could use a native XML store, or an object database with an XML layer on top of it because they are optimized in similar ways to the native XML stores. Or you could use a highly optimized subsystem of a relational database for dealing with precisely this type of data.