In-Depth

XML meets the data warehouse

On first view, the marriage of data warehousing and Web services seems somewhat suspect -- like a Hollywood marriage that is not given much chance to last. Data warehouses are big and mission-critical; they are sturdy, almost heavy. Web services are more ephemeral, almost like applets; they are lightweight and loosely coupled. Thus, this marriage may be a mismatch. One, some would say, made in Market Hype Heaven.

But useful data management advances may come from this match. There may be something to the application of Web service techniques to BI problems, say several industry insiders. But, some suggest, it might be wise not to expect Web services to change this area overnight.

Coming to terms
What are Web services? They are still as much high-concept as they are technical reality. Underlying Web services is XML, a descriptive means of handling data that has gained wide industry support. When IBM, Microsoft and others agreed upon SOAP in 1999, the basis for an XML-based, industry-standard communication mechanism was put in place.

SOAP is a simple pipe, and its authors left room for enhancements. SOAP may or may not include an RPC. But it stands as an alternative to the tightly coupled computing protocols of the past. Moreover, it is posed as easier to program. XML is somewhat more readily understandable than lower-level languages. Closely associated with SOAP are WSDL and UDDI.
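For readers who have not seen one, a SOAP message is little more than an XML envelope wrapped around a body. The sketch below, in Python using only the standard library, builds an RPC-style request and reads it back. The operation name GetSalesByRegion, its parameters and the urn:example:warehouse namespace are invented for illustration; only the envelope namespace comes from SOAP 1.1.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
APP_NS = "urn:example:warehouse"  # hypothetical application namespace


def build_soap_request(operation, params):
    """Wrap an RPC-style call in a SOAP 1.1 envelope and return it as text."""
    ET.register_namespace("soap", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # The first child of Body names the operation; its children are the arguments.
    call = ET.SubElement(body, f"{{{APP_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, f"{{{APP_NS}}}{name}").text = str(value)
    return ET.tostring(envelope, encoding="unicode")


def parse_soap_request(xml_text):
    """Return (operation, params) from an envelope built by build_soap_request."""
    root = ET.fromstring(xml_text)
    call = root.find(f"{{{SOAP_NS}}}Body")[0]
    operation = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    params = {child.tag.split("}", 1)[1]: child.text for child in call}
    return operation, params
```

Calling build_soap_request("GetSalesByRegion", {"region": "Midwest", "year": 2002}) yields a text message that any XML-aware receiver can unpack. Note that every value arrives as a string, which is part of what makes the coupling loose.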

While easier integration is more than most application development managers can ever hope for, there is more to the Web services story. The idea of light, unfettered apps and data marts has infiltrated the world of business intelligence, data analytics and data warehousing.

Meanwhile, Microsoft and others have painted a picture of small, general apps that tap into a Web services architecture as needed. In data warehousing, this could lead to, for example, a calculating formula engine that slices data for analysis, or even pre-packaged multidimensional data cubes available on demand and on the Internet.

Some envision ''supply chains'' of data interchange. This can happen now, for example, between K-Mart and Procter & Gamble. The P&G soap product line manager can get a feed on how a line of merchandise is doing in the Midwest, look at in-house inventory reads and cross-reference this to demographic databases, all to formulate a promotional campaign to run at K-Mart's Midwest stores. With Web services, theoretically, the data will come to the P&G power user's desktop with less need for drastic transformation; and, again, theoretically, a third-party demographic database or analytical algorithm could be briefly rented and immediately applied to the K-Mart store reports.

It happened one night
Why will this not happen overnight? XML may be a great lingua franca for integration, but the industry will not suddenly move all of its data into XML data stagers or integrating data hubs.

Only when a number of these are built will IT shops be able to gauge the ''speed hit'' that can come with XML's descriptive (''verbose,'' some would say) prowess. Already, makers of novel XML-specific processing engines are readying hardware accelerators to push XML integration more effortlessly over speed bumps.

Speed gained here counts only to help system throughput on the middle tier and back end. What about the client side?

Analytic tool vendors have spent years creating and touting HTML, then JavaScript, then Java and DHTML interfaces for clients to do data analysis. They have tried to create intuitive data navigation schemes, and to run end-around plays to avoid cumbersome trips to the server and data sources.

The dirty little secret of the industry may be that power users want a robust client and a tight connection to data. If Web services in analytics succeed, new types of user interfaces may be needed.

Deeper, more specific standards must be built, beyond SOAP, for Web services to flourish. Underway are data analysis-oriented initiatives like XMLA and JOLAP, as well as many others, including vertical industry-specific formats.

The challenges to applying Web services within firewalls to data integration problems are one thing. Creating chains of services across the Web is another, say industry players close to this issue. UDDI, the ''yellow pages''-style directory for finding services, is still in its infancy. The special requirements of data warehouses must be met before XML and business intelligence truly become a marriage. Assuring the safe transmission of services and creating acceptable mechanisms for paying for services will take time.

These challenges are likely to temper unbridled enthusiasm for Web services in warehousing, not to dash it. This year should be significant. XMLA proponents are preparing to engage in public ''bake-offs'' that show interoperability of this protocol.

Why a warehouse?
What is the purpose of data warehousing today? ''To provide a unified view of something like customer information,'' said Aaron Zornes, executive VP, app delivery strategies at Meta Group. ''Today, it is mostly internal. And they use fixed tools going against a subset of a data warehouse.

''At some point, we have to decouple those tools,'' added Zornes. ''People don't want to hard-code proprietary tools to networks that hook into information from all over the world, or even internally. They want to hook in flexibly on the fly.''

Issues to overcome include XML throughput. ''Some say there's too much overhead for some 'real-time' data transformations,'' noted Zornes. ''But for batch transforms after an initial [intersystem] handshake, it may be adequate. If we are looking for information on a single customer, and aggregating and feeding back [to an analytical cube], that's a tremendous amount of overhead,'' he said.

''We are going to have to have a loosely coupled connection,'' he continued, ''so servers can recognize each other on the fly, rather than be hard-wired, and without having the [user interface] software installed on the computer.''

At the line of scrimmage
Among the data warehousing vendors to talk up the idea of Web services early last year was Ascential. ''We were the first data integration vendor to aggressively push a vision around Web services, and how it fits with data warehouse and operational data mart initiatives,'' said Bob Zurek, Ascential VP of advanced technology.

Ascential's plan to embed XML capabilities in its products does not force a customer to use those capabilities, Zurek said. Products are expected soon. ''It's straightforward: We're providing the functionality as a basic feature. Those who will use it are toward the innovative side,'' he said.

How will Ascential support these new capabilities? we asked.

''Today, if you use our product, you're reaching into all sorts of data silos. These could be relational databases or flat files,'' said Zurek. ''They've done some cleansing and then [staged it] for transform. For us, it doesn't matter what it is. We let you expose that process or job as a Web service. We have in the lab dozens of data sources exposed as Web services. You can expose your data to anything that can consume a Web service.''

It is still early, and views on how to constitute Web services vary. Vendors preceding or following Ascential with Web service-related data integration, data warehouse or BI announcements are many -- Actuate, Brio, Business Objects, ClearForest, Cognos, Crystal Decisions, Dimensional Insight, Hummingbird, Hyperion, IBI, IBM, Informatica, Microsoft, MicroStrategy, NCR, Oracle, Sagent, SAS Institute and Sybase among others.

''iWay uses Web services as a new kind of interface to access business processes,'' said Jake Freivald, director of marketing at iWay Software, an IBI unit.

''We provide Web services as part of our XML transformation engine,'' he said. And the iWay engine (which evolved from EDA/SQL) can support extensions like ebXML because of its support of Web services. ''We can handle ebXML using Web services,'' Freivald added.

Meanwhile, Actuate supports Web services as part of Actuate Version 6. ''We come from the application development space, unlike the data warehouse and OLAP [vendors],'' said Nobby Akiha, Actuate's VP of marketing. ''We were able to build a SOAP-based API from scratch. This lets us leverage existing technology. Web services expose 100% of the report server,'' he said, which eases adding new reports, scheduling reports and conducting administrative functions.

In 2000, Sagent created a Web services layer on top of the Sagent platform that can use SOAP messaging for tasks like adding and deleting users across multiple platforms and sites.

For its part, Cognos last summer announced Cognos Web services, supporting XML, SOAP and WSDL to build connections between the Cognos suite and multiple apps. Meanwhile, Business Objects incorporated Web services into the BusinessObjects Developer Suite.

A clearer picture
''At first I wasn't convinced there was any truly useful application for Web services in business intelligence or data warehousing,'' admitted Sanju Bansal, vice chairman and COO, MicroStrategy. ''But in the last year to 18 months, as I visit clients, I'm getting a clear picture of how they want to use it with their BI systems.''

Bansal described a scenario in which a retailer's sales data is made available to suppliers, and Web service calls are used to interrogate the retailer's database using the supplier's own pricing algorithms.

People want historical data warehouse and up-to-the-minute operational data feeding into pricing applications, he said.

''But nobody wants to rebuild a pricing application to make the data warehouse happy,'' chided Bansal.

''People want to connect applications [within organizations], and they want the connection to be automatic. My prediction: That is where the real value of Web services applications will be,'' he said. These in-house applications will be more valuable, he suggested, than even the sizable advances Web services might make in supply-chain partner connections.

Bansal noted that his firm moved forward on the Web services front this year with MicroStrategy Web Universal. The product marks the company's foray into Unix. It runs on J2EE servers, including BEA's WebLogic Server. Web Universal APIs are exposed and accessible via XML.

Although upbeat on Web services, Bansal is less than a champion of present XML-oriented analytical standard proposals. '''XML for OLAP' is not taking off in the industry,'' he asserted.

Moreover, he said, ''JOLAP has not caught on.'' A key standard, he said, is a de facto standard: Microsoft's Multidimensional Expressions (MDX), which can be enhanced with recent XML-oriented add-ons to Microsoft's SQL database.

The elite and the masses
As with many advances in Web-enabling data analytics, Web services may broaden the casual user base of such tools, without particularly easing the task of analysts who work with such tools every day.

''The experts are always going to say 'It's not as powerful, it's not as fast,''' said Ragnar Edholm, director of Essbase tools at Hyperion. ''Those are the power users.

''But you should also listen to the people that never tried to do this before,'' added Edholm. A major driver for both XML and Web services in data integration apps is the accelerating effort of relational database vendors to add better XML data handling traits to their products.

''I think vendors are moving faster than users,'' noted Edholm.

Edholm pointed to the OMG's Common Warehouse Metamodel (CWM) as a useful XML-oriented standard for data integration. In fact, the JOLAP API criticized by MicroStrategy's Bansal and others relies on CWM to some extent for OLAP service interfacing.

JOLAP is a significant continuation in the definition of industry standards for software interoperability, according to John Kopcke, CTO at Hyperion, which has taken a leading role in creating JOLAP.

Hyperion has also been active in the effort to create XMLA, which was recently published as an updated spec and API standard for vendors to access multidimensional databases as a Web service. Other council members are Microsoft, Crystal Decisions, SAP AG and Silvon Software.
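To make the idea concrete, an XMLA call is a SOAP message whose body carries an Execute element holding a query statement and a list of properties. The Python sketch below builds such a request without sending it. The cube name, catalog, data source and the MDX query itself are hypothetical stand-ins; the element names and the XMLA namespace follow the published spec as best we understand it, and a real provider may require additional properties.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
XMLA_NS = "urn:schemas-microsoft-com:xml-analysis"     # XML for Analysis namespace


def build_xmla_execute(statement, catalog, data_source):
    """Return an XMLA Execute request wrapped in a SOAP envelope, as text."""
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("xmla", XMLA_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    execute = ET.SubElement(body, f"{{{XMLA_NS}}}Execute")

    # Command/Statement carries the query text (MDX in this sketch).
    command = ET.SubElement(execute, f"{{{XMLA_NS}}}Command")
    ET.SubElement(command, f"{{{XMLA_NS}}}Statement").text = statement

    # Properties/PropertyList tells the provider where and how to run it.
    properties = ET.SubElement(execute, f"{{{XMLA_NS}}}Properties")
    prop_list = ET.SubElement(properties, f"{{{XMLA_NS}}}PropertyList")
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}DataSourceInfo").text = data_source
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Catalog").text = catalog
    ET.SubElement(prop_list, f"{{{XMLA_NS}}}Format").text = "Multidimensional"
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical query against a hypothetical [Sales] cube.
EXAMPLE_MDX = (
    "SELECT {[Measures].[Unit Sales]} ON COLUMNS, "
    "{[Store].[Region].Members} ON ROWS FROM [Sales]"
)
```

A client would POST the result of build_xmla_execute(EXAMPLE_MDX, "ExampleCatalog", "ExampleSource") to the provider's HTTP endpoint. That the same envelope could, in principle, go to any vendor's server is the interoperability the planned bake-offs are meant to demonstrate.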

Overhead, overheard
Also active on XMLA standards is SAS Institute. ''We view it as having great promise,'' said Jim Metcalf, the firm's director of foundation technology strategy, while noting recent progress in preliminary interoperability tests between XMLA-enabled systems. ''We are watching JOLAP unfold,'' he said.

''Web services hold promise, as an 'interoperability' standard,'' said Metcalf, emphasizing that one should put ''interoperability'' in quotes.

''It's a fancy API and, if everyone can agree, we'll have interapplication communication over the Internet. We still have a long way to go,'' said Metcalf.

''Before Web services can really be effectively used, especially outside the firewall, we have to get a robust security model in place,'' explained Metcalf.

XML overhead may be an issue, he said. Also an issue: How people package their services. ''It's not binary. It's 'text strings' with lots of markup,'' he noted. Things could tend to break, Metcalf indicated, ''if people stretch it to its limits.''
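The markup overhead Metcalf describes is easy to see in miniature. The following Python sketch encodes the same rows twice, once as XML text and once as fixed-width binary records, so the byte counts can be compared. The row layout (store ID, units, revenue), the element names and the synthetic data are all made up for illustration; real feeds, and any compression applied to them, would change the numbers.

```python
import struct
import xml.etree.ElementTree as ET


def sample_rows(n):
    """Generate n synthetic (store_id, units, revenue) rows; not real data."""
    return [(i, (i * 7) % 500, round(i * 1.25, 2)) for i in range(1, n + 1)]


def rows_to_xml(rows):
    """Encode rows as XML text, one <row> element per record."""
    root = ET.Element("rows")
    for store_id, units, revenue in rows:
        row = ET.SubElement(root, "row")
        ET.SubElement(row, "storeId").text = str(store_id)
        ET.SubElement(row, "units").text = str(units)
        ET.SubElement(row, "revenue").text = f"{revenue:.2f}"
    return ET.tostring(root, encoding="utf-8")


def rows_to_binary(rows):
    """Encode rows as packed records: two 32-bit ints and one 64-bit float each."""
    return b"".join(struct.pack("<iid", *row) for row in rows)


def overhead_ratio(rows):
    """Return XML size divided by binary size for the same rows."""
    return len(rows_to_xml(rows)) / len(rows_to_binary(rows))
```

Running overhead_ratio(sample_rows(1000)) shows how many times larger the marked-up form is for this toy layout. The gap comes mostly from the repeated tag names, which is why batch transfers tolerate it better than chatty, row-at-a-time exchanges.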

The history of computing, he added, is one of people chasing bottlenecks. And when they fix one, another one will pop up elsewhere.

Warehouse insulation
''Vendors will use Web services as an interface,'' said Wayne Eckerson, director of education and research, The Data Warehousing Institute.

''From the vendor's viewpoint, the promise is that we will have true Internet-based services that people can dial into; and people will be able to find those services from a variety of folks in a variety of categories as easily as in the [telephone] Yellow Pages,'' said Eckerson.

Some will be fee-based, others will be free. Some will reside on extranets, some will be internal. ''We won't get there for a long time because, although those services exist today, they are now delivered in proprietary format,'' said Eckerson.

Firms have already made progress in moving toward XML-enabled data architectures, he noted.

''I've seen customers use it to insulate layers of their BI architecture so they can plug-and-play components or swap out components as necessary,'' he said.

But, Eckerson added, people will be very cautious about opening up their systems to the world. Overall, standards bear watching, Eckerson said, noting that XMLA was definitely becoming a reality.

It may be said that SOAP -- the original driver of the notion of XML-based Web services -- obtained its simplicity by overlooking niggling details. But the idea has a hold on many in the data warehousing segment. Filling in the details, however, will be an ongoing effort. Stay tuned.

Includes reporting by Mike Bucken

See the representative product listing ''Snapshot of BI tools supporting Web services''