In-Depth

Meta is the word

The basic definition of meta data is simple -- 'Data about data.' With definitions out of the way, however, the subject quickly becomes more complex. The reason: Meta data means different things to different people in different industries. In particular, the meta data view of the object developer is quite different from that of the data warehouse builder. So a 'unified,' all-encompassing meta data format, sometimes seemingly the goal of meta data repository standards, is probably not going to appear any time soon, say experienced industry practitioners.

Still, standards would help, and the data warehouse segment may have stolen a march in this regard. It is a best-of-breed business in which standard models are increasingly critical. Traditional data expert Oracle Corp. is working on such a standard, with something of an eye toward broader meta data standard solutions as well. Meanwhile, the meta data standard created by powerful industry force and relative data upstart Microsoft Corp. seems further ahead in the standards race. Just last month, the Open Information Model (OIM) based on Microsoft technology was accepted as a standard by the Meta Data Coalition (MDC), a standards coalition of software and services companies.

Microsoft's repository efforts actually go back a number of years. The company initially worked with Texas Instruments Software (now part of Sterling Software) and other software players to define its repository. More recently, it has worked closely with Rational Software and Platinum Technology (now part of Computer Associates) to complete the work. As Microsoft has increased its presence in the database business, and more recently launched an online analytical processing (OLAP) product line, its repository efforts have come to include OIM. "Microsoft went after the architecture," said David Marco, president of Enterprise Warehousing Solutions Inc., Palos Hills, Ill. "Architecture is tough. But they brought in all the data warehouse majors. Now the Meta Data Coalition owns those Microsoft OIM models," he added. In a way, OIM works as the model store, while the Microsoft repository acts as the meta data engine.

Of course, users and viewers alike take standards efforts with a grain of salt. While a meta data standard may yet emerge as a universal model embraced by all software developers everywhere, you may be hard-pressed to find true believers among the practitioners getting their hands dirty building meta data repositories.

Like other technologies, meta data repository plans are influenced by the emergent eXtensible Markup Language (XML) standard for data description. Besides representing data types in ways specific to specific industries, XML may also have use as a meta data interchange format between distinct meta data repositories. The potential for distributed architectures, rather than centralized repositories with their spotty record, is enough to give some pause.

In fact, the specification Microsoft and its partners developed through the Meta Data Coalition, and the competing and as-yet unpublished data warehousing standard created by Oracle and partners including IBM, still seem pretty academic to developers of meta data repositories. For Microsoft, it has become something of a competitive issue. The SQL Server database has been a primary part of Microsoft's push for continued growth. The market target is therefore relational leader Oracle. One niche that Microsoft has targeted is the growing area of OLAP and data warehousing. As part of its push, the company went after the data architecture -- or infrastructure -- which led to an effort with partners and the Meta Data Coalition to develop a standard way of dealing with meta data.

The view that Microsoft is suddenly far ahead of the competition may be tempered by advisories from Meta Group analyst (and OLAP and warehouse maven) Aaron Zornes, who recently said that many users are still waiting to see what OIM really is. Computer Associates, said Zornes, has also yet to give a clear account of the status of the Platinum Technology repository (an important element of the Microsoft repository). "Is OIM ready for prime time? Have people deployed it?" he asks. "It's the best effort to date," said Zornes, answering question one. "I haven't seen it deployed yet," he added, answering question two.

Does meta matter?

Why bother with meta data at all? Without a meta data standard, developers are forced to write interfaces between best-of-breed tools available in larger database, data warehousing and data mart markets. With one standard, this time-consuming task goes away. When you talk to developers who are actually building meta data repositories, the skepticism about standards borders on cynicism. They have heard big announcements of universal standards before and have seen them evaporate. Some developers of meta data repositories hold out hope for the eventual industry-wide adoption of one of two competing standards being marketed by Microsoft and Oracle, but there is a bit of disparagement.

While some industry analysts suggest the Y2K problem has convinced the developer community of the need for standard models for applications and databases -- in Y2K remediation, much time was lost figuring out what a corporation's data was about -- one expert pronounced a pox on both the Microsoft and Oracle houses, noting "We'll be fixing the Year 3000 problem before there is a meta data standard."

Other impressions are more measured, but still laced with cynicism. "My feeling on the standards is a little cynical," admits Brenda Moncla, senior national data warehouse consultant at Chicago-based Sysix Technologies. "My cynicism comes from being a practitioner in this environment for a very long time, and having worked for a software product company. I understand the claims, politics, vendor alliances and things like that. And I think what we tend to overlook is how big and how complex this problem is.

"One of the big issues is that all meta data is not created equal," she added. "Therefore, it is very difficult to trace an element from our operational environment through our DSS [decision support system] environment using meta data in an easy and intuitive fashion."

Moncla says vendors of meta data tools have not yet gotten their acts together. "Then we come to the tool vendors and products that are actually considered meta data tools," she said. "We are talking typically about repositories and meta models and mechanisms for moving data in and out. Do we have a standard for the manner for which we store meta data, or the form and format in which we store meta data? Do we have a standard for the form and format in which we interchange meta data? Those are two different philosophies and we've seen different vendors adopt those philosophies over some period of time. They've either said, 'Well, gosh, we all need to strive to make the meta data fit a particular physical model,' or [other vendors say], 'It doesn't matter what kind of physical model you store it in, as long as we can all speak English when we exchange it.'"

This can be rephrased in a question: Should developers work with a version of the meta data models Microsoft and Oracle advocate, or will it be enough just to accept new-on-the-scene XML as an appropriate lingua franca? Some observers refer to XML as the ASCII of data.

Moncla is of the opinion that when you get to the nitty gritty of a client wanting to sell products over the Internet, the meta data model must be pretty specific as to what users need. Among developers, Moncla is not alone in her skepticism about a grand, unified field theory for meta data.

"I've found that meta data is different depending on which client you go to," said Russ Burry, who works in the data warehousing practice of Keane Inc., a Boston-based consulting firm.

This view is shared by John Spiers, formerly of One Meaning, a meta data integration tools company acquired by Oracle last year. Spiers now works in the EAI business unit of Forté Software Inc., Oakland, Calif. He believes that rather than adopting a grand, unified theory, specific meta data standards will emerge to meet vertical industry needs.

"I would say the world of meta data is much more toward defining limited vertical market sets as opposed to a much greater framework for modeling everything that I could ever encounter in my environment," he said.

Spiers sees the real action in standards coming in the form of XML Document Type Definitions (DTDs), which basically define shared data structures to be used across applications and organizations. Specific industries, such as healthcare and telecommunications, are setting standard DTDs so that organizations can exchange data within their specific industry. This movement is motivated by the growing trend toward moving business-to-business transactions to the Internet.

Taking an even more micro view of a meta data standard, Spiers believes XML may become a universal language for data integration. "As people look to integrate loosely coupled systems, they seem to be looking increasingly toward XML to provide the shared data representation," said Spiers. "What XML promises to do is to become the mechanism for sharing information between loosely coupled applications over Web-styled networks. And the important thing about XML is that it is basically a meta language. It's a meta language in which you can define any data structure you want, and within which you can represent both data and meta data."
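Spiers' point that one XML document can carry both data and meta data can be made concrete with a small sketch. The Python below is illustrative only: the element names (table, metadata, column, data, row) and the sample customer table are assumptions invented for this example, not part of OIM, XMI or any industry DTD.

```python
import xml.etree.ElementTree as ET


def to_xml(table_name, columns, rows):
    """Serialize column definitions (meta data) and rows (data) into one XML string.

    Column names double as element names here, so they must be valid XML names.
    The layout is a hypothetical one chosen for illustration.
    """
    root = ET.Element("table", name=table_name)
    meta = ET.SubElement(root, "metadata")
    for col_name, col_type in columns:
        ET.SubElement(meta, "column", name=col_name, type=col_type)
    data = ET.SubElement(root, "data")
    for row in rows:
        row_el = ET.SubElement(data, "row")
        for (col_name, _), value in zip(columns, row):
            ET.SubElement(row_el, col_name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def from_xml(document):
    """Read the same document back, using its own meta data to find the values."""
    root = ET.fromstring(document)
    columns = [(c.get("name"), c.get("type")) for c in root.find("metadata")]
    rows = [[row.findtext(name) for name, _ in columns] for row in root.find("data")]
    return root.get("name"), columns, rows


doc = to_xml(
    "customer",
    [("id", "integer"), ("name", "string")],
    [[1, "Acme"], [2, "Globex"]],
)
name, columns, rows = from_xml(doc)
```

The receiving application needs no prior knowledge of the table: it reads the column definitions first and then uses them to interpret the rows. Values come back as text, so a real exchange would still need agreement on how to convert the declared types.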

Up and humming

Bucking the skepticism and cynicism surrounding the development of a universally accepted meta data model is Enterprise Warehousing Solutions' Marco. While agreeing that there is currently no standard model for meta data repositories and the applications that use them, he sees one coming -- and soon.

"Certainly within a year, I think maybe closer to six months, you're going to see some repositories up and humming on the Microsoft/Meta Data Coalition standard," predicts Marco. One of the driving forces behind this coming standardization is the productivity gains that overtasked IT departments stand to realize, said Marco, who is himself in the business of building meta data repositories.

As Marco sees it: "I could just plug into this standard model. It's just two interfaces for me to write: one to send data to the model, [and] one for me to pull the information I need. You have gone from a hundred interfaces at the largest end, to two. From a software development standpoint, it is a great idea."
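Marco's arithmetic can be checked with a rough sketch. The Python below assumes each direct link between two tools needs two one-way interfaces (one to send, one to receive); the figure of 50 partner tools is an assumption chosen only to reproduce his "hundred interfaces," and the function names are hypothetical.

```python
def interfaces_for_one_tool(partner_tools, via_standard):
    """Interfaces a single tool vendor must write.

    Point-to-point: one send and one receive interface per partner tool.
    Standard model: one interface to send to the model, one to pull from it.
    """
    if via_standard:
        return 2
    return 2 * partner_tools


def interfaces_across_market(tools, via_standard):
    """Total one-way interfaces if every tool must exchange meta data with every other."""
    if via_standard:
        return 2 * tools          # each tool writes its own send and pull
    return tools * (tools - 1)    # every ordered pair of tools needs its own interface


one_vendor_direct = interfaces_for_one_tool(50, via_standard=False)    # 100
one_vendor_standard = interfaces_for_one_tool(50, via_standard=True)   # 2
market_direct = interfaces_across_market(10, via_standard=False)       # 90
market_standard = interfaces_across_market(10, via_standard=True)      # 20
```

The point-to-point total grows with the square of the number of tools, while the standard-model total grows only linearly, which is why the savings look larger the more best-of-breed tools a shop runs.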

He also sees it as a positive for those clients who need meta data repositories, but want to continue to benefit from having a best-of-breed market for tools. Rather than creating a Tower of Babel, all of the available tools could work together as long as software vendors embrace the standard.

IT departments in major corporations would also gain; if meta data standards are adhered to in applications development, it would be easier for new hires in the glass house to maintain these systems. It would also eliminate some of the "quirkiness" built into many legacy COBOL applications that has come back to haunt Y2K teams as they sort out how a long dead or retired programmer defined dates.

Marco believes the enormous cost of fixing Y2K has been a wakeup call to many organizations that now realize that had data standards been in place in 1970, the century date change problem would have been easier to fix.

A meta data standard would therefore have a lot of advantages, on a number of levels. And yet, there still is no accepted standard. The major obstacle now to the adoption of a standard, as Marco views it, is yet another grudge match between Bill Gates and Larry Ellison.

As Marco sees the competition in the real world, the battle over meta data standards is partially due to the Microsoft strategy of moving into and taking over markets it does not yet own. Microsoft wants SQL Server to be the dominant enterprise database. But in Marco's view, SQL Server is not dominant either technologically or in market share. Oracle has the better database for enterprise applications. But examples of Microsoft succeeding in new markets are plentiful.

Microsoft is enticing IT departments to buy SQL Server by offering a free meta data repository based on the OIM. Since people like getting things for free, it is a marketing strategy that could worry Oracle. Or as Marco whimsically puts it: "You're Oracle and you're the 600-pound gorilla in the database world. But you look up and see that the 1,000-pound gorilla has just entered your jungle."

For many database developers and administrators, implementing repositories is too daunting a task today. While the Microsoft repository may be viewed as simplified by some, its use of some object-oriented development methods (read: COM) and the Unified Modeling Language (UML) can make it less than intuitive to the uninitiated.

There are shortcomings to the Microsoft standard, Marco said, but he predicts it will begin to be used in meta data repository development within the next six months. He notes that COM is being removed from repository models, though not from the Microsoft Repository. In fact, COM is not all that familiar to a lot of application developers, much less database developers.

"I can talk to you all day about what the Microsoft repository doesn't do because I have a product to look at," Marco said. "With Oracle, I don't have a product to look at. Certainly, both companies have a ways to go. But the Microsoft standard has had more input from everybody."

Playing catch up

In order to catch up, Oracle has acquired One Meaning, a meta data tools and standards technology company. It then partnered with IBM and began working on a standard it plans to offer the Object Management Group (OMG) in September.

Marco leans toward the Microsoft OIM standard, because while he does not believe very many, if any, developers are using it at the moment, it is available with SQL Server.

The reason you can find fault with OIM, he said, is that it is published and available on the Meta Data Coalition Web site. The Oracle model, on the other hand, is not yet published and the tools that will incorporate it are still in what Oracle calls internal beta testing.

Understanding the status of the Oracle repository effort requires some background. In mid-1998, Oracle outlined its future meta data repository strategy. Details and timetables are still awaited at the time of this writing. Early in 1999, Oracle joined with IBM and Unisys to promote the XML Meta Data Interchange (XMI) format within the Object Management Group. At the time, XMI's purpose was described as providing a vendor- and middleware-neutral open interface format for meta data in distributed environments. The plan was to start with modeling and programming meta data and to expand to data warehouse, component and other data types. Some viewers suggest that Oracle's efforts in the data warehouse realm -- which may use the banner of the Common Warehouse Metadata (CWM) model -- have been complicated by its efforts to support a more encompassing meta data effort. Clearly, Oracle wants to expand its XMI efforts into data warehouse formats.

What's next?

But what happens when Oracle's model becomes available this fall or early next year? Does the industry end up with two incompatible models, and a prolonged marketing war between Microsoft and Oracle with the major casualty being the hope for a standard model? Oracle's view, as voiced by Michael Howard, the company's vice president of data warehousing, is that the Oracle CWM model will emerge from the war as the industry standard. He said Oracle has been able to leverage its dominant position in the enterprise database market to develop a standard based on input from its large customer base. Thus, CWM is designed to meet the real world needs of corporations.

Howard argues that Microsoft had limited access to information from both tool vendors and corporations because SQL Server is not an enterprise solution and nobody trusts Microsoft enough to share much information with them.

In Oracle's scenario, Microsoft will eventually build a bridge from OIM to CWM in the same way that it had to make some accommodation between ActiveX and Java.

However, Enterprise Warehousing Solutions' Marco is not so sure it won't be the other way around. "Microsoft has done some very good work in developing their standard with input from all its software vendor partners," he said.

It cannot be said that the two camps are not talking to one another. In April, the Meta Data Coalition and the Object Management Group announced a cooperative effort to develop meta data standards. They describe this as a "formal technical liaison." So the notion of détente, as opposed to wrestling, seems proper. Marco, for one, said détente may work. Rather than see the two software industry gorillas stage a Wrestlemania over meta data standards, Marco voices hope that Microsoft and Oracle can reach an accommodation. Could, perhaps, the best of the two models be merged into one grand, unified standard?