In-Depth

The changing face of repositories

Emerging industry standards are making distributed repositories a reality; the failed centralized model gives way to a more realistic strategy of using multiple smaller stores for specific functions.

If you haven't heard much about repositories in the past year or so, you are not alone. The standalone repositories of yesteryear never really took off for a number of reasons. For one thing, they were hard to build, mainly because they spawned a complex network of relationships. Versioning was difficult to define, especially versions of relationships. In repositories that supported versioning, performance tended to suffer. And companies seldom realized a return on investment from repositories.

But while the notion of a generalized or classic repository may have fallen by the wayside, the need for a centralized location to store pertinent information has never gone away. "Centralized repositories didn't take hold because industry standards weren't mature and people building distributed apps tended to be in small teams scattered across the globe," explained Sridhar Iyengar, a Unisys fellow in software technology and architecture in Mission Viejo, Calif. The rise of industry standards has reduced the need for general-purpose repositories by making a distributed middleware framework possible.

"Repositories aren't dead," noted Brian Carroll, chief architect at Merant PVCS, Rockville, Md., just "more modest, perhaps more practical and hidden inside our development tools."

Jon Siegel, director of technology transfer for the Needham, Mass.-based OMG, agrees. Repositories "are all around you in different guises," he noted. "They're there, but not necessarily in ways that you'll recognize."

So what does a modern repository look like? As Merant's Carroll notes, more and more repositories are hidden inside development tools these days. While they hold little merit on their own, they are much more appealing packaged with development tools.

While a generalized central repository—not linked to any specific function—lacks demand, "the concept of a repository as a central place you go to for information for a specific activity makes sense," said Steve Hendrick. Hendrick is vice president of application development and deployment research at IDC, Framingham, Mass. "More and more emphasis has been focused on virtually every aspect of the software development life cycle," he added. "The notion of what a repository provides is now a huge dimension of every functional activity. The definition of a repository, which started as a general place to go to for all information, has now become very specialized and very aligned with discrete activity that has to do with the development or deployment of software products."

Open interchange with XMI
Figure 1
The XML Metadata Interchange (XMI) standard allows multiple developers to share tools and technologies from a variety of vendors over the Internet.

Standards breed collaboration
As the industry takes a more collaborative approach to repositories, the benefits increase for all those involved. Collaboration is a key ingredient the repositories of old lacked. Developers today are more interested in sharing information about development assets, and in describing that information so it can be shared. For some, that means the Extensible Markup Language (XML). "We're going to a massively distributed repository with people agreeing on a standard way to share information in those repositories," noted Dan Hay, product manager for Visual Studio .NET at Microsoft Corp., Redmond, Wash.

And that is exactly where the market is now. New industry standards have emerged that help to make distributed repositories a reality. The two most relevant standards in this regard, according to Unisys' Iyengar, are the Meta-Object Facility (MOF) and the XML Metadata Interchange (XMI). Iyengar, who is on the architecture board of the OMG, helped to architect both specifications.

MOF is an open, CORBA-compliant framework for describing, representing and manipulating meta information (data describing data). It defines a meta model of an object-oriented application. The meta model is the basic concept a developer puts into a model and then uses to build an application. It can be an attribute, association, class or the like. All of these meta models can be kept in a repository; when they are combined, they become more useful. MOF, which uses the Unified Modeling Language (UML), makes them interoperable. (For more on MOF, see "Meta data conundrum carries on," ADT, October 2001, p. 49.)
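To make the meta model idea concrete, the following is a minimal Python sketch of a store that holds classes, attributes and associations and lets them be combined. It is not the MOF API; the names `MetaClass`, `MetaAttribute`, `MetaAssociation` and `ModelRepository` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MetaAttribute:
    name: str
    type_name: str


@dataclass
class MetaClass:
    name: str
    attributes: list = field(default_factory=list)


@dataclass(frozen=True)
class MetaAssociation:
    name: str
    source: str  # name of the class at one end
    target: str  # name of the class at the other end


class ModelRepository:
    """Toy in-memory store for model elements (hypothetical, not MOF)."""

    def __init__(self):
        self.classes = {}
        self.associations = []

    def add_class(self, meta_class):
        self.classes[meta_class.name] = meta_class

    def add_association(self, association):
        # Refuse links to classes the repository does not know about.
        for end in (association.source, association.target):
            if end not in self.classes:
                raise KeyError(f"unknown class: {end}")
        self.associations.append(association)

    def related_to(self, class_name):
        """Return sorted names of classes linked to class_name."""
        related = set()
        for a in self.associations:
            if a.source == class_name:
                related.add(a.target)
            elif a.target == class_name:
                related.add(a.source)
        return sorted(related)


repo = ModelRepository()
repo.add_class(MetaClass("Customer", [MetaAttribute("name", "String")]))
repo.add_class(MetaClass("Order", [MetaAttribute("total", "Decimal")]))
repo.add_association(MetaAssociation("places", "Customer", "Order"))
print(repo.related_to("Customer"))
```

Once the elements sit in one store, a question such as "what is linked to Customer?" can be answered by any tool that reads the store, which is the sense in which combined meta models become more useful.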

XMI gives developers working with object technology the ability to exchange programming data over the Internet in a standardized way. This allows development teams using tools from multiple vendors to collaborate on applications, regardless of their location. It makes it possible to "have all applications across an enterprise accessing the same description of meta data sources," explained Michael Lang, founder and executive vice president of MetaMatrix, New York, N.Y. XMI makes interoperability a cinch.
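The interchange idea can be sketched in a few lines of Python: one tool writes a model out as XML, and another reads it back into its own structures. The element and attribute names used here (`Model`, `Class`, `Attribute`) are simplified assumptions for illustration and do not follow the actual XMI schema.

```python
import xml.etree.ElementTree as ET


def export_model(classes):
    """Serialize {class_name: {attr_name: type_name}} to an XML string."""
    root = ET.Element("Model")
    for class_name, attrs in classes.items():
        cls = ET.SubElement(root, "Class", name=class_name)
        for attr_name, type_name in attrs.items():
            ET.SubElement(cls, "Attribute", name=attr_name, type=type_name)
    return ET.tostring(root, encoding="unicode")


def import_model(xml_text):
    """Rebuild the dictionary form from the XML produced by export_model."""
    root = ET.fromstring(xml_text)
    return {
        cls.get("name"): {
            a.get("name"): a.get("type") for a in cls.findall("Attribute")
        }
        for cls in root.findall("Class")
    }


model = {"Customer": {"name": "String", "id": "Integer"}}
document = export_model(model)
# A second tool that understands the same format recovers the same model.
assert import_model(document) == model
print(document)
```

The point of the sketch is the round trip: as long as both sides agree on the document format, neither needs to know how the other stores its models internally.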

A handful of repository builders have already jumped on the XMI bandwagon, and others are following suit. The recently released Version 2.0 of MetaMatrix's MetaBase Repository is both MOF-based and XMI-compliant. While most repositories in the market were built for the purpose of creating and maintaining data warehouses, MetaBase was created to serve as a meta data management system for information integration.

Because the MetaBase repository stores meta data in an XMI format, that meta data can be consumed by other repositories that support XMI. The repository, according to the firm's Greg Milbank, senior VP of sales and business development, "is used to create a meta data abstraction middleware for the purposes of joining relationships between disparate sources for application integration."

Unisys Corp.'s Universal Repository (UREP), which was available before MetaBase, embraces the OMG's UML, MOF and XMI. The repository supports information sharing across development tools, which is made possible by an extensible, object-oriented meta model, versioning services, configuration management and team-based development approaches. Unisys stopped marketing its repository last year, however, because the market was fairly small and only a minority of customers were ready for a global implementation, the firm's Iyengar said. Instead, Unisys has "OEMed" its repository to some independent software vendors while using the technology internally.

One firm taking advantage of Unisys's repository technology is London-based Adaptive Ltd., whose Adaptive Framework product suite is based on UREP. In fact, the Adaptive Repository was the first MOF-based repository on the market, released in early September. It was not released standalone, but rather as part of Adaptive Information Manager, which includes support for the OMG's Common Warehouse Metamodel (CWM), a meta model for data storage.

Other repository makers are following the lead of MetaMatrix and Unisys in offering XMI-compliant repositories. They have to if they want to stay in business. Unless companies open up their repositories and tools to these industry standards, their rate of success will be questionable because their repositories will not be interoperable, Iyengar said. "XML, XMI, MOF and JMI [Java Metadata Interchange] are going to be necessary ingredients in a repository," he noted.

Repository makers are taking notice. At press time, the newest release of the Enabler object repository from Softlab GmbH, Frankfurt, Germany, included additional functionality for XMI support. Through an architecture built on XMI, Softlab's specially developed workbench is able to support development with all leading XMI-compliant CASE tools.

Similarly, Naples, Fla.-based ASG (formerly Allen Systems Group) is adopting XMI as an important vehicle for meta data interchange and, in fact, expects it to become the company's primary one, according to Ian Rowlands, director of product management. Version 6 of ASG's Rochade meta data repository includes XMI and CWM support.

IBM Corp. is using both XMI and MOF but is touting its solution not as a repository, but rather as an interchange mechanism. "More and more companies are using [the standards] as an interchange mechanism than a storage repository," noted Unisys's Iyengar. Among them is Sun Microsystems with its Forté development tools. Oracle Corp. is embracing MOF, JMI and XMI as part of its data warehouse data management infrastructure. Sun, Oracle and IBM are all building distributed meta data services into their products.

Computer Associates' Platinum Repository is not currently XMI-compliant. But the company is looking ahead and has plans to embrace the industry standard in a future release of the product.

Rational Software Corp. offers an XMI-compliant add-in to its Rational Rose visual modeling tool. The Rational Rose XMI Import/Export Facility lets Rational Rose users move models in and out of the tool in XMI form, making Rose a broader tool. Rational has also been working with IBM on a standard way to use XMI-based information. In addition, through a partnership with Unisys, developers can use Rational Rose to browse and reuse UREP services in designing repository-based applications.

A new standard?
And what about Microsoft? Where does it fit in with its Microsoft Repository? Application Development Trends reported just six months ago that Microsoft had shifted its repository strategy to use multiple repositories, each built for specific applications. That approach fits right in with other vendors' strategies in the shift toward distributed repositories.

But Microsoft is taking a slightly different tack than the others by focusing more on the Universal Description, Discovery and Integration (UDDI) industry standard. Formed by IBM, Microsoft and Ariba Inc., Sunnyvale, Calif., UDDI serves as a repository for Web services, allowing e-businesses to define their business, discover other businesses and share information.

UDDI, while not based on a meta model, is an important step in building a distributed infrastructure in the Internet age, according to IDC's Hendrick. It serves as a method to catalog Web services and componentize things. "UDDI is not about how you build. It's more about how you identify that these services exist, how you describe them and how you enable people to use them," he said.
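Hendrick's three verbs (identify, describe, enable use) can be illustrated with a toy catalog in Python. This is a sketch of the registry idea only, not the UDDI API; `ServiceEntry`, `ServiceRegistry`, the business names and the example.com endpoints are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceEntry:
    business: str   # who offers the service
    service: str    # what the service is called
    category: str   # how it is classified for discovery
    endpoint: str   # where it can be invoked


class ServiceRegistry:
    """Toy in-memory catalog of services (hypothetical, not UDDI)."""

    def __init__(self):
        self._entries = []

    def publish(self, entry):
        self._entries.append(entry)

    def find_by_category(self, category):
        return [e for e in self._entries if e.category == category]


registry = ServiceRegistry()
registry.publish(ServiceEntry(
    "Acme Freight", "RateQuote", "shipping", "http://example.com/acme/quote"))
registry.publish(ServiceEntry(
    "Globex Bank", "CreditCheck", "finance", "http://example.com/globex/credit"))

for entry in registry.find_by_category("shipping"):
    print(entry.business, entry.service, entry.endpoint)
```

Note that the catalog says nothing about how either service is built; it only records that the service exists, how it is described and where to reach it.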

UDDI will gain more respect and popularity as the industry focuses more on Web services. Today, however, it still poses a number of challenges. For one thing, it creates an added level of complexity by enabling objects to interoperate across enterprises, possibly in cases where one enterprise has no understanding of, or visibility into, the objects in another.

Another challenge is establishing a common vocabulary for the communication UDDI enables. Web services and companies need to agree on the terms that are used. "Once the grammar is put into place," noted Hendrick, "the challenge will shift to having different groups within a population agree to a particular type of vocabulary for information. Everybody needs to know how to interpret what [a communication through UDDI] represents and how you use it." These and other challenges will be overcome as the Web services industry matures.

XMI simplified
Figure 2
XMI can be combined with multiple standards to provide developers with a common language for specifying, constructing and documenting distributed objects and business models.

Watch this space
Another area to watch is the OMG's Model Driven Architecture (MDA), which will play an important role in the migration to distributed repositories. MDA, built on UML, MOF, XMI and CWM, is a software architecture based at the model level. Because every year or two a new interoperability paradigm emerges, firms and enterprises conducting business over the Internet need to interoperate with myriad protocols to work with customers and suppliers.

"A company needs something that stays stable to cope with this constantly changing middleware environment—a stake in the ground that remains fixed while the software landscape around it changes," explained the OMG's Siegel.

MDA specifications are based on a platform-independent model (PIM) of an application's business functions. The PIM expresses an application's fundamental business functionality and behavior in a platform-independent way. MDA-enabled tools generate a platform-specific model (PSM) for one or more target platforms from the same base PIM and then, in a subsequent step, generate an implementation on that platform from the PSM. MDA-enabled tools can generate interoperable implementations on different platforms, allowing an enterprise to integrate its various departments and functions across platform boundaries.
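The two-step flow described above (PIM to PSM, then PSM to implementation) can be sketched in Python. This is an illustration under stated assumptions, not an MDA tool: the `Pim` and `Psm` classes, the type names `Text` and `Whole`, and the mapping table are hypothetical, and the generated "implementation" is just a class skeleton emitted as a string.

```python
from dataclasses import dataclass

# Hypothetical mappings from platform-independent to platform-specific types.
TYPE_MAPS = {
    "java": {"Text": "String", "Whole": "int"},
    "csharp": {"Text": "string", "Whole": "int"},
}


@dataclass(frozen=True)
class Pim:
    entity: str
    fields: tuple  # (field_name, platform-independent type) pairs


@dataclass(frozen=True)
class Psm:
    platform: str
    entity: str
    fields: tuple  # (field_name, platform-specific type) pairs


def to_psm(pim, platform):
    """Step 1: derive a platform-specific model from the shared PIM."""
    type_map = TYPE_MAPS[platform]
    fields = tuple((name, type_map[t]) for name, t in pim.fields)
    return Psm(platform, pim.entity, fields)


def generate_code(psm):
    """Step 2: generate a class skeleton for the PSM's platform."""
    if psm.platform == "java":
        body = "\n".join(f"    private {t} {n};" for n, t in psm.fields)
    elif psm.platform == "csharp":
        body = "\n".join(
            f"    public {t} {n[0].upper() + n[1:]} {{ get; set; }}"
            for n, t in psm.fields
        )
    else:
        raise ValueError(f"unsupported platform: {psm.platform}")
    return f"public class {psm.entity} {{\n{body}\n}}"


pim = Pim("Order", (("customer", "Text"), ("quantity", "Whole")))
for platform in ("java", "csharp"):
    print(generate_code(to_psm(pim, platform)))
```

The same `pim` object feeds both targets unchanged; supporting another platform in this sketch would mean adding a mapping and a generator branch, not touching the PIM.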

MDA tools may cover more than one middleware platform. For example, an implementation that an MDA tool generates on Enterprise JavaBeans can still invoke another one on C# .NET. "All of the code to invoke other MDA applications across the platform ... is generated and maintained automatically by MDA-enabled development tools," Siegel said.

The PIM remains stable. "As a software environment evolves and continues to change, the PIM remains just as valuable, just as useful," added Siegel. When a company introduces new middleware, the enterprise takes the existing PIM and creates a PSM and application on a new platform.

As the repository market continues to evolve, repository function, although it is important, is becoming subordinate. Repositories are being packaged with development and data modeling tools and even being managed by development tools. They are becoming more specialized, but are not going away. They may be taking different forms, but they are here to stay.

"Repository function is very important, and the fact that it integrates and uses the MOF meta model is important in the market," noted Siegel.

Unisys' Iyengar concurs: "Because most [development] teams are global, distributed and need collaboration, centralized repositories are passé, but meta data servers, meta data interoperability and meta data interchange are hot." Keep watching for new developments.

Analysis: RDF emerges in meta matters
The concept of what software is—and of how a program is built—can change a few times between breakfast and lunch. That is why a valid meta data architecture is so important. It may make it possible for the night staff to pick up where the day staff left off.

While the notion of an all-purpose, central meta data repository for software development has faded from prominence, there are a number of lighter-weight architectures that may come to fruition. Two technologies may advance software meta data: XML, which can embed Web pointers that lead to explanations of a programmer's intent, and UML, which is capable of more but has at least become a useful notation and means of model communication.

Two standards groups are prominent here: the OMG, known for CORBA but also the keeper of UML, MOF and XMI; and the World Wide Web Consortium (W3C), known for HTML but also the prime advocate of XML and the Resource Description Framework (RDF). And aspects of these two efforts may yet meet in a marriage of some kind.

The idea of a central meta data pool is something of a relic of the CASE era, which has also faded. The Microsoft Repository, which was forged largely through partnerships with Platinum Technology and Texas Instruments (neither firm now a player in the corporate software business), is little discussed these days. Actually, it was never much discussed. The Microsoft Repository (wisely) never endeavored to reshape the development world as IBM's ill-fated AD/Cycle did before it.

What is not clear at the moment is how the goal of better meta data will be carried forward. From the get-go, the W3C saw XML as a meta markup language that would do for data what HTML did for presentation on the Web. The W3C and its leader, Tim Berners-Lee, saw a fit between XML and RDF. In fact, Berners-Lee sees RDF-style processing as somewhat crucial to the Semantic Web concept he promotes as a more robust system than the original World Wide Web.

RDF acts as an XML encoding that supports a data model with node and relationship types. Developers within the XML community who are experimenting with RDF will probably be the first to see lightweight meta data in action.
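The node-and-relationship data model can be sketched in Python as a set of subject, predicate and object statements. This shows only the graph idea behind RDF, not its XML syntax, and it uses no RDF library; the `http://example.org/` identifiers and the `match` helper are hypothetical.

```python
EX = "http://example.org/"

# Each statement links two nodes through a named relationship.
triples = {
    (EX + "OrderService", EX + "writtenBy", EX + "DayStaff"),
    (EX + "OrderService", EX + "dependsOn", EX + "BillingService"),
    (EX + "BillingService", EX + "writtenBy", EX + "NightStaff"),
}


def match(subject=None, predicate=None, obj=None):
    """Return sorted triples matching the pattern; None is a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Who wrote what?
for s, p, o in match(predicate=EX + "writtenBy"):
    print(s, p, o)
```

Because every statement has the same three-part shape, new kinds of relationship can be added without changing the structure that holds them, which is part of what makes the approach lightweight.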

RDF is similar to MOF in that it provides a generic object model. "Below it you have an underlying serialization layer, and above it you have RDF Schema, DAML and other modeling languages," said Sergey Melnik, research scholar in the Computer Science Department at Stanford University, and an invited expert in the W3C's RDFCore Working Group.

"MOF, in a way, corresponds to what RDF is trying to achieve. It gives you a generic way of representing objects and relationships between them," said Melnik. "MOF is used to represent UML models in a machine-processable way, while XMI works as a serialization layer for MOF."

The point is that, in both cases, a layered approach is used to build comprehensive "model stacks." Following the analogy, suggests Melnik, UML could be viewed as a powerful schema language.

"What I think is very promising [for] the future development of UML and the Semantic Web is [the potential] to situate the entire UML on top of RDF," said Melnik. "UML could benefit from 'RDF-izing' and interoperability with fledgling Semantic Web standards. In turn, the Semantic Web community could leverage an 'industrial-strength' schema language."

"XMI is a nice representation of what UML is," added Eric Miller, Semantic Web activity lead for the W3C. "Both XMI and UML are rich."

While the OMG and W3C have similar goals, Miller noted, the W3C is working largely on relatively quick, lightweight technologies.

RDF-enabled XML may come into play as industry attempts to deploy so-called Web services, said Uche Ogbuji, principal consultant at XML specialist Fourthought. "RDF has facilities to provide semantic transparency to descriptions and object models," said Ogbuji. As well, he noted, there are a good number of new RDF tools that can help the Web services developer. Ogbuji, too, sees a fit between the W3C RDF and OMG XMI efforts.

—Jack Vaughan