At the DCI Data Warehousing Conference in October, six data warehouse vendors -- Arbor Software, Business Objects, Cognos, Evolutionary Technologies, Platinum Technology, and Texas Instruments -- announced a new initiative for defining specifications for the access and interchange of metadata among different types and classes of enterprise data management tools. This group of vendors, called the Metadata Council, came together in July 1995 because of a shared belief in the increasing need to standardize metadata access and maintenance.
The Council's initial meetings were facilitated by the Meta Group's Application Development Strategies group, and were led by Karen Rubenstrunk, a senior research analyst at Meta Group. During these meetings, the Council developed its mission and charter, drafted a process definition and preliminary metadata standards framework, and formulated the concept of an industry-wide Metadata Coalition, which is a group open to vendors and end users who would like to participate in shaping the metadata standards. The original Council now acts as the steering committee for the Metadata Coalition, coordinating the standards definition and ongoing evolutionary process.
Now that the Coalition is on its way toward developing a standard metadata interchange, Meta Group has stepped out of the picture. But Karen Rubenstrunk continues to be a voice for the group, and she understands all too well the issues, problems, and hurdles the Coalition will likely face. Features Editor Theresa Rigney and Technical Editor Maurice Frank recently spoke with Karen over the phone to discuss these issues. An edited transcript of their conversation follows.
DBMS: How is the Meta Group involved with the Metadata Coalition initiative?
RUBENSTRUNK: Meta Group is responsible for facilitating the origination of the Metadata Council. We took the lead role in pulling together a number of different vendors who had expressed an interest in integrating metadata. We basically acted as a facilitator to see if these people really were interested and if they felt that a coalition was a viable option for providing some sort of standards in the marketplace. I facilitated work sessions among the vendors, kept them on track, and kept pushing them to meet deadlines and get the announcement out. Now that it has actually been announced and the Council is on its way, [Meta Group] has stepped quietly out of the picture.
DBMS: What is the Council's definition of "metadata"?
RUBENSTRUNK: Metadata is data about data -- it's pretty straightforward. See, data content is one thing; the content gives you an address. But you may want to understand when that address was created, what system it came from, and what different tools have accessed it to move it from where it was originally to where it is now. If I need to change it or access it in some other way, who's in charge of it, who owns it, who's the steward for it? So it's data about data: It's all the things that surround the actual content of the data to give a person an understanding of how it was created and how it is maintained.
DBMS: So, knowing that, what are the goals of the Metadata Interchange Coalition?
RUBENSTRUNK: The goal is to provide a standard interchange by which metadata can be exchanged from tool to tool a lot more easily than it is today. [See Figure 1, page 48.] Even though the Metadata Interchange has what I call a global focus, it is trying very hard to not be a data warehouse initiative. Most of the problems have been emphasized because of the tremendous growth within the data warehouse market. Every tool that's being used out there creates metadata of some form, and there are no easy ways for an IT person to get all those different iterations of metadata to speak to each other. So the IT person has to physically integrate it or rely on partnerships among the vendors of the particular tools.
DBMS: Who belongs to the Coalition?
RUBENSTRUNK: The original founders were Arbor Software, Business Objects, Cognos, Evolutionary Technologies, Platinum Technology, and Texas Instruments. The vendors that have since committed to and paid money to the Coalition [as of press time] are IBM, Information Advantage, Informix, Intersolv, EDS, PeopleSoft, Prism Solutions, Sybase, R&O, and Carleton.
DBMS: What does it cost to join?
RUBENSTRUNK: The dues for Coalition membership are $2500 per year.
DBMS: Are there any competitors out there?
RUBENSTRUNK: No. In fact, [some of the vendors] initially viewed [each other] as competitors; but we say that they're not in competition at all (in fact, it's all complementary). There are a couple of vendors out there who have started their own initiatives for metadata standardization: IBM, Prism, and Oracle.
DBMS: So these vendors are not going along with your group?
RUBENSTRUNK: No, they are going along -- they're members of the Metadata Coalition, but they also have their own vendor-focused or vendor-centric initiatives, where they're trying to get all of their partners to standardize with their tools.
DBMS: Are there any vendors that are simply refusing to join any type of coalition or initiative?
RUBENSTRUNK: As of [this interview], Oracle hasn't joined. IBM has, Prism has, Informix has, Sybase has, numerous other vendors have, but Oracle hasn't made a specific commitment. I don't know if it's intentionally holding back, but I do think (and this is a personal opinion) that Oracle likes to view itself as separate and apart from everyone else, and that its solution is better. I know Oracle is working on its own initiative with all of its data warehouse vendors that is similar to what the Metadata Council is trying to do -- some type of interchange with read/write that works with its entire product suite and not just its data warehouse. But Oracle has not always had a very strong commitment to anything outside of its own tool suite.
DBMS: Is Microsoft involved?
RUBENSTRUNK: I don't believe that Microsoft has joined yet. However, I think it has asked for a membership kit.
DBMS: What is the state of metadata today?
RUBENSTRUNK: I may get into some religion here. Really, the state of metadata depends on your own company's culture and architecture; these factors determine how much you really care about metadata integration. If you are a True Blue shop -- a very homogeneous shop -- you're not going to have issues with metadata. And the issues with metadata that you may have are specifically in the area of: Do I have a common definition for my customer, or account, or product? That metadata actually takes on a different form, and it's the metadata that tells people the business definition of this thing that I'm talking about. In that context, metadata synchronization takes on the flavor of business synchronization.
As you get more and more heterogeneous, metadata becomes more of an issue. But, if you are also a very decentralized shop or a shop that culturally or historically has had numerous different tools, and you've learned to live with having to jump from tool to tool, or the IT organization has accepted its role as systems integrator, then metadata is still an issue, but it's not a major issue. So really, metadata itself is a very powerful piece of information, but it only rears its ugly head or presents tremendously insightful solutions when your company understands how to deal with it -- or is forced to deal with it.
DBMS: Will the Council's scope try to specify standards that are applicable to both relational and multidimensional DBMSs, or do they need separate standards?
RUBENSTRUNK: They are not separate standards. What the Council is trying to do is categorize the types of tools that are out there, and present broad categories for them. There are decision-support tools, there are data-movement tools, there are repository tools, there are database servers, and so on, and so on. The Council will try to categorize these tools because it believes that within a category, a lot of the higher-level metadata starts to look the same. Therefore, it will be easier to provide some sort of standardization. Council members specifically discussed the relational database versus the multidimensional database issue, and the white paper does discuss the fact that, under multidimensional, a common piece of metadata is dimension. However, we were really looking toward the OLAP Council to provide content for standardization between a physical multidimensional database and everything else. So, we said we were going to provide an interchange that is capable of translating metadata from any type of tool to any other type of tool. When it gets to the point where the OLAP Council actually provides a real standard, it should be something that hooks very easily into the interchange. The OLAP Council is not in competition, it's just that we're not going to do their work for them. They're out there doing the work already, so more power to them.
DBMS: What about objects, as opposed to data-only databases? Will the Council address object metadata?
RUBENSTRUNK: I doubt it. At least, not in the first couple of iterations. The Council is trying to be very practical. It is saying, if we look at this issue globally and consider everything we want to do, we will spend a year and a half just planning what we want to do. So let's be practical: What's the easiest, most practical thing we can do in the next six months to bring about some sort of initial change? The practical thing is an API that probably runs in batch instead of online, does read-only and not update, and has an encryptable piece so that a vendor can pass along proprietary data at the same time as it passes along the other data. The Council purposely has taken -- I don't know if you'd call it the high road or the low road -- the position that says, let's be practical. And it will judge everything against the "let's be practical" slogan. Members believe that they have to provide an interchange that allows both update and read, with an appropriate audit trail. However, the first thing they're going to do is provide read, because at least they can get the read out in six months. And then they'll go on to the next one.
DBMS: Once the standard is out there, how long do you think it will take for real products to be revised so that they observe it?
RUBENSTRUNK: The original Council members have dedicated themselves to the goal that within six months of the actual final decision, their products will be compliant. I don't think they've set a standard or a benchmark for the members of the Coalition, but the Council members themselves have promised six months. You'll probably see the first trickling of tools using the interchange by the end of 1996. You won't see any mass adoption until 1997.
DBMS: What will happen to the existing metadata implementation in query tools and other products? Will vendors start over or try to adapt?
RUBENSTRUNK: I think they'll adapt them. Another thing that the Metadata Council is trying to do is provide something that doesn't require a vendor to do a whole lot more than just write to the standard. Vendors shouldn't have to rearchitect or change their tools in major ways, but they will have to change them in some ways. Again, the Council is trying to be very practical. Council members know that they're going to get some analysts saying that it's not good enough, and they know that they're going to get some people saying that it's not deep enough to be effective. But the Council's view is: You know what, it's better to have something than to have nothing. The Council is trying to provide something that is helpful but doesn't require major rearchitecting of a product (otherwise, vendors won't do it).
DBMS: Have you seen any interest from CASE vendors or application development tools vendors? Their products are often metadata-intensive.
RUBENSTRUNK: Texas Instruments was one of the original members of the Council, and it was specifically asked to join by the other members because they recognized that there was a major area that they weren't covering -- model-based development, which is very metadata-intensive. So TI was one of the founders. In fact, as an aside, in the work sessions, the folks from TI were the ones who usually had the ability to add a little bit of practicality, because they really, really understood metadata. They were able to put people's feet back on the ground.
I believe that Seer Technologies has also inquired about what's going on, and there's a tremendous amount of interest from the application packages vendors. It's kind of exciting and it's kind of scary at the same time: It's exciting to see that all facets of what IT is involved with right now have expressed interest in this, and large European companies -- large application development tool companies -- have called. The momentum, while exciting, is also very scary. I wonder sometimes if we should have kept the Council to six members, had them do something, and then showed the rest of the industry what they did.
DBMS: How will this interchange be associated with the ODBC specification? Will the interchange be a companion to ODBC?
RUBENSTRUNK: Intersolv has suggested that it has technology that the Metadata Interchange could use, but the real discussions have not yet been addressed. We kind of see them as two different things. Intersolv's involvement in the coalition is going to bring to light that there may potentially be some synergy, not overlap, there.
DBMS: Will the Council attempt to get different vendors to agree on what different terms mean? Some people use the same word to mean two different things, which is part of the problem of trying to work with multiple tools!
RUBENSTRUNK: Within the development of the interchange, there will have to be some agreement on terms and definitions, because within each class of tool there's going to have to be some agreement as to what a dimension is, what a component is, and so on. So there's going to have to be some information engineering-type data modeling to get all of these people to agree upon some basic definitions and the appropriate formats for them. However, I don't think they're going to try to do it for everything, only for those things that specifically need to be in the Interchange. I think they'll have some success, because they'll do it across tool categories and not tool by tool. In other words, when it gets into the classification of tools that are decision support, that tool profile will have to be standard across all decision-support tools.
DBMS: Previous efforts to standardize metadata have failed. Why is the time ripe now?
RUBENSTRUNK: The time is ripe now because of data warehousing and decision support. [Meta Group is] seeing the data warehouse market grow 40 to 70 percent; in fact, look at Arbor Software's recent IPO. That alone shows that there is a tremendous interest in the data warehouse area. Some people look at data warehousing as being just a marketing ploy -- there's no new technology, just marketing for client/server-oriented vendors. There's a little bit of truth to that, but the fact of the matter is that decision support has become mission-critical for companies. Having a good decision-support system is no longer a competitive advantage, it's mission-critical -- you have to do it. If you don't, you're going to fall behind.
We also think that there's going to be a trend in the application development organization itself, in which probably 30 to 35 percent of the application development resources are going to be deployed for developing decision support-type applications, rather than for the traditional operational supporting applications. The data warehouse market is going to grow to about $11 billion by 1998 -- $8 billion in hardware and software and $3 billion in systems integrator services. [These are Meta Group statistics.]
All of this shows that decision support is really, really, really important. Therefore, data warehousing has become extremely important. In the data warehousing model, some of the companies that you see on the original Council said, "Listen, I did really well for the first couple of years, and I have this nice continual growth path, but I think that I'm going to hit a wall. And I think that I'm going to hit a wall in my product sales, because this thing is getting too complex, and my users can't grab onto it quickly enough. I'm actually beginning to see some data warehousing efforts fail because of the complexity!" and so on. So, while there's this pressure to provide very succinct focus on decision support, the first couple of years of the tools' and the vendors' growth is the easy period. But now we're finding all the holes and the ugliness. That's why the time is ripe.