Enterprises are devoting tremendous resources to building data warehouses and data marts. This type of decision-support activity is now part of the IT mainstream. Product vendors, systems integrators, and consultants are mobilized to help IT in its efforts. But often, after much hard work and investment, business users are disappointed with the results. Did the IT groups, vendors, and consultants miss something?
Database query tools have proliferated over the past few years. There have been more than 100 of these tools in the marketplace at various times. Despite allowing business users access to virtually any database that IT can build, these tools have not gained the widespread usage that spreadsheets or word processors enjoy. How do business users locate information with these tools? How do they know what the data represents? How do they get the information they need? Without being able to answer these questions, business users cannot make effective use of these tools or the data warehouse.
It is very common for IT to follow the philosophy of "If we build it, they will come." This philosophy is reinforced by the "data explorers" who are self-sufficient with new technology and eager to find new information assets. Data explorers are users of the various query and OLAP tools who enjoy exploiting new technology in their jobs. They delight in finding new pieces of information with these new tools. Data explorers have a disproportionate influence on all parties building data warehouses. They create the false expectation that business users will leap at data warehouses and find new, exciting information jewels previously locked in data basements (legacy applications to which business users could not or would not gain access). Typical businesspeople need some help and support in that endeavor. They will not invest the time in the new technology just for the joy of using it.
The cornerstone of the business information directory (BID) is the "M" word: metadata. (See Figure 1.) IT personnel cringe and business users' eyes glaze over when metadata is mentioned. Metadata, however, is a means to an end -- an enabler of the desired goal of making decision-support data accessible to the business community throughout an enterprise. The two usual approaches to metadata are at opposite ends of the spectrum: It is either ignored or praised with zealous fervor. If ignored, metadata will proliferate with every tool brought into the data warehouse environment. If approached as a "religion," it will focus IT on the wrong issues. The balanced approach is to treat it as a resource to be harnessed in successful decision-support environments.
Metadata is data about data. There are two categories of metadata: technical and business. Technical metadata is the description of the data needed by various tools to store, manipulate, or move data. These tools include relational databases, application development tools, database query tools, data modeling tools, data extraction tools, online analytical processing (OLAP) tools, and data mining tools. Business metadata is the description of the data needed by business users to understand the business context and meaning of the data. Technical metadata has spread like wildfire across the enterprise as more tools and types of tools are used to build decision-support systems (DSSs). Business metadata is contained in the business requirements and specifications for DSSs. It is often only online in the Word documents used in designing these systems. After it is used in the design phase, the business metadata is generally "shelfware" (collecting dust in three-ring binders on the business analyst's shelf).
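To make the two categories concrete, here is a minimal sketch of a single catalog entry that carries both kinds of metadata. The class names, field names, and sample values are all hypothetical and are not taken from any product discussed in this article.

```python
from dataclasses import dataclass


@dataclass
class TechnicalMetadata:
    """What tools need in order to store, manipulate, or move the data."""
    table: str
    column: str
    data_type: str
    source_system: str


@dataclass
class BusinessMetadata:
    """What a business user needs in order to interpret the data."""
    business_name: str
    definition: str
    owner: str


@dataclass
class CatalogEntry:
    """One data element described from both points of view."""
    technical: TechnicalMetadata
    business: BusinessMetadata


# Sample values are invented for illustration only.
entry = CatalogEntry(
    technical=TechnicalMetadata("SLS_FACT", "NET_AMT", "DECIMAL(12,2)", "order entry"),
    business=BusinessMetadata(
        business_name="Net Sales",
        definition="Invoiced sales less returns and discounts",
        owner="Finance",
    ),
)
```

Keeping the two halves in one entry is only one possible design; the point is that the business definition stays attached to the physical element rather than sitting in a design document.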
Second, the BID promotes business understanding. Just knowing that data exists is not enough. What the data represents is crucial to business users. They need to determine if the information is pertinent to them and how to interpret it. Terms such as sales and profit can mean vastly different things to various business groups within an enterprise. Business users need to understand the context of the data in order to use it properly.
Finally, once the business users know the data exists, they want it. They may want to access it now, or they may want it delivered to their desktop on a regular basis. The latter would be necessary for them to perform repetitive tasks such as weekly or monthly reports. Business users, accustomed to double-clicking on links on a Web page, want similar functionality in their decision-support systems.
Recently, some of the more sophisticated query tools have been built as managed query environments (MQEs). An MQE attempts to make query tools friendlier to business users by letting them develop queries in business terminology. It accomplishes this through a semantic layer (metadata) that uses views and synonyms to replace the physical names of tables and columns with business terms. This layer can be viewed as a limited information catalog. MQEs are a great enhancement over earlier generations of query tools, which presented physical table and column names to end users, and MQE support should be a selection criterion when query tools are evaluated. But the semantic layer, or information catalog, of an MQE is too limited to extend across the data warehouses, data marts, and other databases that are needed.
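The idea of a semantic layer can be sketched in a few lines. The example below is an illustration only: the mapping, the table and column names, and the `build_query` helper are hypothetical, and it handles just the single-table case rather than the joins a real MQE would need.

```python
# Hypothetical semantic layer: business term -> (physical table, physical column).
SEMANTIC_LAYER = {
    "Customer Name": ("CUST_MSTR", "CUST_NM"),
    "Region": ("CUST_MSTR", "RGN_CD"),
}


def build_query(business_terms):
    """Translate business terms into a SELECT over a single physical table."""
    columns = []
    tables = set()
    for term in business_terms:
        table, column = SEMANTIC_LAYER[term]  # KeyError if the term is not cataloged
        tables.add(table)
        # Expose the business term as the column alias the user sees.
        columns.append(f'{table}.{column} AS "{term}"')
    if len(tables) != 1:
        raise ValueError("this sketch does not handle joins across tables")
    return f"SELECT {', '.join(columns)} FROM {tables.pop()}"


query = build_query(["Customer Name", "Region"])
```

The business user works only with "Customer Name" and "Region"; the physical names stay hidden in the mapping, which is exactly the metadata a broader information catalog would also need to hold.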
The BID's initial targets are the data farmers of the business community. They need an information catalog in which they can search for information, understand it, and get it. It is important to note, however, that if a BID were available to the data explorers and IT personnel, they, too, would benefit because they could exploit the data warehouse more effectively. Data explorers and IT personnel, however, may not perceive the need for a BID because they think they already have tools to access the data warehouse.
The target market shapes what functionality the BID offers, which in turn determines what is stored in its information catalog. Vendors, consultants, and IT all have the data explorers in mind when considering the need for or designing BIDs. Table 2 examines the difference in interpretation of BID functionality between the data explorer and data farmer. In fact, from the data explorers' point of view, an information catalog may not be as critical because they are willing to search for information on their own. However, as previously noted, data explorers would benefit significantly from a BID.
The BID serves two purposes for the data farmer. First, it acts as the librarian who researches what information is available and pertinent for the business user. Second, it is a mail-order catalog from which business users can order the information to arrive when they need it. This latter purpose is similar to PointCast in that business users want the information delivered to their desktops to use in their work.
The Information Navigator is the business user interface. It provides the navigation, understanding, and access functionality for the BID. It interacts with the other BID components, as well as invoking various tools to access and manipulate information by the business user. This is the business user's view into data warehouses, data marts, workgroup databases, and personal databases.
The Information Catalog is the brains of the BID. It stores the metadata needed to provide BID functionality. Various import and export facilities as well as APIs are used to move metadata between different metadata sources and the BID.
The Administrator is a superset of the Information Navigator. IT also uses this interface for BID administration. Administrative functions include maintaining the Information Catalog, managing business users' access capabilities, maintaining security, and updating metadata not handled by the Import/Export capabilities.
The Information Delivery Agent moves the information requested by business users to their desktop or workgroup applications. This is equivalent to a push model in which the business user requests information to be delivered and it is published onto the user's desktop.
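The way these components cooperate can be sketched as follows. This is a toy model under stated assumptions: the class and method names are hypothetical, the catalog is an in-memory dictionary, the Administrator is omitted, and the delivery agent only records requests instead of moving data.

```python
class InformationCatalog:
    """Stores metadata entries keyed by business name (hypothetical layout)."""

    def __init__(self):
        self._entries = {}

    def register(self, name, description, location):
        self._entries[name] = {"description": description, "location": location}

    def search(self, keyword):
        """Return the names whose name or description mentions the keyword."""
        keyword = keyword.lower()
        return sorted(
            name
            for name, meta in self._entries.items()
            if keyword in name.lower() or keyword in meta["description"].lower()
        )

    def describe(self, name):
        return self._entries[name]


class InformationDeliveryAgent:
    """Records delivery requests; a real agent would move the data itself."""

    def __init__(self):
        self.requests = []

    def request_delivery(self, user, name, location):
        self.requests.append((user, name, location))


class InformationNavigator:
    """Business user interface: discover, understand, then request delivery."""

    def __init__(self, catalog, agent):
        self.catalog = catalog
        self.agent = agent

    def find(self, keyword):
        return self.catalog.search(keyword)

    def explain(self, name):
        return self.catalog.describe(name)["description"]

    def order(self, user, name):
        location = self.catalog.describe(name)["location"]
        self.agent.request_delivery(user, name, location)


# Sample entries are invented for illustration only.
catalog = InformationCatalog()
catalog.register("Weekly Sales", "Net sales by region, refreshed weekly", "mart.sales_weekly")
catalog.register("Headcount", "Employees by department", "warehouse.hr_headcount")

agent = InformationDeliveryAgent()
navigator = InformationNavigator(catalog, agent)
matches = navigator.find("sales")
navigator.order("pat", matches[0])
```

The business user touches only the Navigator; the physical location of the data is looked up in the catalog and handed to the delivery agent on the user's behalf.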
It is also a poorly understood market. Most vendors do not understand what the business users' needs really are. Vendors usually work with IT groups and therefore view the need for a BID through IT's eyes, which leads to a belief that users simply want access to databases. But this functionality is just the means to an end. The real objective is information access, which means finding and understanding the information in its business context, not the way a database administrator would find it. In addition to the vendors, IT also does not fully appreciate the extent of the problems and needs. Most IT people are too busy to deal with metadata. Because of the ever-increasing pressures to deliver projects quickly, items that do not have a perceived immediate impact, such as metadata, are postponed. And those IT groups that do not postpone dealing with metadata are frustrated by vendor solutions that are, at best, partial solutions addressing a limited set of metadata sources.
BIDs are also very diverse in nature. Most BIDs were created during specific customer engagements or as add-ons or extensions to existing product lines. The products from Prism Solutions Inc., Platinum Technology Inc., IBM Corp., Logic Works Inc., and Virtual Integration Technology Inc. were all initially built under these circumstances. As such, they address the particular metadata integration needs encountered for that specific engagement or product line. The resulting BIDs need to be expanded to meet the wide variety of environments encountered in the general marketplace. In addition, the engagements in which the BIDs were created were consulting or specific IT projects, with a lot of personal attention paid to tailoring them to be successful. With the move to a commercial product, the extensive consultative support is eliminated, and implementation success is greatly diminished.
The Prism Warehouse Directory was a natural extension of the Prism Warehouse Executive -- a great deal of the technical metadata for the BID was already available. The initial releases of the PWD were geared toward IT and data explorers and oriented toward the physical aspects of storage and transformation between sources, which was the purpose of the PWE. At that time, the BID was a totally passive catalog; users found references to the information they desired, wrote down where it was located, and then went into other tools to access the data.
This BID has progressed significantly since its inception. Prism has partnered with several vendors to create import and/or export capabilities with repository, CASE, data modeling, and MQE tools. This greatly expands the metadata available in the information catalog. In addition, Prism has added the capability to launch applications once information is located. This moves the BID from a passive to an active catalog. Prism Warehouse Directory Web Access allows Web access to the BID and expands access to data by enabling users to build and launch queries to databases.
The Prism Warehouse Directory has been installed by approximately 100 companies. It has three components: Directory Builder (administrative tool), Directory Navigator (end-user tool), and the Information Directory. It can be purchased standalone at $50,000 with five Navigator seats or bundled with the Prism Warehouse Executive. Almost all purchases of PWD are bundled with PWE.
Although it has made great strides in expanding its audience, PWD is still centered around the sourcing of data into data warehouses or data marts. This is a key application of metadata, but it is still technically oriented and will appeal to IT and data explorers. If you are already a PWE customer, it is natural to utilize PWD. If you are not using PWE, you should evaluate other options.
The metaphor used is that of file cabinets and folders. Information content is organized into "file cabinets," which are logically business subjects or topics. These are further divided into business categories. Business rules, logic entities, data structures, data elements, and data usage tabs are also provided.
Data Shopper is marketed as a tool for business users to browse and understand what is contained in a data warehouse (via a repository). Business users can find information that they might not have otherwise known existed. They can identify, understand, and locate objects such as database tables and columns, queries, reports, spreadsheets, Word documents, application programs, and other information stored in the repository.
Data Shopper lists for $500 per seat, with volume discounts applying. However, Platinum Repository is required for the information catalog. The MVS version will easily sell for more than $100,000, and the Open Edition will approach $100,000 when loaded with various options. So the cost of admission is roughly $100,000 or more, plus a commitment to Platinum Repository. The merits of repositories in general and Platinum's in particular are beyond the scope of this article. If you have the Platinum Repository, you should implement Data Shopper. If not, then first consider whether you should purchase Platinum Repository on its own merits.
DataGuide provides business users with an information catalog containing metadata about both structured (databases) and unstructured (files) data. Each item is treated as an information object, and objects can be grouped together in a variety of ways. The information catalog is extensible, with the capability to add different types of objects. Imports and exports are achieved through published APIs or through a published command language interface. Initially, the only metadata exchange occurred among DB2 family products, but partnerships with market-leading OLAP and MQE vendors have expanded this capability.
DataGuide consists of three tools: DataGuide User, DataGuide Administrator, and Information Catalog. The User interface presents a tree structure of objects that the business user expands to get the contents of folders or more details. Business metadata and help are available on each object. Once information has been found, the business user can launch an application to access that information.
DataGuide has been installed at approximately 100 companies. It costs $209 for the User tool and $1,149 for the Administrator tool; volume discounts apply. In addition, a version of DB2 on NT, OS/2, or MVS must be purchased for the Information Catalog. This is the lowest-cost tool examined in this article, but that does not equate to usefulness or functionality. The only prerequisite that may hinder its implementation is the use of DB2/NT or DB2/2 for its Information Catalog. It would be more robust if the other major relational databases were also offered. But the cost of DB2/x is low and its use is limited (note: the data warehouse can be in any relational database; only the Information Catalog needs to be in DB2/x), so this should not be a criterion for rejecting this BID. It is well worth the cost to explore this BID as a starting point for implementing BID functionality.
Universal Directory uses a three-tier architecture with the following components: Universal Explorer (business user interface), Directory Administrator (administration tool), Data Server (manages flow of data between clients and information directory), License Server (manages concurrent use of client tools), and the Information Directory (stored in Microsoft SQL Server, Sybase SQL Server, or Oracle). ModelMart, which handles the model management database (stored with the Information Directory), is also required. Other optional products that integrate with these tools are ERwin/Open and ERwin/Navigator (used for viewing and editing data models, including star schemas), Micro Focus Revolve (used for scanning legacy data), and Sterling CLEAR:Access (query tool used to access a data warehouse). Clients work on Windows 95 or Windows NT while the servers work on Windows NT.
Universal Directory sells for $30,000 for 10 Navigators, one Administrator, and one ModelMart. The company had at least a half dozen purchases as the product was formally announced. The product is very new and does not have extensive metadata import and export capabilities. Logic Works' approach does favor IT and data explorers, especially those familiar with data modeling. However, the company has included BID capabilities to attract the data farmer. This is definitely a tool to watch and evaluate as it matures.
The VIT deliveryMANAGER components are deliveryAGENT, metaWAREHOUSE, and deliveryADMIN.
The deliveryAGENT is the Web browser or Windows user interface to the information directory. Both structured and unstructured data can be cataloged and delivered. Information is arranged as information objects called collections. Business users search for information by subject and topics of interest; they can also obtain relevant business metadata. Business users can subscribe to this information and have it delivered to their desktops, file servers, email, or Web servers. Data delivery can be based on time or events.
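Subscription-based delivery driven by time or by events can be sketched as below. This is not VIT's implementation: the `Subscription` fields, the `due_subscriptions` helper, the event name, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Subscription:
    user: str
    collection: str
    destination: str                      # e.g. "desktop" or "email"
    every: Optional[timedelta] = None     # time-based trigger
    on_event: Optional[str] = None        # event-based trigger
    last_sent: Optional[datetime] = None


def due_subscriptions(subscriptions, now, events=()):
    """Return the subscriptions whose time or event trigger has fired."""
    due = []
    for sub in subscriptions:
        time_due = sub.every is not None and (
            sub.last_sent is None or now - sub.last_sent >= sub.every
        )
        event_due = sub.on_event is not None and sub.on_event in events
        if time_due or event_due:
            due.append(sub)
    return due


now = datetime(2000, 1, 3, 8, 0)  # arbitrary reference time for the example
subs = [
    Subscription("pat", "Weekly Sales", "email", every=timedelta(days=7),
                 last_sent=now - timedelta(days=7)),
    Subscription("lee", "Inventory", "desktop", on_event="warehouse_loaded"),
    Subscription("kim", "Headcount", "desktop", every=timedelta(days=30),
                 last_sent=now - timedelta(days=3)),
]
ready = due_subscriptions(subs, now, events={"warehouse_loaded"})
```

In this run the weekly subscription and the event-driven one are selected, while the monthly one is not yet due. A delivery component would then send each selected collection to its destination and update `last_sent`.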
The metaWAREHOUSE is the information catalog (currently stored in Oracle) that integrates technical and business metadata. Both structured and unstructured data can be cataloged.
The deliveryADMIN is the administrative tool used to manage the information directory. It handles user security, registration of all information objects, the building of collections, and monitoring information usage. This is implemented on Unix and Windows NT.
The VIT deliveryMANAGER costs $50,000. VIT is a consulting firm that is transforming itself into a product company. It has obtained venture financing but funded initial product development through consulting engagements. deliveryMANAGER has approximately 10 installations. deliveryMANAGER is the only BID mentioned that has implemented an information delivery capability in addition to the information discovery and understanding functions. The product is based on a well-engineered technical architecture, and VIT gained hands-on implementation experience while developing it. It is well worth evaluating, with the biggest qualification being the risk level associated with a startup.
Data warehouse and data mart projects need to incorporate metadata management and BIDs as part of their objectives. Even with the immature state of the market, the currently available products offer advantages over ignoring these issues and capabilities. Many of the early data warehouse projects built their own BIDs, which is still a viable alternative. However, many IT shops today do not have the resources or time to implement their own custom-built solutions.

TABLE 1. The Potential Customers for a Business Information Directory

| Class of User | Category |
|---|---|
| Business Users | Data explorers; data farmers |
| IT Staff | Data warehouse builders; data warehouse operations; decision-support application builders; business analysts |

TABLE 2. BID Functions for IT vs. Business Users

| BID Functionality | IT Needs | Business Users (Data Explorer) Needs | Business Users (Data Farmer) Needs |
|---|---|---|---|
| Information Discovery | | | |
| Business Understanding | | | |
| Data Access and Delivery | | | |
