Internet Systems

The Object Database Goes Online

By Nelson King
Internet Systems, January 1997

Can the Internet Help the ODBMS Gain Acceptance as the Undisputed Master of Complex Data Management?


Some people feel the match between the Internet and object database management systems (ODBMSs) signals a change in epochs for data management. Others say, no way, the Internet will support many kinds of database management -- and evidence for the superiority of one approach over another will be a long time coming. Frankly, these opposing positions depend on the degree of belief in the present and future importance of objects, which at the moment is almost as much a matter of faith as fact. Still, the relationship among objects, the Internet, and data management is a tough and important issue, and now is a good time to take stock of what's happening with ODBMSs and the Internet.

Data Management Requirements

Although object-oriented programming has become the standard of the software development community, for object databases it's still an uphill struggle to gain acceptance. Now, along comes the Internet (and Intranet) phenomenon, and suddenly a new set of data management requirements appears that seems to favor the object-oriented approach. To explain why this may be so, I'll start by describing some basic elements of Internet data management.

The relationship between database management and the Internet is evolving, but at the moment three facets can be discerned: information publishing, simple transaction processing, and complex interactions.

Internet information publishing -- essentially, document management -- usually consists of a Web server locating files and displaying HTML pages on demand from a Web browser. It seems fairly obvious that this process can be performed more efficiently by storing HTML pages (or any other document format) in a DBMS. The next facet, simple transaction processing, amounts to getting data entry and output through a browser, which is standard data management. Finally, there are complex interactions, such as a browser session that involves relationships with HTML pages, user data input, and queries, as well as the activity of applets or other program components (Java, ActiveX, and Plug-Ins). Here, data management is appropriate for all of the elements, plus management of metadata (such as recording user activity) for coordination. Nothing dictates that a DBMS must do any of this management, but for reasons of performance, security, and scalability, the Internet (primarily the Web) seems to require one.

In all aspects of Internet database management, it appears inevitable that a high percentage of the data will be nontraditional; that is, in addition to text and numbers, there will be complex data of all types (sounds, images, and video). Dealing with a wide variety of data types, often entailing large volumes of data, may soon be considered a requirement for Internet data management. (See Figure 1.)

One more important piece of the Internet/Intranet mix is an uneasy relationship between Web servers and data management systems. You can see why when you look at Figure 1. Typically, the user clicks on something in the browser that triggers a call to a Web server passing a URL and perhaps a query statement as parameters. The Web server uses the URL to find a script that, in turn, calls an external program, which does the talking to the DBMS. The DBMS engine retrieves the data or objects and either does the HTML formatting or sends the data back to the program or CGI script for formatting. Then the Web server receives the formatted results and feeds them to the browser for display.
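The round trip just described can be sketched in a few lines of Python. This is only an illustration of the chain of hand-offs, not any vendor's implementation: the URL path /cgi-bin/catalog, the products table, and all function names are assumptions made up for the example, and an in-memory SQLite database stands in for the DBMS.

```python
import sqlite3
from html import escape
from urllib.parse import parse_qs


def make_catalog_db():
    """Stand-in for the DBMS: an in-memory SQLite database with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [("coupe", 21000.0), ("sedan", 18500.0), ("wagon", 19900.0)],
    )
    return conn


def external_program(conn, max_price):
    """The external-program step: talks to the DBMS and returns raw rows."""
    cur = conn.execute(
        "SELECT name, price FROM products WHERE price <= ? ORDER BY name",
        (max_price,),
    )
    return cur.fetchall()


def cgi_script(conn, query_string):
    """The script step: parse the request, fetch data, format it as HTML."""
    params = parse_qs(query_string)
    max_price = float(params.get("max_price", ["1e9"])[0])
    rows = external_program(conn, max_price)
    items = "".join(
        f"<li>{escape(name)}: {price:.2f}</li>" for name, price in rows
    )
    return f"<html><body><ul>{items}</ul></body></html>"


def web_server(conn, url):
    """The Web server step: split the URL, find the script, return the page."""
    path, _, query_string = url.partition("?")
    if path != "/cgi-bin/catalog":  # hypothetical script location
        return "<html><body>404 Not Found</body></html>"
    return cgi_script(conn, query_string)
```

Every request passes through three layers of code before the database is touched, and the result passes back through the same layers. That is where the performance penalties accumulate.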

This process is obviously complicated, with possible performance penalties all along the way, especially when in demand by hundreds of browsers simultaneously. Improvements to CGI, such as Netscape's NSAPI and Microsoft's ISAPI, don't fundamentally alter the architecture, whereas incorporating Web server functions in the database server is a more significant architectural evolution. Dealing with this server process, or going around it, is a requirement for Internet database management.

At fundamental levels of data management, there's nothing very new in the Internet requirements; however, the novel combinations, high volumes, and technical nuances present an awesome challenge -- and opportunity -- to the ODBMS community.

ODBMS and Internet Requirements

What kinds of data management are appropriate for the Internet -- or more to the point, fit the requirements better? For information publishing, or for managing Web pages or other documents, a flat-file data manager can handle the job, although an RDBMS or ODBMS might do it better. However, HTML pages with large amounts of embedded multimedia data or Java-style applets that make numerous dynamic calls to the database engine may cause problems. This is one area where an ODBMS has an advantage.

Flat-file and relational systems traditionally handle multimedia data as Binary Large Objects or BLOBs, one BLOB to a file or row-column intersection of a relational table. As such, little or no information about the BLOB is stored, and the content of the BLOB is typically not searchable.

Object systems, on the other hand, incorporate the BLOB as part of a complex of metadata, information about the object, which enables more sophisticated manipulation, including some kinds of searching. For similar reasons, an ODBMS can support new data types much more readily than a flat-file data manager or RDBMS.
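The contrast between an opaque BLOB and an object that carries its own metadata can be made concrete with a small Python sketch. The table layout, the MediaObject class, and its fields are all hypothetical, chosen only to illustrate the idea; a real ODBMS would define and persist such classes through its own schema tools.

```python
import sqlite3
from dataclasses import dataclass, field

# Relational-style storage: the image is an opaque BLOB in one column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE media (id INTEGER PRIMARY KEY, payload BLOB)")
conn.execute("INSERT INTO media (payload) VALUES (?)", (b"\x89PNG...bytes",))
# The engine can hand the bytes back, but knows nothing about what they depict.
blob_row = conn.execute("SELECT id, payload FROM media").fetchone()


@dataclass
class MediaObject:
    """Object-style storage: the same bytes, wrapped in descriptive metadata."""
    payload: bytes
    media_type: str
    width: int
    height: int
    keywords: set = field(default_factory=set)

    def matches(self, keyword):
        """Case-insensitive keyword test against the object's metadata."""
        return keyword.lower() in self.keywords


def search(objects, keyword):
    """Search on metadata; the binary payload itself is never inspected."""
    return [obj for obj in objects if obj.matches(keyword)]


# Sample objects (placeholder bytes, not real media files).
store = [
    MediaObject(b"\x89PNG...bytes", "image/png", 640, 480, {"coupe", "red"}),
    MediaObject(b"RIFF...bytes", "audio/wav", 0, 0, {"engine", "sound"}),
]
```

Calling search(store, "red") returns the image object because the keyword lives alongside the payload. The BLOB row offers nothing comparable to query against.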

A loaded HTML page is actually a compound object, which an ODBMS can excel at assembling. Web servers and some RDBMSs use operating system file services, with all of the disk access overhead of loading an arbitrary 4K (or bigger) chunk, but most ODBMS engines retrieve only the actual size of the object and may also store it in memory cache for subsequent use. This relative efficiency is one reason why many ODBMS engines are being adapted to take over the job of the Web server -- or at least the data management portion.

The data caching that is essential for ODBMSs also makes them excellent proxy servers. These are servers used by Intranets, usually behind a firewall, to cache frequently accessed information, which speeds up browser retrieval and cuts down on traffic through the Internet gateway.
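The caching behavior described here can be approximated with a least-recently-used cache. The sketch below is an assumption about how such a cache might work in principle, not a description of any particular ODBMS or proxy server; the ObjectCache class and the dictionary standing in for disk storage are invented for the example.

```python
from collections import OrderedDict


class ObjectCache:
    """A small least-recently-used cache in front of a slower backing store."""

    def __init__(self, backing_store, capacity=2):
        self.backing_store = backing_store  # a dict standing in for disk
        self.capacity = capacity
        self.cache = OrderedDict()
        self.disk_reads = 0                 # counts trips to the backing store

    def get(self, key):
        if key in self.cache:
            self.cache.move_to_end(key)     # mark as most recently used
            return self.cache[key]
        self.disk_reads += 1                # cache miss: go to the store
        value = self.backing_store[key]
        self.cache[key] = value
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used
        return value
```

Repeated requests for the same page or object are served from memory, so disk_reads grows only on a miss. A proxy server applies the same idea to traffic through the Internet gateway.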

Transaction Processing

Simple transaction processing is considered RDBMS territory, especially with large volumes. Relational systems are efficient at making a join between two tables, especially to retrieve many rows from a one-to-many relationship. Because most standard business data still follows this model, and much of the world's data is now stored in an RDBMS, there is no reason to expect that RDBMS engines won't be used to supply and gather a lot of Internet data. This reality is so fixed that ODBMS vendors are at pains to provide ways of connecting to common relational databases. They know they will have to share at least part of the stage with the giant RDBMS companies.

There's a caveat to the transaction prowess of the RDBMS. On the Internet (Web), it's expected that multimedia data will be an integral part even of simple transactions. For example, I can imagine a catalog transaction that requires simultaneous presentation of product images and schematics combined with traditional data, like descriptive text and prices. RDBMSs don't handle this mix well; for this reason, many major RDBMS vendors are turning to schemes that incorporate some of the object database approaches, the so-called Object Relational DBMS (ORDBMS).

The ORDBMS

In most cases, the object part of an ORDBMS is an extension to the relational system specifically to handle complex data types, including multimedia data. As an example, Informix Software Inc. purchased Illustra Information Technologies Inc. and its object database technology, DataBlade, and is welding it into the main Informix RDBMS to form what is being called a Universal Server. On the Internet, Web DataBlade is a collection of tools that, among other things, embeds SQL in HTML documents for quick retrieval and then generates HTML pages from the stored data, including multimedia and hypertext links. Oracle, IBM, and other RDBMS companies have similar products in the works (with or without acquisition).
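The general idea of embedding SQL in an HTML document and expanding it at request time can be sketched as follows. The {{sql: ...}} marker is a hypothetical syntax invented for this example and is not the actual Web DataBlade tag format; SQLite again stands in for the database server.

```python
import re
import sqlite3
from html import escape

# Hypothetical marker syntax: {{sql: SELECT ... }}
SQL_MARKER = re.compile(r"\{\{sql:(.*?)\}\}", re.DOTALL)


def render(template, conn):
    """Replace each {{sql: ...}} marker with the query results as list items."""
    def expand(match):
        rows = conn.execute(match.group(1).strip()).fetchall()
        return "".join(
            "<li>" + escape(" ".join(str(col) for col in row)) + "</li>"
            for row in rows
        )
    return SQL_MARKER.sub(expand, template)
```

Because the query lives in the stored page itself, the database server can generate the finished HTML without a separate CGI program in between.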

ODBMS people sneer at these hybrid solutions because they offend their notion of object theory, and because they feel they're a "kludge." Nevertheless, the approach works. It provides more flexibility to the relational system and solves some of the problems encountered in transactions with multimedia data; however, the hybrid approach may not be sufficient for true electronic commerce.

Electronic Commerce and Complex Interaction

We've all seen Web pages with pictures of products and places to enter orders; but these pages are only the simplest form of electronic commerce, and nowhere near the visions of its ultimate form. Here's an example of what that might be like:

Suppose you're interested in buying a car. You find a Web site that has cars for sale, and your session begins. Off the bat, the site gets your name and uses a DBMS to look up information about you, especially if you've shopped there before. Directly or indirectly, you're being qualified. Perhaps a Java applet runs to get some information from you, and when the program is satisfied that it has a starting profile of your wishes, it begins to feed you information interactively: color photos of cars, videos of cars in motion, stat sheets, financing information, schematics, perhaps a salesperson's voice-over of features, listings of local dealers. It might even broker a pricing auction among dealers for your purchase. Perhaps 10 or more applets and special controls are involved, along with megabytes of data, including some metadata about your profile and the status of the current session.

In the relational approach, this session would require a vast number of tables, involving very complex data types and a network of persistent joins, as well as on-going data entry and numerous interactive SQL queries. Each query statement would need to be parsed, the joins optimized and created, and then records searched one by one to match criteria. This is called overhead, and as the number of tables in each query grows, the performance of the queries would decrease. Executing scores of queries per browser session such as the one described wouldn't be unusual. Even with an ORDBMS, the process would prove disk-intensive and time-consuming.
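A scaled-down version of the car-shopping session shows how quickly the joins pile up. The schema below is an assumption made for illustration (five tables, where a real session would need many more), using SQLite as the relational engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sessions  (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE cars      (id INTEGER PRIMARY KEY, model TEXT, price INTEGER);
    CREATE TABLE viewed    (session_id INTEGER, car_id INTEGER);
    CREATE TABLE media     (car_id INTEGER, kind TEXT, payload BLOB);

    INSERT INTO customers VALUES (1, 'Pat');
    INSERT INTO sessions  VALUES (10, 1);
    INSERT INTO cars      VALUES (100, 'coupe', 21000), (101, 'sedan', 18500);
    INSERT INTO viewed    VALUES (10, 100), (10, 101);
    INSERT INTO media     VALUES (100, 'photo', x'00'), (100, 'video', x'01'),
                                 (101, 'photo', x'02');
""")

# One view of the session already needs a four-way join across five tables.
SESSION_VIEW = """
    SELECT c.name, car.model, car.price, m.kind
    FROM sessions s
    JOIN customers c ON c.id = s.customer_id
    JOIN viewed v    ON v.session_id = s.id
    JOIN cars car    ON car.id = v.car_id
    JOIN media m     ON m.car_id = car.id
    WHERE s.id = ?
    ORDER BY car.model, m.kind
"""


def session_view(session_id):
    """Parse, plan, and run the join each time the session state is needed."""
    return conn.execute(SESSION_VIEW, (session_id,)).fetchall()
```

Each call to session_view repeats the parse-and-join work. Adding financing data, dealer listings, or profile metadata means another table and another join in every such query.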

With an ODBMS, the session object and all of its constituent objects would be defined in one place; the database engine would simply pick up or connect to the relevant objects. Even the Java applets and their referenced objects could be stored and organized by the ODBMS. The ODBMS can usually go directly to data, following navigation paths to objects that are mapped ahead of time. To obtain this direct access, object database systems use pointers to locate data, usually coupled with hash tables and other lookup techniques. An ODBMS typically puts as much of this data as possible into RAM, or at least into a ready cache. On this basis, at least in theory, ODBMS vendors claim performance superiority.
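The pointer-following style of access can be sketched with ordinary Python object references and a dictionary acting as the hash table from object ID to object. The class names, the OidTable helper, and the integer object IDs are all hypothetical; real ODBMS engines implement persistence, caching, and pointer resolution in their own ways.

```python
from dataclasses import dataclass, field


@dataclass
class Media:
    kind: str
    payload: bytes


@dataclass
class Car:
    model: str
    price: int
    media: list = field(default_factory=list)   # direct references to Media


@dataclass
class Session:
    customer: str
    viewed: list = field(default_factory=list)  # direct references to Car


class OidTable:
    """Hypothetical in-memory store: a hash table from object ID to object."""

    def __init__(self):
        self._by_oid = {}

    def put(self, oid, obj):
        self._by_oid[oid] = obj
        return obj

    def get(self, oid):
        return self._by_oid[oid]


def session_view(table, oid):
    """One hash lookup, then plain pointer navigation; no joins are computed."""
    session = table.get(oid)
    return [
        (session.customer, car.model, car.price, m.kind)
        for car in session.viewed
        for m in car.media
    ]
```

The relationships are fixed when the objects are stored, so reading the session is a matter of following references that already exist instead of rebuilding them with a query.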

However, although performance is important, the main issue remains management of complexity. An RDBMS schema would need to do backflips to model a complex interaction such as the one I describe, and it would probably find it impossible to keep up with the dynamic relationships among the objects during the session; the ODBMS approach is predicated on a dynamic schema and can provide appropriate tools to model the session and guide its progress.

In this view of electronic commerce, the key issue is not how to shovel around large chunks of multimedia data efficiently, but how information will be organized on the Internet. Without objects, object distribution, and object management, the ultimate commercial uses of the Internet won't be possible -- or the solutions will be an unreliable technical patchwork. Microsoft, IBM, Sun, Netscape, and a lot of other companies have already reached this conclusion. This is also where Internet database management meets the world of distributed objects (CORBA and DCOM). Complex interaction among objects is the turf staked out by the ODBMS vendors, obviously because storing and managing objects and their relationships is what an ODBMS does best. If it turns out that objects are, indeed, the key to the future of the Internet, then object databases are going to have, at the very least, an edge.

ODBMS Products for the Internet

While maintaining as a matter of course that the world is coming to objects, most ODBMS vendors see the Internet/Intranet as the driving "case in point" -- the practical example where object technology has a clear advantage and object databases fit perfectly. Surprisingly, the rush to the Internet is not universal among the ODBMS companies. Out of 16 I've identified, only half currently have Internet products. A couple don't even have Internet sites. (This shortage makes me wonder about the commitment or acuity of the companies, but it could also reflect their small size, lack of resources, or inability to deal with the current state of flux in object standards.)

During the fall of 1996, the pivotal Object Database Management Group (ODMG) was still hammering out important specifications, such as the interface to Java Database Connectivity (JDBC). In the meantime, the Internet juggernaut races on, with other object-related standards, such as HTTP vs. Internet Inter-ORB Protocol (IIOP) and CORBA vs. DCOM, going through convulsions. You can hardly consider this environment convenient for product development; but I'll cover a sampling of ODBMS vendors that have chosen to damn the torpedoes and Web-enable their products.

Versant Tackles Niches

Like other ODBMS vendors, Versant Object Technology Corp. offers multiple Internet products that take advantage of niches where their message carries the most credibility. For Versant, this strategy stems from its experience in the telephony industry, where customers such as AT&T have been using the Versant engine for large-scale switching and routing management. Currently available Internet products include:

Object Design Enhances ObjectStore

Object Design Inc. currently has the most complete lineup of Internet products, all of which complement the main ODI product, ObjectStore ODMS.

ObjectStore also offers products, including ObjectStore Dbconnect and ObjectStore Open Access, that provide two-way access with data in an RDBMS or other databases.

O2 Technology

O2 Technology Inc., a French company with offices in Palo Alto, California, started from a consortium of academic and corporate research work during the 1980s. Its primary product is the O2 Database System, a fully ODMG-compliant object database engine.

Dominant Objects

The Internet/Intranet is a great opportunity for the ODBMS companies, but often their ultimate success is predicated on the Internet moving to the prognosticated world of electronic commerce and distributed objects, at which point the ODBMS way of managing data seems not only natural but indispensable.

In the meantime, however, other pertinent issues remain. Some of these issues will hound all database companies as they approach the crowded traffic of the Web: performance, reliability, security, and scalability. But ODBMS vendors must also prove that their data management model intuitively suits the Internet and that their individual products effectively use the model to satisfy the requirements of Internet database management.

In the time before electronic commerce and distributed objects become a reality, ODBMS vendors will get a fairly lengthy opportunity to test their products against the volume and complexity of the Web. The ODBMS approach, based as it is on a network of pointers, introduces complexities and difficulties of its own. If nothing else, for many users ODBMSs are a new paradigm that must be understood and approved by DBAs (and their supervisors). Because ODBMS products are relatively new and possess a mixed track record, they have much to prove before achieving mainstream acceptance.

This need for a mainstream audience highlights the greatest problem faced by the ODBMS companies, which is neither technology nor even competition: It's customer ignorance, including a general lack of knowledge about how important objects are going to be for the Internet and what role the ODBMS will play in that scenario. As the history of the computer industry has shown so often, the best or most appropriate technology doesn't always win. Marketing and customer inertia have an impact. Perhaps ORDBMS approaches from the entrenched giants of the database industry will be "good enough," even if they don't offer everything a true ODBMS does.

Regardless of their fitness for the Internet, ODBMS vendors need clarity of concept and superiority in execution. If you're looking for signs of their success, you'll know that object databases (and probably electronic commerce) have arrived when a Microsoft or an IBM buys one of the ODBMS companies outright, so they can claim an ODBMS of their own.


Nelson King has been a database application developer for over 15 years and has written eight books on the subject. You can contact him via email at nhking@winternet.com.

Figure 1. The Web server database process involves a complex conversation among the browser, Web server, scripts, external programs for communicating with the DBMS, and, finally, the DBMS itself.



DBMS and Internet Systems (http://www.dbmsmag.com)
Copyright © 1997 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated Friday, December 13, 1996.