Modeling Object/Relational Databases. CASE vendors are beginning to support universal servers, but the tools are still immature. By Seth Grimes.
DBMS, April 1998

Object/relational database management systems (ORDBMSs) add new object storage capabilities to the relational systems at the core of modern information systems. These new facilities integrate management of traditional fielded data, complex objects such as time-series and geospatial data and diverse binary media such as audio, video, images, and applets. By encapsulating methods with data structures, an ORDBMS server can execute complex analytical and data manipulation operations to search and transform multimedia and other complex objects.

As an evolutionary technology, the object/relational (OR) approach has inherited the robust transaction- and performance-management features of its relational ancestor and the flexibility of its object-oriented cousin. Database designers can work with familiar tabular structures and data definition languages (DDLs) while assimilating new object-management possibilities. Query and procedural languages and call interfaces in ORDBMSs are familiar: SQL3, vendor procedural languages, and ODBC, JDBC, and proprietary call interfaces are all extensions of RDBMS languages and interfaces. And the leading vendors are, of course, quite well known: IBM, Informix, and Oracle.

But what about database design tools? Have they been adequately extended or rewritten to help designers build database objects rather than just data structures? Do they model the complete set of new features -- user-defined types, functions, operators, complex objects, and inheritance -- as well as the use of vendor and third-party object modules in the form of Extenders, DataBlades, and Cartridges? Do their methodologies help you decide, for instance, when to create a referenced row type rather than a normalized one? Do their physical schema-generation capabilities create good DDL scripts? Do they speak SQL3 and understand the nuances of the IBM DB2 Universal Database, Informix-Dynamic Server (and its Universal Data Option), and Oracle8 system catalogs?

Before undertaking ORDBMS projects, whether transforming existing relational databases and applications or developing new ones, it is essential to understand the capabilities and methodological approaches of available tools. This article presents an overview of object/relational features that require new capabilities in modeling tools, sketches representative modeling problems, examines modeling methodologies, assesses three representative tools available, and analyzes future modeling and tool directions. I'll describe three DBMS-independent tools with object/relational modeling capabilities: OR-Compass from Logic Works Inc., InfoModeler 3.1 from InfoModelers Inc. (in late January, Visio Corp. announced its intention to acquire InfoModelers Inc.), and Silverrun Technologies Inc.'s Universal Modeler 1.0. I'll also briefly mention other modeling tools such as Platinum's ParadigmPlus that support ORDBMSs.

Object/Relational Features

Object/relational databases organize information in the familiar relational tabular structures. In fact, object/relational implementations subsume the relational database model. ORDBMSs are an incremental upgrade to their RDBMS predecessors, and, unlike the move to object database systems, object/relational migration will not necessarily entail wholesale recoding. Just as a C++ compiler will handle C code despite its lack of classes, once you accommodate the mostly syntactic changes from SQL-92 to SQL3, an ORDBMS should support same-vendor relational schemas. Current ORDBMS implementations do have gaps, however, such as the first-release lack of replication facilities in the Informix Universal Data Option and of inheritance in Oracle8, so implementation details must be kept in mind.

The most important new object/relational features are user-defined types (UDTs), user-defined functions (UDFs), and the infrastructures -- indexing/access methods and optimizer enhancements -- that support them.

User-defined types may be distinct, opaque (base), and row (composite). Distinct types, also known as value types, are derived from other types but have their own domains (allowable value sets), operations, functions, and casts: ORDBMS applications can be strongly typed, helping assure application integrity. System-defined (and not user-distinct) types are familiar: Most RDBMSs implement money as a numeric type with a defined number of decimal places. It is logical to add and subtract money values and to multiply them by scalars, but not by other money values. Money types could have functions such as growth (interest) and casts to real and string types if methods associated with those types are needed.

Opaque types are not derived from source types, so their internal structures must be defined to the DBMS along with their operations, functions, and casts. Once properly defined, an opaque type can be used as a source type for defining distinct and row types and can be used in tables. A row type is a collection of fields of other types; one row type can include another row type nested among its fields. Handling of the constituent fields is derived from their types; the operations, functions, and casts of the row type itself must therefore be defined. Collection types _ sets or lists of values of a built-in or user-defined type _ are another object/relational innovation. It is possible to simulate sets in relational databases, although one risks being criticized by purists who complain of repeating groups.

Cartridges, DataBlades, and Extenders are modules that build on the DBMS's object/relational infrastructure. They consist of types, data structures, functions, and data and often include special developer interfaces or prebuilt applications. (See "Extending Relational DBMSs" by Martin Rennhackkamp, DBMS, December 1997.)

Modeling Problems

To date, ORDBMSs have had their greatest success in managing media objects and complex data such as geospatial and financial time series data. They are frequently used in Web applications and specialized data warehouses. (This judgment is based on my experience starting in the early '90s with Ingres's Object Management module, Postgres, and Illustra, all ancestors of Informix-Dynamic Server, and with Oracle relational and OLAP databases.) Advanced Web applications are notable beneficiaries of the ORDBMS's ability to integrate management of media, traditional fielded data, and templates for dynamic page generation.

Media objects include audio, video, images, and formatted and unformatted text. The data structures themselves are not very interesting; to an ORDBMS, beyond defined access methods they are undifferentiated binary large objects (BLOBs). What is interesting is the new possibility of creating server functions to index, search, and process stored media. For instance, the Informix Web DataBlade includes not only a set of data structures for media-like application (template) pages, it also includes functions that process the SQL queries, variables, and procedural tags embedded in the HTML. (The SQL queries can, of course, invoke other server functions not related to the Web DataBlade.) Application pages are stored with a UDT, an opaque type called html, which has methods defined for their management and processing. The source type is just text; it's the methods that are interesting.

A time series, an ordered array of values indexed by a time value, is a representative complex data type. All the major object/relational DBMSs have or will have one (or more) time-series Cartridges, DataBlades, or Extenders. You can also build your own time-series structures and methods from the ground up. The time-series type, whatever its source, would have a name and other descriptive fields plus one or more date-indexed vectors of values. Redundant storage of the series identifier and dates is eliminated. Series values are no longer accessible via unextended SQL, however, so one would normally need functions and operators to extract and manipulate values and entire series; for instance, a GETVALUE function with parameters SERIESNAME and DATE. We would overload the PLUS operator to handle two cases, addition of a series and a scalar and pointwise sum of the values in two series.

The key lesson of these examples is that object/relational models include both data and processes: what information you have and what you are going to do with it. By contrast, relational databases support only limited encapsulation of operations and processes. Clearly, design methodologies and tools must now model both data and operations, a requirement that calls for approaches that cover both traditional databases and object-oriented applications and would optimally allow designers to both encapsulate functions with data and generate classes or code structures external to the database. Finally, tools should offer the ability to work with built-in types and methods and to extend the model with user-defined types and functions and Cartridge, DataBlade, and Extender modules.

Modeling Methodologies

There are many reasons for creating a model -- and for using a modeling and design tool -- rather than just jumping in and directly programming a database and application code. Modeling can help bridge the gaps between business concepts (conceptual models), database designs (logical models), and physical database implementations (physical models or schemas).

A logical model created with a design tool is insulated from DBMS specifics, allowing (in theory) a single design to be implemented in different DBMSs. Unlike logical models, schemas are tied to specific DBMS server products. For many years, modeling tools have had both reverse-engineering capabilities (deriving logical models from existing databases) and forward-engineering capabilities (generating database schemas from logical models).

Relational and Object/Relational Modeling

Entity relationship (ER) is the traditional relational modeling approach; information engineering (IE) and IDEF1X are variants with methodological and notational differences. Given the relational underpinnings of ORDBMSs, these three approaches have all been adapted for modeling object/relational databases, but they suffer serious conceptual modeling weaknesses and an inability to capture processes. For instance, an often-cited shortcoming of IE is difficulty depicting business objects that are naturally modeled in unnormalized form. The usual example, simplified, is an order, a single conceptual entity that IE would prefer to normalize into purchaser, product, and shopping basket entities.

Object Role Modeling

Traditional RDBMS modeling methodologies typically neglect the conceptual level. Object-role modeling (ORM) is a compelling methodology that runs counter to this rule. ORM concepts have been around for a number of years although the methodology has gained attention only relatively recently through Terry Halpin's work and its implementation in the InfoModeler design tool. The formal object-role modeling language (FORML) encapsulates ORM, taking a systematic, rigorous approach to capturing business concepts. FORML conceptual modelers verbalize data, symbolize or diagram it in sentence form, constrain the conceptual model through facts that instantiate statements, and validate the model.

Unified Modeling Language

There's a new database modeling alternative: the Object Management Group's Unified Modeling Language (UML). UML development was spearheaded by Rational Software Corp. based on the Unified Model for software objects, on so-called enterprise component modeling (ECM). ECM is a complex approach that includes conceptualization and requirements analysis covering conceptual modeling with mappings into classes, components, and distribution in the system and detailed design phases. UML's classes and methods, however, are roughly equivalent to an ORDBMS types and methods.

Modeling Tools for Object/Relational DBMSs

I will now provide an overview of three object/relational database modeling tools, Logic Works' OR-Compass, InfoModelers Inc.'s (Visio Corp.) InfoModeler, and Silverrun Technologies Inc.'s Universal Modeler. The descriptions will concentrate on object/relational modeling, including support for logical models and schemas (physical models). I ran these tools on Windows NT 4.0 and accessed an Informix Universal Server version 9.12.UC2 running on SGI/Irix 6.2.

Each tool lets you specify physical database properties -- spaces, locking mode, and so on -- of tables and other database objects. They also generate physical schemas, either directly to the database or, with the exception of InfoModeler, in data-definition language (DDL) files.

OR-Compass

Logic Works' flagship ERwin/ERX product implements ER modeling with the IDEF1X and IE notations, with forward- and reverse-engineering to all major RDBMSs, synchronization of logical models and physical schemas, a workgroup-oriented model-management system called ModelMart, and links to application-development tools including PowerBuilder and Visual Basic via ERwin/OPEN. (See Figure 1.) In addition, Logic Works offers a tool that exchanges metadata with the Rational Rose object-modeling tool.

OR-Compass is a new product and not simply an extension of ERwin/ERX; however, you can import an ERX model. I worked with OR-Compass betas and version 1.0 with Informix. Logic Works plans to add support for Oracle8, IBM DB2 Universal Database, and Sybase Adaptive Server in early 1998. OR-Compass uses an entity-relationship modeling methodology with IDEF1X notation.

OR-Compass includes two wizards that help to model object/relational features, a Row Type Wizard (RTW) and a Functional Index Wizard (FIW). The RTW appears to be aimed at designers who are migrating a relational database to an object/relational database: it allows you to collapse selected columns from a table into a row type. It does not create row types from the ground up. The FIW appears to be more useful, guiding you through specifying a table index based on values of a computed field.

ModelBlades are also a very useful feature. They consist of definitions and documentation for Informix DataBlade objects, which may be imported from ModelBlade files or DDL scripts. Imported ModelBlade objects are displayed in the Package explorer frame. They may then be included in user-created models.

OR-Compass supports automatic key migration when you create relationships (by right-clicking on a table name and choosing Insert | Outgoing Relationship or by using one of the Relationship icons). You can drag and drop a column from one table to another -- in the diagram but not in the Model explorer. Since row types are not displayed in the diagram, you can't similarly drag and drop their fields from one type to another anywhere. Drag-and-drop diagramming, whether from a palette (tool bar) or from the Model explorer, would make the interface much easier to use.

InfoModeler

InfoModeler Version 3.1 was the first modeling tool to support an ORDBMS with links to Informix. When Version 3.1 was released in August 1997, it introduced the ability to generate schemas for IBM's DB2 Universal Database. InfoModeler's raison d'ýtre is FORML conceptual modeling. It also supports ER and IDEF1X relational logical models.

A serious drawback is that InfoModeler has a very small collection of types built-in. To have access to a full set of object/relational types and functions, whether native to the ORDBMS or part of an extension such as a Cartridge, DataBlade, or Extender, you must have a live database connection, which inhibits offline work. Similarly, a live database connection is needed to forward- and reverse-engineer databases, that is, you can't work with just DDL scripts. These limitations unquestionably pose a problem for mobile or remote developers such as myself. I'm a consultant and do a significant portion of my work in my office away from client sites, and most companies do not allow external developers to access their database machines.

InfoModeler does not yet model server functions, and server functions along with user-defined types make up the ýobjectý in ýobject/relational.ý It will display functions that are already defined in a database -- if, and only if, you have a live database connection. This is obviously a Catch-22 situation; you can work with functions only if they already exist, but they won't exist unless you created them, and you can't create them with InfoModeler. Adequate server-function support is slated for a future release.

Universal Modeler

Silverrun 's Universal Modeler is a suite of interoperable tools including BPM for business-process modeling, ERX (Entity Relationship Expert) for conceptual relational modeling, and RDM for relational data (physical) modeling. The tools link to many RDBMSs and also to development tools including Delphi and PowerBuilder. Modeling teams can use Silverrun's Model Management Center as a model repository.

When I tested Silverrun Universal Modeler 1.0 (UM), it only supported Informix. The company stated that there would be an announcement about DB2 Universal Database support shortly and that the next version of Silverrun's Relational Data Modeling (RDM) tools would have extensions supporting Oracle8 and linking it to Universal Modeler's object-tool component.

Universal Modeler supports three modeling notations, IE, Silverrun (an ER version), and UML. You can begin creating user-defined types by choosing typeModule in the model explorer. You can also model types visually by going to typeModule in the explorer and right-clicking on Find in Modeler. The product also lets you visually model users and Informix Stored Procedure Language (SPL) and external routines. It would be helpful to have an SPL wizard, or at least some type of syntax guide, available.

Universal Modeler lets you import information about Informix DataBlades. (It calls this information "SilverBlades.") You can then use SilverBlade objects in models. However, Universal Modeler has limited ability to specify server functions. It does not allow you to specify negator and commutator functions or an optimizer cost. To create an index on a table column, I had to right-click on the table name in the explorer -- I couldn't find a way to create an index in the modeler -- and then drag and drop a field onto the index.

Other Modeling Tools

It is worthwhile to keep an eye on other modeling tools. Database-design tools that have not yet made the leap to object/relational modeling must add support for user-defined types and functions and other object/relational features.

An interim release Sybase Inc.'s PowerDesigner 6.1 (now shipping) added support for abstract datatypes with full object support and UML modeling to come in a mid-1998 release. Oracle Corp.'s Object Database Designer, a UML product that generates C++ classes and an object-extended database schema, is scheduled for release in early 1998. Oracle Designer/2000 is slated to gain object/relational modeling capabilities in a spring 1998 release. And it should not be a big leap for Popkin Software & Systems Inc. to build an object/relational modeling tool given the UML object modeling capabilities of SA/Object Architect.

ECM tools need to map classes and methods to their object/relational types and functions. They must also give designers the choice of deploying methods in application classes or encapsulating them in the database. ECM tools include Paradigm Plus (Platinum Technology Inc.), Rational Rose (Rational Software), and Select Enterprise (Select Software Tools). Rational's product already generates Oracle8 schemas, although the company has positioned it to complement rather than replace database modeling tools.

Platinum is ahead of the rest of the pack: UML-based ParadigmPlus now models and forward-engineers Oracle8 object extensions in the ECM detailed design and construction phases. ParadigmPlus creates logical and physical models that include abstract datatypes, attributes, and methods, mapping them to both Oracle8 physical schemas and software classes. It bridges the application object model and database model via a ýcommon metamodel.ý

A Mixed Bag

The representative modeling tools I described have much in common with the three major ORDBMSs. They are still immature and functionally incomplete.

In addition to their unique defects, none of the tools provides sufficient methodological or ad hoc logical-modeling assistance in deciding when to use types. The only serious defects, however, are InfoModeler's limitations in modeling functions. Silverrun's Universal Modeler could use some interface work but otherwise functions quite well. Logic Works' OR-Compass needs some functional and interface improvements such as visual display of types. Both Silverrun Universal Modeler and OR-Compass seem to be good physical modelers, although Universal Modeler is missing some function- and physical-modeling capabilities.

ORDBMSs have not yet entered the mainstream, and at this stage only very adept users -- those who most likely are willing to skip conceptual modeling -- will seek to exploit object/relational features. But conceptual modeling is valuable and I believe that InfoModeler is by far the best conceptual modeling tool. Perhaps a hybrid solution is in order for those who need and can afford it: use InfoModeler for initial conceptual modeling and create a logical model and forward-engineer it into a database schema. Then, reverse-engineer the schema into a logical model with a better logical- and physical-modeling tool and use that tool to tune your model.


Figure 1.
OR-Compass's modeling environment includes several explorer windows and a Property sheet in addition to a diagramming canvas. The Model explorer presents a hierarchical view of built-in and user-defined model elements, while the Package explorer shows elements imported via a ModelBlade, which corresponds to an Informix DataBlade. Both are linked to the Property sheet, which displays information appropriate to the type of element highlighted in the explorers.


Company Contact Information
  • Logic Works Inc., Princeton, NJ; 609-514-1177 or fax 609-514-1175; www.logicworks.com.
  • InfoModelers Inc. (recently purchased by Visio Corp.), Bellevue, WA; 425-637-2499 or fax 425-454-7696; www.infomodelers.com or www.visio.com.
  • Silverrun Technologies Inc., Woodcliff Lake, NJ; 201-391-6500 or fax 201-391-2210; www.silverrun.com.
  • Oracle Corp., Redwood Shores, CA; 650-506-7000 or fax 650-506-7200; www.oracle.com.
  • Platinum Technology Inc., Oak Brook Terrace, IL; 630-620-5000 or fax 630-691-0718; www.platinum.com.
  • Sybase Inc., Emeryville, CA; 510-922-3555 or fax 510-922-9441; www.sybase.com.
  • Popkin Software & Systems Inc., New York; 212-571-3434 or fax 212-571-3436; www.popkin.com.


Seth Grimes is a consultant in database and Internet design and development with Alta Plana Corp. You can reach Seth via the Web at www.altaplana.com or via email at grimes@access.digex.net.

What did you think of this article? Send a letter to the editor.


Subscribe to DBMS -- It's free for qualified readers in the United States
April 1998 Table of Contents | Other Contents | Article Index | Search | Site Index | Home

DBMS (http://www.dbmsmag.com)
Copyright © 1998 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated March 13, 1998