Using ODBC to Access Nontabular DataHow several vendors provide ODBC access to objects, documents, hierarchical, and network data sources.
Open Database Connectivity (ODBC) has become a household word, and for good reasons. Since Microsoft Corp. introduced ODBC in 1991 as an answer to the SQL Access Group's long search for a common database access interface, it has matured into the de facto common SQL interface standard for heterogeneous database access to tabular data. DBMS servers such as Oracle7 and Microsoft SQL Server, as well many desktop products such as Xbase data managers (dBASE, FoxPro, and others), Microsoft Access, and Borland' s Paradox, to name a few, all present data in a tabular form organized into rows and columns.
Unfortunately, not all data sources found within organizations support the notion of tabular data. What about data that lives in legacy hierarchical databases on the mainframe? What about data that resides within objects stored in an object-oriented data base? What about data that is stored in network model databases held together by intertwined pointers? Although these data sources fall outside the typical view of a SQL-driven application accessing and manipulating rows and columns of data, they are all data assets that can benefit from one common access mechanism.
Remember that ODBC's mission is to provide a common access interface using SQL. Figure 1 shows the general architecture of ODBC. The application in this case can be built from scratch with any development tool that can call a DLL f unction, or it can be a product that provides data access functionality right out of the box. Two examples of the latter approach are Microsoft Access and Business Objects, an end user query tool from Business Objects Inc.
The ODBC API is a collection of function calls such as SQLGetInfo, SQLFetch, and others that interact with the ODBC Driver Manager (a DLL usually found in the user's windows/system directory), and a driver supplied by the DBMS vendor or a third party. Th e Driver Manager loads drivers on behalf of the application. Applications access databases represented as ODBC data sources, and each ODBC data source must specify a driver. A data source in this case might be an Oracle database or a Microsoft Access MDB file. The job of the driver is to take the ODBC calls passed from the application and translate them into equivalent calls or protocol in the format expected by the target database. These are called two-tier drivers, and two examples are drivers for Ora cle and Sybase SQL Server. In some cases, the driver itself actually processes the SQL and retrieves the data. Borland's Paradox and Microsoft Access are examples of these single-tier drivers.
It is apparent that the driver is key to unlocking the mysteries of the target DBMS. All drivers are not created equal. For instance, there are at least six different Oracle ODBC drivers on the market today. Vendors of products that don't support tabular data must focus their efforts on mapping their data format into the tabular format accepted by ODBC.
EDA/SQL provides an ODBC driver to access IMS data. Figure 2 shows an example of a simple IMS data structure. Notice that the Customer segment, IMS's name for the entity, has two child segments, Order and Communication. There can b e many Order segments and many Communication segments for a single Customer. This hierarchical structure doesn't conform to a relational model of tabular data. Sure, all the columns within Customer are the same, but how can you retrieve customer name and address, as well as each order number and order amount? Or, how about a query to return the customer number followed by the average value of all that customer's orders sorted by zip code?
This is where EDA/SQL's ODBC driver and API/SQL API step in. The EDA/SQL ODBC driver translates ODBC API calls into IBI's API/SQL calls. In Figure 2, the application could view the three segments as three tables. The application generates SQL to join the Customer table and Order table to return the columns desired. Any ORDER BY or GROUP BY processing as well as aggregate and scalar functions are resolved on the mainframe, taking advantage of its horsepower.
Internally, IBI uses a mainframe-based dictionary called "EDA catalog" to determine what specific IMS access mechanisms should be used to resolve the query. The query is translated into IMS calls referencing the appropriate IMS segments. For you ex-IMSer s like me, for a specific Customer this may actually be a Get Unique (GU) call to the Customer segment, followed by Get Next Within Parent (GNP) calls for the related Order child segments. Additionally, to speed access the IMS database administrator may opt to provide a logical view of the two segments as one relational table, although this is highly denormalized. To ensure acceptable performance, the IMS database administrator must know how applications will access the IMS data in order to tune the EDA catalog entries.
IBI recently enhanced EDA/SQL to support INSERT, UPDATE, and DELETE statements against IMS databases accessed through ODBC, thus making it more attractive for more robust client/server applications. However, due to the nature of IMS, some core SQL DDL st atements are not supported by the EDA/SQL ODBC driver. These include table level statements such as CREATE TABLE, DROP TABLE, and ALTER TABLE. EDA/SQL cannot execute these statements dynamically because the internal architecture of IMS requires the use o f specialized administration tools on the mainframe.
Figure 3 shows an application that utilizes NotesSQL. A budgeting spreadsheet in Microsoft Excel needs access to sales forecasts from a Notes database. When invoked by Microsoft Query from within Excel, the NotesSQL ODBC driver (wh ich is delivered as two DLLs) takes the ODBC API calls from Microsoft Query and interacts with the Lotus Notes workstation software. Lotus Notes workstation and server components handle all network traffic and security. The NotesSQL ODBC driver can treat both Notes forms and views as tables. In Notes, forms define the fields on a document, and users create and update documents using forms. Views define and display a selection of documents, as well as calculated columns and sort specifications. Notes vie ws are indexed, and SQL queries on views can use these indexes to improve performance. If a SQL query needs a unique ordering or grouping not supported by an existing view, then a new Notes view should be created to utilize the Notes server's horsepower. Some Notes features are not accessible through the ODBC driver. For instance, Notes view columns that reference certain functions such as @DocLevel and @DocNumber do not return a value when you use the ODBC driver.
The NotesSQL ODBC driver version 1.2, which became available in October 1995, has gone a long way to improve performance over the initial release of the driver. Early performance problems were traced to the initial ODBC SQLConnect function, which dynamic ally cached the system catalog tables. Currently, NotesSQL does not support access to multivalued composite fields. According to Lotus, this was a conscious decision to avoid some rather arcane work arounds using ODBC. They have opted to wait for the wor k coming out of the SQL3 and X3H2 standards groups, which is addressing many of these issues and should be supported in ODBC 3.0. (See Ken North's article, "Understanding ODBC 3.0 Standards and OLE DB," in this issue.) By the t ime this article goes to press, Lotus will have released Notes 4.0 and version 2.0 of their NotesSQL ODBC driver. The new driver is now 32-bit as well as cross platform. Lotus also focused on tuning SQL optimization and index selection on the server side .
Like other products, the Illustra ODBC driver maps ODBC function calls to LIBMI (Illustra's API) and communicates these requests to the server. The standard ODBC interface defines a fixed set of data types, while the Illustra Server supports an extensibl e set of data types. However, you can map Illustra data types to ODBC SQL data types. The Illustra ODBC driver can bind an Illustra data type to a different Illustra data type before the Illustra server returns the data to a client. There are several way s to accomplish this, but the default method is to utilize a lookup table residing on the Illustra server. The server then executes functions that change the Illustra data type to an ODBC data type. If desired, the application can bypass the table lookup approach to bind user-defined data types dynamically by utilizing Illustra extensions to ODBC Option function calls such as "SQLSetConnectOption" and "SQLSetStmtOption."
It is important to realize that depending on what data is returned, such as spatial X,Y coordinates, the client will require logic that can deal with the result set. Illustra is currently working on intelligent objects packaged as OCXs that can be embedd ed within the client application to handle certain types of returned data. If you're a crafty C/C++ programmer and understand the layout of the returned object, then there is no problem.
In some cases, off-the-shelf tools such as Microsoft Excel already know how to deal with the data returned from an Illustra server. A good example involves using Illustra's Time Series DataBlade and ODBC through Excel. Using Excel's ODBC add-in (xlodbc.x la, not MS Query), you can request the 200-day moving average of a stock, and the data returned will be populated into a range of spreadsheet cells. (See Figure 4.) For example, the request could specify Informix's stock price from January 1, 1995 through July 19, 1995 for a 200-day moving average, as well as the high/low information for each day. This can be formulated as a SQL query and passed to the Illustra ODBC driver, which maps it to LIBMI calls on the server. Interacting w ith the Time Series DataBlade, the data would be cast from Illustra's Advanced Data Type (ADT), such as time series data, to an ODBC SQL data type. Once in Excel, the stock analyst can chart it and even overlay it on to another time series such as the NA SDAQ high/low over the same period of time. Again, this scenario uses tools already in-house, without needing additional specialized viewers or reporter generators using ODBC.
Object-oriented database management systems (OODBMS) store any type of complex object, while also providing acceptable levels of performance and a rich programming interface. Object Design Inc. has been shipping its ObjectStore OODBMS server since 1988. While ObjectStore supports both C++ and Smalltalk as native interface languages, at press time, Object Design plans to ship an ODBC driver in March of this year.
ObjectStore uses a schema mapper, which maps C header files to relational views. You can write methods to materialize a relational view from a collection of many different objects. In Figure 5, an ObjectStore Customer Object may al so encapsulate more than one Address Object. Both the Customer and Address Objects would have a C header file, and these header files would map either to one denormalized relational table representing rows of both Customer and Address attributes, or to t wo tables, one for Customer the other for Address. For example, you can use Microsoft Word and ObjectStore's ODBC driver to retrieve the data and generate customer invoices addressed to each customer's shipping address. Word sends the ODBC calls to Objec tStore, which then materializes the result set after consulting the schema mapper, as well as any unique mapping methods. ObjectStore has opted to stay with native ODBC support and does not provide any of its own extensions.
ObjectStore incorporates SQL access technology developed by Dharma Systems Inc. Dharma provides the Dharma/SQL ODBC driver and Dharma/SQL Access (a SQL engine that can be layered on top of any database environment) to companies such as Object Design that want to open their nontabular data stores to SQL access. Other companies using Dharma's technology include Intergraph, a CAD/ CAM vendor, and Objectivity Inc., another OODBMS vendor. Simba Technologies Inc., formerly PageAhead Software, also offers similar layering products.
Titanium uses Dynamic Pointer Arrays to manage complex relationships. Because Titanium is founded on the ANSI/SPARC three-schema architecture, mdbs strives to insulate the external view from the internal and conceptual views of the data. For relational s upport, Titanium uses a table definition file (TDF) to flatten the complex physical data structures. These tabular views are not actually materialized when the TDF is defined, but when the query is submitted either through mdbs' own API or through ODBC i n conjunction with the Titanium API.
In Figure 6, you can notice how Titanium's ODBC driver handles these complex many-to-many and many-to-one recursive relationships. The goal is to produce a report showing the managers' names, their employees, and the assigned tasks , using the popular Crystal Reports (from Crystal, a Seagate Software Co.) reporting tool, which supports ODBC. A SQL query references the relationship defined in the Titanium TDF script. Titanium's ODBC driver on the client maps ODBC calls made by Cryst al Reports to the Titanium API. These calls are then passed to the Titanium server where the relationship, defined in the TDF script, is mapped to the network schema's base records, so that the query set can be resolved. Titanium's ODBC driver currently supports select statements only.
Another network database vendor, Raima Corp. (Issaquah, Wash.) should have released an ODBC driver for their Raima Database Manager (RDM) by the time you read this. Raima took the SQL engine from their Velocis product and used it to build a product calle d the SQLizer, which creates relational views of the RDM data. RDM will also only support SELECT statements initially, but Raima promises full update support in a future release.
* Dharma Systems Inc., 15 Trafalgar Sq., Nashua, NH 03063; 603-886-1400 or fax 603-883-6904.
* Illustra Information Technologies Inc., 1111 Broadway, Ste. 2000, Oakland, CA 94607; 510-652-8000 or fax 510-869-6388.
* Information Builders Inc., 1250 Broadway, New York, NY 10001-3782; 212-736-4433 or fax 212-268-7470.
* Lotus Development Corp., 55 Cambridge Pkwy., Cambridge, MA 02142; 617-577-8500 or fax 617-693-3512.
* Micro Data Base Systems Inc. (mdbs), 1305 Cumberland Ave., West Lafayette, IN 47906-0438; 800-455-6327, 317-463-7200, or fax 317-463-1234.
* Object Design Inc., 25 Burlington Mall Rd., Burlington, MA 01803; 800-962-9620, 617-674-5000, or fax 617-674-5010.





