DBMS

DBMS ODBC Special Report Understanding ODBC 3.0 Standards and OLE DB

By Ken North
DBMS ODBC Special Report, April 1996

Microsoft's newest version aligns ODBC with the ANSI/ISO SQL-92 Call-Level Interface and adds numerous enhancements.


Open database connectivity (ODBC) is a call-level interface (CLI) for programming SQL applications. It has become the de facto standard SQL programming interface. Version 3.0, which should be available in early 1996, is a major upgrade that aligns ODBC w ith the de jure standard, the ANSI/ISO SQL-92 CLI. Microsoft used a specification from the SQL Access Group (SAG) as a baseline for creating a Windows database API. The original SAG CLI progressed through the standards process with X/Open (now SAG's pare nt) and U.S. and international standards organizations (ANSI and ISO, respectively). Each step in the process produced revisions, so the primary purpose of the next ODBC update is to align it with the new international standard. Microsoft unveiled ODBC 2 .0 approximately 18 months after introducing version 1.0. This upgrade has taken substantially longer, leading to conjecture that 3.0 is a bigger step. Version 3.0 adds 300 pages to the ODBC Programmer's Reference, which confirms that this release repres ents a major increase in functionality.

ODBC 3.0 adds support for Unicode (the 16-bit encoding standard that uses double-wide characters to support international character sets); better facilities for describing data, handling errors, and uninstalling components; and better handling of large b inary and text objects. It also adds batch enhancements, SQL-92 enhancements, and approximately 20 new function calls. SQLGetInfo, the ODBC function that returns information about available DBMS or driver features, returns information about approximately 40 new items. ODBC 3.0 expands the number of data types available to ODBC applications, and Microsoft has added SQL-92 scalar functions, including EXTRACT and CURTIMESTAMP, which were not previously supported. ODBC 3.0 changes the datetime type and adds types for Unicode, intervals, and large objects (character or binary).

ODBC In Operation

ODBC applications operate on data sources, an abstraction that includes information such as location, network libraries, client libraries, database name, and similar unique identifiers. The amount of information needed to specify location is context depe ndent. For text data on a local hard drive, it may be simply path information; remote servers may involve specifying domain names, node names, IP addresses, or other things. The ODBC Administrator application allows users to define and use data sources s uch as "Subscribers" or "Vibration Tests." It associates each data source name with an ODBC driver and all of the information necessary for the driver to get to the data.

The ODBC driver is a software component that implements calls to ODBC functions by performing the appropriate operations against the data source. ODBC drivers are dynamic link libraries (DLLs) in Windows and OS/2, whereas they are shared libraries for Un ix and Macintosh. ODBC drivers for desktop databases are single-tier drivers that incorporate the driver logic with an optimizing SQL engine. Some ODBC drivers use proprietary SQL APIs such as Oracle Call Interface, Sybase, or Microsoft DB-Library. Other drivers don't use a native library. This category is growing as more DBMS vendors implement the ISO CLI as a native interface. Microrim Inc., White Cross Systems, and Watcom International Corp. had early implementations that use SAG/ODBC as a native CLI , but more recently companies such as Tandem Computers Inc., IBM Corp., and Informix Software Inc. added the standard CLI to their DBMS products. Some drivers implement ODBC on top of a native API, meaning that the ODBC driver calls functions in the nati ve library. Others are a functional replacement for, and operate at the same level as, the native library, so they are capable of equivalent or better performance than the native API. The Microsoft SQL Server Driver emits tabular data stream (TDS) packet s directly without using DB-Library. Microsoft has optimized performance of this driver to the extent that SQL Server 6.0 TPC-C benchmarks used ODBC as the client library.

SQL-92 and ODBC use levels to identify groupings of functionality. SQL-92 refers to implementation levels (entry, intermediate, and full), while ODBC documentation refers to the minimum, core, and extended SQL grammar and the core, level 1, and level 2 O DBC API. Applications can make calls at runtime to determine the conformance levels of a driver and DBMS. ODBC uses SQL to access data, and it permits the use of escape clauses and parameter markers in SQL statements. Escape clauses provide extensions to SQL for date, time, and timestamp expressions, outer joins, procedure calls, LIKE predicates, and scalar functions. Parameter markers enable execution time substitution of values for parameters. An application uses SQLBindParameter to supply the data to substitute for a parameter marker.

ODBC manages a variety of resources on once-per-application, once-per-connection, and once-per-SQL statement bases so it uses handles to track resources at each level. An application requires a single environment handle, but it may use multiple connectio ns and multiple SQL statements per connection. ODBC 2.0 includes functions to allocate and free connection, environment, and statement handles, but ODBC 3.0 adds a more flexible, generic handle capability. The new SQLAllocHandle function includes an argu ment that specifies the type of handle to allocate. Since its inception, ODBC has provided asynchronous processing, or nonblocking mode, as a connection option. ODBC 3.0 adds support for asynchronous processing of individual statements. SQL-92 defines fo ur isolation levels: read uncommitted, read committed, repeatable read, and serializable. ODBC supports all four and adds versioning as its fifth isolation level.

ODBC includes a variety of functions to get specific metadata and 3.0 adds extensible capabilities to describe data. To get catalog information about tables and other objects, an application calls several functions such as SQLTables, SQLTablePrivileges, SQLProcedures, SQLPrimaryKeys, SQLForeignKeys, and SQLStatistics. Similarly an application can call SQLColumns, SQLColAttributes, SQLDescribeCol, SQLColumnPrivileges, and SQLProcedureColumns for information such as the name, type, scale, and precision of a column. ODBC 3.0 descriptors are based on SQL-92 and its SQL Descriptor Area (SQLDA). Programmers using dynamic SQL are familiar with the process of allocating a descriptor area (SQLDA) and using SQL's DESCRIBE statement to fill in the SQLDA. SQL-92 s tandardized the SQLDA format, which permits ODBC 3.0 to introduce an analog to the dynamic SQL DESCRIBE process. Attributes and Parameters ODBC 3.0 modifies some terminology such that the new term for "options," as in "connection options" and "statement options," is "attributes." The change means that the statement option for bypassing the driver's scan for escape sequences in SQL (SQL_NOSC AN) becomes a statement attribute (SQL_ATTR_NOSCAN). New functions for getting environment, connection, and statement attributes are SQLGetEnvAttr, SQLGetConnectAttr, and SQLGetStmtAttr, respectively. There are also new counterpart functions that set att ributes.

There have been several changes concerning parameters. Applications will be able to identify which parameter in an SQL statement causes an error when calling SQLExecute and SQLExecDirect. Applications will also be able to use row-wise binding of paramete rs. Figure 1 illustrates how ODBC 3.0's row-wise binding of parameters provides the opportunity for a symmetrical relationship that permits the use of row-wise bound data as parameter values. Currently, bookmarks are 32-bit values, but the next release of ODBC will support variable-length bookmarks. Another big step in ODBC 3.0 is support for Unicode.

Error Handling

Some SQL implementations include the WHENEVER clause to specify that an error-handler be activated when an application encounters an error. This technique is cumbersome when an application contains numerous SQL statements, because the application must so rt out the source of an error while also detecting and responding to SQLSTATE and SQLCODE information. SQL-92 added a diagnostic area, plus GET DIAGNOSTICS and GET DIAGNOSTICS EXCEPTION statements, to improve generic error handling based on SQLCODE and S QLSTATE values. SQL-92 defines the diagnostics header area as containing several fields (NUMBER, MORE, COMMAND_FUNCTION, DYNAMIC_FUNCTION, and ROW_COUNT), so ODBC 3.0 defines identifiers that reference each of the header fields (for example, SQL_DIAG_DYN AMIC_function). ODBC also implements the diagnostic area as an extensible structure.

Two new functions, SQLGetDiagRec and SQLGetDiagField, return diagnostic information associated with a descriptor, environment, connection, or statement handle. An ODBC 3.0 application, like its 2.0 counterpart, is likely to use a loop to process error an d status information. An ODBC 2.0 application loops while calling SQLError until SQLError returns a value (SQL_NO_DATA_FOUND) indicating there is no more error information. With ODBC 3.0, the number of iterations depends on a count of status records main tained in the diagnostic header record and accessed by calling SQLGetDiagField. The diagnostics data area also indicates which column (or columns) triggered errors. To align the state indicator (SQLSTATE) with the international standard, ODBC 3.0 changes all SQLSTATE codes that formerly started with S1 to HY, and all that started with S00xx to 42Sxx. ODBC 3.0 also adds an environment attribute that permits a programmer to specify whether SQLSTATE behavior should be consistent with ODBC 2.0 or the new st andard.

Descriptors, Binding Enhancements, and Batch Processing

ODBC 3.0 greatly expands the ways you can obtain information about columns and parameters. In addition to a data structure for descriptors, ODBC 3.0 provides two ways to view data: as seen by the application and as seen by the driver. The application cre ates and destroys the new descriptor data structure by using SQLAllocHandle and SQLFreeHandle, respectively, although each allocation of a statement handle implicitly allocates a descriptor. Descriptors add several new words to the ODBC user's lexicon. A n application row descriptor (ARD) describes rows as seen by the application, that is, after conversion by the driver. An implementation row descriptor (IRD) describes rows as seen by the driver. Similarly, application parameter descriptors (APD) and imp lementation parameter descriptors (IPD) describe parameters as seen by the application and driver, respectively.

A descriptor table includes one row per column or parameter being described. Descriptors include information such as name, type, length, precision, scale, returned length, collation sequence, and whether a column is nullable. The attributes of each descr iptor type (ARD, IRD, APD, and IPD) are unique, although some attributes, such as type and length, are common. To get descriptor information, an application uses SQLGetDescField and SQLGetDescRec, and to set information, an application uses SQLSetDescFie ld and SQLSetDescRec.

At its October 1995 ODBC 3.0 Design Preview, Microsoft introduced the terms "concise" and "extensible" to classify functions based on their ability to manipulate descriptors. Concise functions do not require an application to allocate a descriptor handle , yet they set or get multiple fields in a descriptor or diagnostic area. Concise functions are not extensible and they are conceptually similar to a macro in that a single function such as SQLBindCol is the equivalent of several calls to SQLSetDescField . Concise functions also provide backward compatibility, because the list of concise functions includes SQLDescribeCol, SQLSetScrollOptions, and SQLNumResultCols for rows, and SQLDescribeParam, SQLParamOptions, and SQLNumParams for parameters. ODBC 3.0 d escriptors are extensible; for example, a driver can implement driver-specific fields. To support that extensibility, ODBC 3.0 includes SQLSetDescField and SQLGetDescField, which provide the ability to set and get individual descriptor fields, including fields that are driver-specific.

ODBC provides the ability to use static, dynamic, keyset, mixed, and rowset cursors. An application uses a rowset cursor for block or array fetches. The SQLExtendedFetch function provides a block fetch capability, and the ODBC Cursor Library provides a c lient-side emulation of block fetches for drivers that do not implement SQLExtendedFetch. ODBC 3.0 adds SQLFetchScroll, a new function that provides block and scrollable fetches with the ability to fetch a single rowset based on a fetch orientation and o ffset value. Programmers using ODBC 2.0's SQLExtendedFetch monitor a row status array, but when using SQLFetchScroll, the programmer can take advantage of descriptor fields. For example, to set the array or rowset size for rows or parameters, a programme r calls SQLSetDescField using the field identifier SQL_DESC_ARRAY_SIZE. SQLFetchScroll also provides the ability to fetch by a search condition using new fields in the ARD that point to a predicate and predicate length. In a typical scenario, an applicat ion will set PREDICATE_PTR and call SQLFetchScroll to find the next or previous row that matches the search condition.

ODBC supports batch operations, and 3.0 returns new information about a back end's ability to support batch operations. When used in the ODBC context, batch operations include stored procedures, bulk operations that use single statements with an array of parameters, and multiple statements delimited by a semicolon. ODBC has always supported bound fetches whereby an application binds table columns to program variables. ODBC 3.0 adds row-wise parameter binding, out-of-line binding for variable length data , and quick rebinding via offsets.

Locators, SQL3, and Nested Results Sets

Another major new feature is support for SQL3 locators, 32-bit runtime token values that provide access to large binary and character objects. ODBC 2.0 applications typically used SQLGetData and SQLPutData to store and retrieve long variable length data. ODBC 3.0 includes five new functions that work with locators and support located update operations. An application will obtain a locator by calling SQLBindCol or SQLGetData and find the length of the large object by calling SQLGetLength. ODBC 3.0 provid es two functions (SQLGetPosition and SQLGetSubString) to locate and extract substrings in large object columns containing text. The new SQLLocator function permits an application to specify whether to hold or free a locator after its parent transaction t erminates.

ODBC 3.0 contains other features that are consistent with SQL3, the standard that will succeed SQL-92. SQLGetInfo adds a return (SQL_CREATE_TRIGGER) to indicate whether a back end supports CREATE TRIGGER, which is part of SQL3. User-defined abstract data types will be in SQL3 and SQLGetInfo added SQL_CREATE_ADT to indicate whether the back end supports abstract data types. ODBC 3.0 also introduces support for nested result sets similar to those produced by SQL Server's COMPUTE by and Teradata's WITH ope rations.

Adaptive Programming

My article, "Understanding ODBC and Multidatabase APIs," in the March 1994 DBMS explained the concept of adaptive programming whereby ODBC applications make execution time calls to get information about driver, database, types, and other features. SQLGetTypeInfo adds new data types while SQLGetFunctions, which returns the functions that a driver supports, adds the new 3.0 functions. ODBC's SQLGetInfo function returns data about features such as default isolation levels, the availability of stored procedures, and cursor context preservation when a transaction completes. ODBC 3.0 adds almost 40 new information returns that are useful for writing adaptive code. SQLGetInfo adds returns for batch processing, schema views, clause enumeration for CREAT E and DROP statements, locator updatability and persistence information, and more.

OLE DB and Multiplatform Applications

OLE DB, the data integration element of Microsoft's Enterprise OLE strategy, uses ODBC to access data from relational data sources. OLE DB is a technology that will integrate data from heterogeneous sources such as spreadsheets, project plans, SQL, and l egacy databases. It is still in development, but Mike Pizzo, Microsoft senior ODBC program manager, reports that it will be capable of using existing ODBC drivers. ODBC already supports a variety of non-SQL data. (See Paul Reed's article "Using ODBC to Access Nontabular Data,".) OLE DB will add support for additional sources as Microsoft adds providers for data sources such as Microsoft Exchange, Word, and Project data.

The current ODBC architecture uses a Driver Manager to provide services to ODBC-enabled applications. For the future, Microsoft is working on new software, code-named Kagera (see Figure 2), that will work with existing ODBC drivers and OLE data providers that access nonrelational data. There are ODBC drivers for multiple operating environments, not all of which will have a full implementation of OLE, so the new 3.0 drivers must operate in a plug-and-play manner with either Kagera or the ODBC Driver Manager.

The proliferation of ODBC in the Windows database world and the emergence of a CLI standard from X/Open, SAG, ANSI, and ISO has contributed to the demand for ODBC on other platforms. Visigenic Software Inc. expects to ship the ODBC 3.0 SDK for HP-UX, AIX , OS/2, SGI, SCO, Macintosh, PowerMac, and Solaris in the same period that Microsoft ships the SDK for Windows users. Intersolv Inc., OpenLink Software Inc., and Visigenic Software Inc. are major suppliers of ODBC driver packs so they will probably relea se new drivers soon after the release of 3.0. However, it isn't clear how quickly we'll see 3.0 drivers from vendors that supply a driver for their own DBMS, or how quickly we'll see ports of OLE to other operating environments. Microsoft has signed lice nsing agreements with Software AG of North America Inc. and Digital Equipment Corp. to port OLE Integration, the non-GUI core technologies of OLE, to non-Windows operating systems.

My article, "Multidatabase Development" (DBMS, October 1994, page 94), explained Microsoft's Data Access Objects (DAO), a collection of objects providing data access to Visual Basic, Access, and Visual C++ programmers. The original implementation was an object layer that used the Access Engine for ISAM and ODBC data, but Microsoft is evolving DAO so that it will eventually work with OLE DB.

DAO and OLE DB provide objects that encapsulate API functionality and simplify programming by raising the level of abstraction. However, experience with VBXs and other component technology indicates that methods and procedures exposed initially by the fi rst generations of components do not provide all of the functionality available by programming with call-level APIs. In simple terms that means that database programming with the ODBC API is likely to be with us for some time as companies refine their ob ject technologies.


Ken North is a software developer, consultant and the author of Windows Multi-DBMS Programming (John Wiley & Sons Inc., 1995). You can email Ken at 71301.1306@compuserve.com.

Figure 1


--ODBC 3.0 introduces row-wise parameter binding, which permits bound row data to be used as parameter values in multirow operations.


Figure 2


--Microsoft's ODBC/OLE DB Driver Manager (Kagera) will be an ODBC data provider for OLE DB. Kagera is compatible with today's ODBC drivers.


Subscribe to DBMS and Internet Systems -- It's free for qualified readers in the United States
April 1996 Table of Contents | Other Contents | Article Index | Search | Site Index | Home

DBMS and Internet Systems (http://www.dbmsmag.com)
Copyright © 1996 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated Monday, October 14, 1996