DBMS

Gazing into a Database Crystal Ball. Monitoring vendor and university research can help you prepare for emerging database technologies. By Peter Brooks
DBMS, April, 1998

What will a database server look like beyond the next major upgrade? This preview peers into the research and development (R&D) strategies of the major database vendors to provide a peek at future database technology being cooked up in their research labs. I will help you determine if your company's database strategy is aligned with the R&D directions pursued by your database server vendors and, consequently, whether you will be able to capitalize on the capabilities that your database products will gain over the next year or two. I'll provide a broad perspective on many areas of vendor and academic research, rather than a detailed analysis of only one or two areas, so that you can judge the overall research trends.

Popular areas of research and development are:

Over time, it will become increasingly important to align your organization's database strategy with that of your database vendors. Databases will become significantly more differentiated in the next two years. Even when vendors are focusing on the same areas of research, the results of the research will be new product features that require the use of proprietary interfaces for a particular database. For example, most of the major database vendors are building or expanding support for object/relational datatypes in their databases. However, the architecture, SQL interface, and system management of these datatypes are significantly different among, say, Sybase, Oracle, IBM, and Informix. A database built and an application written for one vendor will not be easily ported to another.

The benefits of significantly increased database functionality, then, will be balanced by reduced application portability. Less portable applications may seem paradoxical given strong vendor support for open systems, but the explanation is simple: Vendors are supporters of database connectivity (connecting other databases to theirs). They are not supporters of application portability (being able to move data or applications from their databases to others).

This implies that there will be two primary models of corporate database strategy:

This article is based on my interpretation and summarization of research and conversations with senior staff at IBM, Informix, Microsoft, Oracle, and Sybase. While most companies were very willing to discuss their overall database research direction, they were less comfortable discussing specific product plans and time frames. Given the industry's success in meeting product announcement commitments, many people would say this shyness is understandable. Consequently, specific features may be mentioned only to give a perspective on what is currently being researched or developed. Mention of a feature should not be taken as a future product announcement.

Non-Vendor Research

Some interesting research is taking place at universities and research organizations. A peek at the topics being researched at the University of California, Berkeley; University of Colorado; Stanford; and the University of Massachusetts shows that academia is a strong supporter of database research. Bell Labs (now Lucent Technologies) is a major research institution. Of special note, the University of Massachusetts Web page (www-ccs.cs.umass.edu/db.html) has links to more than 100 research-related Web sites, mostly universities. Links are grouped into topics such as object-oriented databases, VLDBs, multimedia databases, and transaction models. There are also links to research conferences, journals, magazines, books, and newsgroups.

Your company should monitor the database research being undertaken in leading universities for two reasons:

Vendor Research

Let's investigate what's brewing in the labs at five leading database companies: IBM Corp., Informix Software Inc., Microsoft Corp., Oracle Corp., and Sybase Inc.

IBM

Befitting its size, IBM has the largest database technology research focus of the vendors I'll discuss. Particularly interesting areas of research include integrating OLAP and object/relational technologies into the core database server. IBM is also aggressively investing in building Internet Java development, communication, and database server capabilities. IBM's key research initiatives involve the following areas:

Core database research is focusing on adding what I will call application functionality and accessibility to IBM's core database engine, the Universal Database Version 5.0, which merges the DB2/Parallel Edition and DB2 Version 2.0. (See Martin Rennhackkamp's "A DBA's View of DB2," DBMS, December 1997.) IBM's R&D in data warehousing includes dynamic bitmap indexing (created during a table scan) and star schema support. IBM will also enhance application functionality, improving the capability of DB2 version 5.0 commands such as "Cube" and "Rollup."
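For readers who have not used these commands, the following Python sketch illustrates the kind of subtotals that rollup-style and cube-style grouping produce. It is an illustration of the general idea only, not DB2 syntax or IBM's implementation; the function names, the sample sales rows, and the use of None to mark a summarized column are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def rollup(rows, dims, measure):
    """Sum `measure` for each prefix of `dims`, from full detail to grand total."""
    totals = defaultdict(int)
    for row in rows:
        for depth in range(len(dims), -1, -1):
            # Keep the first `depth` dimensions; None marks a summarized column.
            key = tuple(row[d] for d in dims[:depth]) + (None,) * (len(dims) - depth)
            totals[key] += row[measure]
    return dict(totals)


def cube(rows, dims, measure):
    """Sum `measure` for every subset of `dims`, not only the prefixes."""
    totals = defaultdict(int)
    for row in rows:
        for size in range(len(dims) + 1):
            for kept in combinations(dims, size):
                key = tuple(row[d] if d in kept else None for d in dims)
                totals[key] += row[measure]
    return dict(totals)


# Hypothetical sample data, invented for this sketch.
SALES = [
    {"region": "East", "product": "A", "amount": 10},
    {"region": "East", "product": "B", "amount": 5},
    {"region": "West", "product": "A", "amount": 7},
]

ROLLUP_TOTALS = rollup(SALES, ["region", "product"], "amount")
CUBE_TOTALS = cube(SALES, ["region", "product"], "amount")
```

Here the rollup yields detail rows, per-region subtotals, and a grand total, while the cube adds per-product subtotals across all regions as well.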

IBM is researching ways to allow applications to take advantage of the existing DB2 video, audio, image, and text extenders. This includes dynamically constructed Web pages and text mining, the ability to use database capabilities to search through text documents stored on disk. One text-mining example identifies common complaints by using data mining algorithms to search through free-form text in letters received by a customer-service organization.

Application accessibility will be improved by better exploiting object model functionality. User-defined datatypes currently permit accessing object model data objects. In the future, IBM will support access via CORBA or ActiveX protocols so that, for example, database triggers could initiate email messages.

IBM's system administration research aims to improve the usability of the client platform and the ability to look across servers and connections without the need to know the specific databases on each server. A Web interface will allow access by any client. System management will become more active and automated using agent and alert technology. For example, the database could automatically initiate a reorganization process when required with no need for human intervention.

One area of performance research is in increasing the scalability of parallel systems based on Windows NT clusters. On the storage side, IBM is looking into large-scale hierarchical storage management with automated archival and retrieval. A customer's database with a reported 200-plus terabytes of data and millions of images is being used to test the technology. DB2/MVS and DB2/400 systems will be able to connect via a gateway using TCP/IP in addition to SNA. This will support two-phase commits across multiple databases. The DB2 optimizer will be used across systems so that, for example, a join between tables on different servers will be accomplished by joining the small result rowsets of each table rather than requiring the transfer of entire tables across servers.
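The cross-server join idea can be sketched in a few lines of Python. This is a simplified illustration, not the DB2 optimizer: each "server" is just an in-memory list, and the table names, columns, and filter conditions are hypothetical. The point is that each side applies its own filter and column selection first, so only the small result rowsets cross the network before the join.

```python
def local_select(table, predicate, columns):
    """Run where the table lives: filter rows and keep only the needed columns."""
    return [{c: row[c] for c in columns} for row in table if predicate(row)]


def hash_join(left, right, key):
    """Join two already-reduced rowsets on `key` with a simple hash join."""
    buckets = {}
    for row in right:
        buckets.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in buckets.get(l[key], [])]


# Hypothetical tables held on two different servers.
CUSTOMERS = [
    {"cust_id": 1, "name": "Ann", "state": "MA"},
    {"cust_id": 2, "name": "Bob", "state": "NY"},
    {"cust_id": 3, "name": "Cy", "state": "MA"},
    {"cust_id": 4, "name": "Di", "state": "CA"},
]
ORDERS = [
    {"order_id": 10, "cust_id": 1, "amount": 250},
    {"order_id": 11, "cust_id": 2, "amount": 300},
    {"order_id": 12, "cust_id": 3, "amount": 50},
    {"order_id": 13, "cust_id": 1, "amount": 120},
    {"order_id": 14, "cust_id": 4, "amount": 20},
]

# Each server reduces its own table before anything is shipped.
ma_customers = local_select(CUSTOMERS, lambda r: r["state"] == "MA", ["cust_id", "name"])
big_orders = local_select(ORDERS, lambda r: r["amount"] > 100, ["order_id", "cust_id", "amount"])

SHIPPED_ROWS = len(ma_customers) + len(big_orders)  # rows that cross the network
FULL_TABLE_ROWS = len(CUSTOMERS) + len(ORDERS)      # rows shipped without pushdown
JOINED = hash_join(ma_customers, big_orders, "cust_id")
```

In this toy case five rows are shipped instead of nine; on real tables with selective filters the gap would be far larger.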

IBM's Internet strategy, called Network Computing, contains products such as Net.Data, Net.Commerce, and Lotus Domino. In addition to continuing research in these products, IBM is also investigating innovations in content-provider rights. Originally, IBM's Cryptolope technology used cryptographic techniques to protect Web-based content so that, for example, someone viewing a music video couldn't copy the video and send it on to 100 of her closest friends. This technology has been significantly overhauled and now uses Java to bind business rules to Web content. The business rules can be used for security and access. In the future, business rules will be used to assist users in filling out forms or to initiate workflow processing.

Informix

Informix is researching advances in application development, scalability, systems management, and application integration. Users of Informix databases should expect improved functionality and performance in the next year. Areas in which Informix is focusing database research include:

Right now, Informix's Universal Data Option (formerly named the Informix-Universal Server) can store and manage text, image, and spatial data. Informix plans to increase the content-management capabilities of the Universal Data Option by incorporating message management. For example, Informix is designing an Event DataBlade that will store asynchronous message queues in the database with full transactional integrity. To the application program, messages in the queue will appear as tables so that queue access can be performed by database application developers using SQL rather than by C programmers requiring specialized message queuing routines.
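To make the queue-as-table idea concrete, here is a minimal sketch using Python's standard sqlite3 module as a stand-in database. It is not the Event DataBlade and says nothing about Informix's actual design; the table name msg_queue and the helper functions are hypothetical. It only shows how ordinary SQL inside a transaction can enqueue and dequeue messages.

```python
import sqlite3


def make_queue(conn):
    """Create the (hypothetical) queue table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS msg_queue ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT NOT NULL)"
    )


def enqueue(conn, body):
    """Add a message; the `with` block commits, or rolls back on error."""
    with conn:
        conn.execute("INSERT INTO msg_queue (body) VALUES (?)", (body,))


def dequeue(conn):
    """Remove and return the oldest message, or None if the queue is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, body FROM msg_queue ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM msg_queue WHERE id = ?", (row[0],))
        return row[1]


conn = sqlite3.connect(":memory:")
make_queue(conn)
enqueue(conn, "order-1")
enqueue(conn, "order-2")

FIRST = dequeue(conn)
REMAINING = conn.execute("SELECT COUNT(*) FROM msg_queue").fetchone()[0]
```

Because the read and the delete happen in one transaction, a failure partway through leaves the message in the queue, which is the kind of integrity the paragraph above describes.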

Informix plans to enable customers to use Windows NT to host multiterabyte data warehouses. Data mining capabilities will be significantly improved by embedding data mining algorithms in DataBlades instead of forcing users to extract data for analysis by data mining tools.

Another research priority for Informix is obtaining transparent database replication and coordination between centralized and remote servers. The Informix Enterprise Command Center (IECC) will be enhanced with improved system management capabilities, such as more automated backups and recovery. Cross-system backup and recovery will be performed using CORBA- and IIOP-based protocols to initiate Java applets. Increased use of agents and action thresholds will allow IECC to automatically recognize and address problems without human intervention. Support for Windows 95, Windows NT, and HTML will allow IECC to be used by administrators anywhere on the network to monitor and diagnose problems, even from home by simply dialing into the Web.

Database support for new clustering and MPP architectures will be used to improve performance by parallelizing queries to reduce the flow of data among platforms. Informix claims its database has achieved 95 to 99 percent of linear throughput when nodes are added, and that a 250-node IBM SP2 system achieved 96 percent. Additional areas of performance research are eliminating client-side caching and improving lock management. Also, an optimizer runtime component will let applications take advantage of new optimizer choices without being rebuilt.

Microsoft

Microsoft expects to increase its R&D budget and staffing levels significantly, perhaps by 200 percent, in the next several years. While most companies emphasize applied research that requires a one- to three-year payback, Microsoft is funding basic software research with both a short-term and long-term result horizon. Among the short-term goals are increasing software usability and application integration and reducing the total cost of ownership.

Microsoft's dedicated research organization, Microsoft Research, is pursuing several advanced projects in the database area:

Many of these research initiatives will require several years of research before results can be incorporated into commercial products.

Oracle

Oracle's overall database philosophy is to foster the growth of network computing (as compared to the hardware devices called Network Computers) with its Network Computer Architecture (NCA). The NCA is a three-tier architecture using open standards such as CORBA, IIOP, and Java. One tier is the thin, lightweight client, one is the application server, and one is the database.

Network computing is larger than the database alone: it includes a transparent distributed database environment with full heterogeneous transaction functionality. Capabilities will include multimode process isolation, load balancing, and transaction integrity without programming. Oracle hopes to support greater performance, reliability, and stability by concentrating functionality into application and database servers rather than end users' Windows computers. Oracle is migrating all its products to this model.

The implication of Oracle's research is that over time, companies with Oracle databases should migrate to an NCA view of the world. This does not mean that all companies should begin installing thin-client hardware, but it does mean that Oracle customers should consider building a server-centric multitier (mainframes, client/server applications, PCs, and NCs) database environment, even if NC devices are not used.

Oracle is increasing R&D in NCA tools. For example, in addition to supporting traditional Windows-based clients, Oracle's Developer 2000 will build Network Computer applications that download Java applets as needed from shared application servers. To assist organizations in building network computing architectures and applications, Oracle has created a Network Computer Center of Excellence staffed by more than 100 consultants.

In addition to network computing, Oracle is also researching:

Oracle is seeking scalability and performance improvements in parallel processing and data warehousing. For parallelization, maximizing the performance of Oracle installed on Windows NT clusters is a priority. For data warehousing, Oracle is researching OLAP and multidimensional performance optimization through techniques such as using parallel bitmap indexing when performing queries that require star joins.
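The following Python sketch shows, in miniature, why bitmap indexes suit star joins. It is a generic illustration rather than Oracle's implementation: Python integers stand in for bitmaps, and the fact rows, column names, and dimension filters are invented for the example. Each dimension filter becomes a bitmap over fact-table row positions, and the filters are combined with fast bitwise operations before any fact row is read.

```python
def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to a bitmap of matching row positions."""
    index = {}
    for position, row in enumerate(rows):
        index[row[column]] = index.get(row[column], 0) | (1 << position)
    return index


def union(index, values):
    """OR together the bitmaps for every value selected by a dimension filter."""
    bitmap = 0
    for value in values:
        bitmap |= index.get(value, 0)
    return bitmap


def matching_rows(rows, bitmap):
    """Fetch only the fact rows whose bit is set in the combined bitmap."""
    return [row for position, row in enumerate(rows) if (bitmap >> position) & 1]


# Hypothetical fact table; store and product are foreign keys to dimension tables.
FACTS = [
    {"store": 1, "product": 10, "amount": 5},
    {"store": 2, "product": 11, "amount": 8},
    {"store": 3, "product": 10, "amount": 3},
    {"store": 2, "product": 10, "amount": 4},
]
store_index = build_bitmap_index(FACTS, "store")
product_index = build_bitmap_index(FACTS, "product")

# Assume the dimension tables resolved "East region" to stores {1, 2}
# and "toys" to product {10}.
east_stores = union(store_index, {1, 2})
toy_products = union(product_index, {10})

COMBINED = east_stores & toy_products  # AND across dimensions
TOTAL = sum(row["amount"] for row in matching_rows(FACTS, COMBINED))
```

Because each dimension's bitmap can be built independently, the work divides naturally across processors, which is presumably part of the appeal of combining bitmap indexing with parallel execution.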

Oracle's object/relational data Cartridge strategy calls for Oracle to build a fundamental set of six to twelve Cartridges such as the existing text, video, and time-series Cartridges, and then allow business partners to add value by extending these fundamental datatypes or creating new specialized Cartridges. Third parties will be able to build data cartridges that access Oracle's database optimizer. For example, storing word processing documents in the database enables administrators to take advantage of traditional database capabilities such as automated backup and recovery and version control.

Oracle8 has message queuing built into the database. In the future, queuing will be more asynchronous. Oracle will adopt a "publish and subscribe" model based on Tibco's (www.tibco.com) technology to connect providers and users of information while minimizing queue transactions. Instead of queuing all available information, only information that users request will be sent.
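A toy Python broker shows the general shape of a publish-and-subscribe model. It is not based on Tibco's or Oracle's software; the Broker class, topic names, and messages are hypothetical. The key behavior is that a message is delivered only to parties that asked for its topic, and a message nobody requested goes nowhere.

```python
from collections import defaultdict


class Broker:
    """Minimal in-process publish/subscribe hub (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register interest in a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver to current subscribers of `topic`; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = Broker()
inbox = []
broker.subscribe("prices", inbox.append)

DELIVERED_PRICES = broker.publish("prices", "IBM 104")
DELIVERED_WEATHER = broker.publish("weather", "rain")  # no subscribers, nothing sent
```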

Oracle's system-management research goals are to reduce the overall cost of ownership and decrease the number of manual errors by performing system monitoring and then taking action on "autopilot" (without user intervention). Oracle is also creating wizards that will initiate and monitor much more sophisticated internode replication.

Sybase

Sybase's database philosophy is to expand the reach of its database on both the high end and low end so that corporate data can exist anywhere. Sybase's current research emphasizes integrating networks of centralized and remote databases. The intent is to optimize and customize data storage by using Adaptive Server Enterprise (formerly Sybase SQL Server) at the enterprise level, Adaptive Server/IQ (formerly Sybase/IQ) for data marts, and Adaptive Server Anywhere (formerly SQL Anywhere) for small devices.

Sybase's research areas include:

To expand its reach, Sybase is researching ways to connect transparently to other databases such as DB2 and Oracle in order to support distributed or federated queries. Sybase will create a global optimization capability that is significantly more powerful than the existing limited optimization of distributed database queries. Sybase plans to allow business partners to support and optimize the performance of nontraditional datatypes by using its component store integration layer.

One interesting research area is the evolution of Adaptive Server Anywhere using "Ultralight" technology to provide a very small database footprint (under 1MB) for small devices such as PDAs and cell phones that require traditional database reliability and integrity. Within this small footprint, Sybase plans to support BLOBs up to 1 or 2GB (depending on the operating system) as well as wireless two-way communication and replication.

Sybase is integrating Java support into the database so it can be interpreted and executed on any platform. CORBA support is also in the works.

Research in system administration seeks to relieve the human system administration requirements by automating many tasks. System administration programs will be increasingly Java based so they can run on any client using an enterprise JavaBeans architecture.

Sybase is researching data compression, optimization, and locking changes to improve its database performance. It is also investigating VLDB parallelization of both queries and backup procedures, concurrent updating, and improved row-level locking.

The Crystal Ball Becomes Clearer

Companies should monitor database R&D in order to align their own database plans with industry directions and to be poised to take advantage of emerging database capabilities. Otherwise, significant time and effort may be required to change applications or upgrade the database as new versions become available.

Among the major database vendors, IBM and Microsoft have, by far, the largest and most diverse R&D budgets. However, all database vendors are investing in research; bigger does not mean better. R&D budgets of high technology companies can range from 2 to 20 percent of sales. (See Table 1.)

Each vendor has at least one prominent database research focus:

Some R&D areas such as message queuing are becoming "the price of doing business": vendors that do not or will not have these capabilities will be at a competitive disadvantage.

To best plan their database future, companies need to monitor database R&D, identify capabilities that will provide a competitive advantage, and decide how aggressively to implement new technology. Careful analysis will allow companies to successfully adopt new technology. Being bowled over by every idea that comes along will cause companies to throw good money after bad.


Peter Brooks, based in Boston, is a management consultant with Coopers & Lybrand Consulting's Integrated Strategic Services organization. He specializes in helping organizations expand strategically and competitively through the application of business intelligence systems, data warehouses, and Internet/intranet technology. You can email Peter at plbrooks@compuserve.com.


Table 1. 1996 R&D Expenditures

Company                   Fiscal Year   Total Company R&D Expense
IBM Corp.                 1996          $3.9 billion
Microsoft Corp.           1997          $1.9 billion
Oracle Corp.              1997          $555 million
Sybase Inc.               1996          $165 million
Informix Software Inc.    1996          $150 million

Source: Company 10K Filings




What Do You See in a Database Crystal Ball?
After you read this article, I want you to be able to make a difference, rather than only being able to silently either compliment the vendors or rant and rave. If there is a particular strategy or set of features (not bug fixes) that you would like to see, please tell us: email your request to Maurice Frank, DBMS' Editor-in-Chief, at mfrank@mfi.com. Your suggestions will be forwarded to vendors, and a future article will summarize reader suggestions. We will also send the results to the database companies that you mention. So if you have an idea for a specific company, mention it in your note. We can't guarantee that all suggestions will be taken, but we can guarantee that you will be heard!



DBMS (http://www.dbmsmag.com)
Copyright © 1998 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated March 11, 1998