DBMS
 

 

Taming Data Giants, Part 2

By Stephen Brobst and Owen Robertson
DBMS, March 1997

Managing massive databases requires specialized strategies and techniques.


In part 1 of this article (see "Taming Data Giants," Part 1, DBMS, February 1997, page 38), we discussed several factors that contribute to the galloping growth of very large database (VLDB) implementations. In this article, we review critical VLDB administration issues such as performance management and high availability. We also explore the role of indexing strategies, query optimizers, parallel processing, scalability, and three-tier architectures as strategies you can use to effectively manage VLDB implementations. Finally, we describe how several leading RDBMSs support very large databases.

Partitioning Strategies

So how do you manage a table with billions of rows? Partitioning is the bottom-line answer. The only way to effectively manage billions of rows in a single table is through some form of partitioning within the database. Almost all databases targeted to the high end provide some form of partitioning approach in their implementations. Three primary techniques are used to facilitate partitioning: hash partitioning, key-range partitioning, and round-robin partitioning. The hash partitioning technique requires a DBA to select one or more columns for each table to be used as the input to a hashing function that will decide on which database partition each row of a table will be assigned and stored. A simple hashing function might be to compute a numeric value for the column(s) for each row of a table according to the selected hash key and then take a modulus by the number of partitions for the table to assign a given row to its partition. This, of course, would not be a very good hash function for the purposes of an actual implementation, because it is too simplistic to provide the degree of randomization necessary for a good distribution of data across the partitions (you cannot count on the randomness of the last digit in a key value). A good hashing function will be both efficient and relatively balanced in its distribution of a wide variety of hash keys across database partitions.

Key-range partitioning will distribute data rows across the database partitions according to a pre-defined key range associated to each partition. Some databases will require that the key-range partitioning be performed only on the (unique) primary key for a table. Database maintenance can be dramatically simplified using key-range partitioning. Reorganizations and index rebuilds can usually be undertaken on a partition-by-partition basis to help manage the scope and impact of such tasks. Key-range partitioning is often performed on large tables using a date timestamp column as the partitioning key. This has the built-in advantage of providing a simple way to age data out of the database by dropping the oldest partition out of a table to make space for incoming data for the next time period to be stored. This offers a straightforward method of keeping a rolling number of time periods of data without requiring burdensome database maintenance.

The alternative, without the key-range partitioning, is to issue a mass delete operation using a where clause to qualify rows by date range that are to be purged from the table. If a large number of rows is to be deleted, the operation will often need to be broken out into many steps (sometimes one day of data at a time between commit points) to avoid blowing up the database because of a lack of log space. Notice that in hash partitioning, even if the date timestamp is the partitioning key, the randomization of data rows across the partitions makes it impossible to isolate the rows to be deleted to a single partition for purging purposes; a mass delete operation is the only choice in such implementations. Finally, smart database engines often can avoid scanning or indexing into large portions of a key-range partitioned database if the partitioning key is contained in the where clause or as a join key in a SQL query. The partition elimination is performed by recognizing the mutually exclusive data content for the partitioning column across partitions.

Round-robin partitioning is the simplest model of data placement of the three; rows are merely placed into partitions as if the database were dealing cards to players on a rotating basis. The main advantage of the round-robin technique is that the data is distributed very efficiently when performing high-volume loading or insert-selects into a large table. With simplicity comes speed, and the round-robin technique can distribute rows into a target table approximately five percent faster than the hashing method of partitioning.

Most DSS-oriented DBMS implementations make use of hash partitioning to ensure an even distribution of data across partitions and to provide co-location of joins when two tables are partitioned on the same key. Co-location of a join occurs when the join column in an equijoin between two tables uses columns that also represent the partitioning keys for the two tables. Co-location allows complete locality within each database instance when invoking a parallel join on partitioned tables, because the hashing function will ensure that any two rows hashed on the same join key must coexist in the same database partition. By definition, two tables cannot be stored in different partitions if they are co-located. The co-located join therefore avoids any need for communication of data between partitions across the database instances managing these partitions.

In a shared-nothing DBMS implementation where data partitions are explicitly assigned to DBMS instances, with function shipping as the means for passing data between instances, co-located joins can be executed significantly faster than joins that are not co-located. If a join is not co-located (the join key between the tables to be joined is not the partitioning columns for one or both tables), then an on-the-fly redistribution of data across the database instances must be undertaken to align the two tables for execution of the join. The cost of the redistribution can vary widely across database products and will depend heavily upon the size of the table(s) to be redistributed and how many rows can be filtered locally based on application of where clause predicates. In databases, such as Informix Software Inc.'s XPS, where communication between database instances is very efficient, the performance penalty for joins that are not co-located can be quite small (the typical penalty for a join that is not co-located in Informix XPS is less than 10 percent compared to a co-located join). On the other hand, shared-nothing database implementations that have not optimized for the cost of full table redistributions will often pay a factor of two or more in performance penalty in comparison to co-located joins.

As we mentioned previously, most DSS-oriented database products make use of hash partitioning techniques for their VLDB implementations. NCR Corp.'s Teradata, IBM Corp.'s DB2/6000 Parallel Edition, and Sybase Inc.'s MPP all fall into this category. Tandem Computers Inc.'s Non-Stop SQL and IBM's DB2 V4 on MVS, both of which have good implementations for high-end OLTP workloads, make use of key-range partitioning for large tables. The Informix database engine allows any of the three partitioning options for its high-end implementations. The Oracle database engine will not support partitioning until its 8.0 release, but it offers some clever workarounds in version 7.3 for the interim. A simulated partitioning is achieved by constructing separate physical tables for each desired partition of what would logically be considered a single table. To present a single table image to application programs, a view is constructed with union all operators between each of the physical tables to provide the illusion of one large table. In previous releases of the database, the cost of view materialization would have been prohibitive for any type of SQL query launched against such a construct.

However, Oracle Corp.'s Oracle 7.3 has introduced some optimizer intelligence and syntactic extensions to allow for relatively efficient handling of access to the unified tables underneath the view construct. In this partitioning scheme using physical tables to implement the partitions, the separate tables typically divide the data rows according to key-range values as the basis for partitioning. In Oracle 7.3, the view construct allows a syntax in which the key-range values associated to a table can be specified as part of the view definition (the specification is not enforced, however, and therefore is more of a hint to the optimizer than an actual view constraint). The optimizer uses the specification in such a way that any access to the data in which the partitioning key is used can be made significantly more efficient by eliminating consideration of any table in the UNION ALL view that does not contain any rows with the desired key value(s). Those tables that potentially have qualifying rows must be materialized into the view, but as many WHERE clause predicates as possible are pushed down into the filters against individual tables to eliminate as many rows as possible before bringing them into the view materialization. Oracle can exploit parallelism against even a single partition (table) of the UNION ALL view using parallel query servers in a shared-everything database architecture.

Parallel Processing

Most database vendors have figured out how to deliver scalable performance using parallel-processing hardware against VLDB implementations for decision-support purposes. However, scalability for OLTP applications continues to be elusive at the very high end in relational databases. Within a single database instance, the limit continues to be pushed very hard by the high-end relational database vendors. In recent benchmarking activities, we observed very good scalability of the Oracle 7.3 implementation with more than 40 CPUs on a Cray Research 6400 SMP machine (this division has since been acquired by Sun Microsystems Inc. in the post-merger reorganization of Silicon Graphics Inc. and Cray Research Inc.). There are, however, limits to the scalability that you can achieve within a single database instance. Serialization through locking code and other critical sections constrains the inherent scalability of single-instance implementations. One of the most important innovations in Oracle 7.3 for high-performance OLTP was the introduction of multiple free-lists for block allocation to cut down on contention for this resource. In the previous version of Oracle 7.2, scalability was limited to around 30 CPUs for high-volume OLTP within a single database instance because of the queuing that would take place in competition for (exclusive) access to the free-list for block allocation. The multiple free-lists in 7.3 all have their own locks, thereby reducing contention for what was previously a bottleneck in high-end OLTP.

Although single-instance database implementations on SMP architectures continue to deliver better performance by removing serialization points and implementing more efficient threading mechanisms, multiple database instances continue to have limited success in scaling at the high end on clustered or MPP machine architectures for OLTP workloads. In a multiple-instance database implementation, a single-image database is operated upon by multiple, cooperating database processes. In contrast to a distributed database implementation, in which each database instance is relatively independent, the implementation of multiple database instances in Oracle's Parallel Server environment or Informix XPS has the instances sharing a single view of the database image, rather than separate islands of data.

Within the space of multiple-instance database implementations against a single database image, vendors can choose to deploy either shared-everything or shared-nothing database architectures. A shared-everything database architecture enables all database instances to address and perform I/O against any data block in the database; partitioning is independent of the multiple database instances -- although some implementations may permit an affinity to take place between database instances and table partitions to influence which instances access specific database partitions. In a shared-nothing database implementation, database instances are explicitly assigned to table partitions. I/O to data blocks within a partition must be performed by the database instance assigned to the partition. Any database instance that requires data managed by another database instance uses a function-shipping model to request database objects (not data blocks) to be delivered from the owning instance. Filtering of the data blocks to eliminate unneeded rows and columns is performed locally in the owning instance before shipping result sets back to the requesting instance.

Strengths and Weaknesses

Each approach has some fundamental strengths and weaknesses. For high-end OLTP workloads, the primary Achilles heel for the shared-everything database architecture is maintaining database buffer cache coherency between instances. Because any database instance can access any block in the database, it will often be the case that the same database block will be cached in the buffer cache for many instances. This overlap results in some technical challenges in maintaining coherency of database content between instances if one instance updates its copy of a data block and another instance has an out-of-date block cached in its buffer. To prevent this scenario, the shared-everything database implementations must put into place some form of lock manager to coordinate exclusive locks for updating shared data blocks. If one instance wants to read or write to a data block that has been updated in another instance's buffer cache, the most up-to-date version of the block that has been updated in the currently owning database instance must be flushed out of the buffer and made available for access by the requesting instance.

Depending on the implementation, flushing a data block from a database instance will often require writing out the block to disk to ensure recoverability before passing the block for further updates to another instance. In a very high-volume update environment, as is often typified by OLTP applications, this database buffer cache coherence mechanism will commonly result in contention over frequently updated blocks (especially index block headers) in the database, resulting in massive performance degradation as instances struggle to obtain access and maintain coherency for high-demand data blocks. This effect is affectionately referred to as the block-ping problem among those who are frequently called upon to battle this problem. The impact is severe because those blocks that are updated most frequently are penalized the most in the block-ping scenario. Instead of accessing these high-demand blocks at memory speeds in a database buffer cache, access is gated though lock manager requests and disk I/Os required to flush the block from the buffer cache of one instance before bringing it into the requesting instance's buffer cache. The difference between memory access to a data block and disk I/O access is three orders of magnitude -- quite a price to pay in an OLTP environment for access to the most frequently updated blocks.

Three-Tier Application Architectures

So how are VLDB systems with high performance demands addressing this issue in shared-everything database implementations? The answer is through use of three-tier application architectures, with intelligent deployment of middleware as the critical component for managing performance and access to the database resources. Using two-tier architectures, with 4GL GUI builders providing automatic code generation for database access, is often alluring because of the simplicity of the approach. In fact, this architecture works very well for small departmental systems and for pilot implementations -- particularly because of the productivity advantages of the 4GL GUI tools in helping to get a system coded and onto the end-user desktops very quickly. The problem is that the two-tier architectural implementations simply do not scale to the high volumes that most VLDB implementations demand. Beyond 100 or so (actively) concurrent users, the two-tier model will simply fall apart for an aggressive workload against a VLDB system. Today's database products cannot support the ability of shared-everything database engines to handle a large number of direct connections or deal with the block-ping problem.

Transaction Processing (TP) monitors provide a robust middleware layer to insulate the database engine from these issues. Nearly all successful OLTP applications at the high end of the performance spectrum will be implemented using some form of TP monitor. TP monitor products such as BEA Systems Inc.'s (Sunnyvale, Calif.) Tuxedo and Transarc Corp.'s (Pittsburgh, P.A.) Encina (Transarc is now owned by IBM) dominate market share in the Unix environments, whereas CICS dominates in the MVS marketplace (with a small amount of Unix penetration). The advantages of a robust middleware architecture are many. First of all, business applications logic is explicitly separated from the presentation and database access layers of the application to yield a much cleaner modularization of the systems software. Allowing encapsulation of applications objects in the middleware layer in turn allows a level of reusability and flexibility of deployment that is not possible when presentation and business logic are intermingled. The encapsulation of applications objects in the middleware also supports deployment using a thin-client approach rather than forcing monstrous desktops -- which are usually required with two-tier application implementations. When implemented properly, the middleware architecture is particularly well suited for Web-based deployment, because it separates presentation and business logic.

The block-ping issue is addressed explicitly using the data-dependent routing (DDR) features provided in most TP monitors to allow for intelligent routing of transactions to specific database instances. This approach maximizes instance locality in reference patterns so as to avoid contention over frequently accessed blocks. The three-tier architecture support permits a much higher degree of applications scalability because the TP monitor middleware will manage connectivity to the database much more effectively than a two-tier model. A two-tier implementation maintains one database connection for each desktop, but the three-tier model lets the TP monitor manage the desktop connectivity to a scalable set of application servers. The number of database connections that are instantiated will be the number of concurrently executing transactions to be supported (typically, at least a factor of 10 fewer database connections than in a two-tier implementation).

The three-tier model with a middleware architecture provided by the framework of a TP monitor also allows for much better availability characteristics than a two-tier implementation. A good TP monitor implementation can provide high availability via automatic failover if a database instance or application server experiences an outage.

OLTP on Shared-Nothing Database Products

The shared-nothing database products are not hampered by the block-ping problem because only one database instance is allowed to cache a particular data block into its database buffer cache. Thus you don't need to incur lock manager overhead or block-ping penalties in a shared-nothing database implementation for OLTP. However, the shared-nothing model introduces a different issue for high-end OLTP implementations for most products on the marketplace today. As a result of the data partitioning that is enforced with the shared-nothing model, most vendor implementations end up with database indexing structures that are local to each partition. That is, the index structures for the partitioned tables are partitioned according to how the rows of the table are partitioned, regardless of the index columns or their values. In other words, the indices that point to rows within a partition of a table coexist under the management of the same database instance. This approach provides simplicity in the underlying implementation of the database package, because a database instance can manage row and index maintenance locally without coordination with other database instances.

However, the local indexing approach also leads to scalability problems for high-volume OLTP in a shared-nothing database environment. Almost all OLTP transactions will be executed using very selective indexed access into database tables (the purpose of doing so is to deliver acceptable response times). The problem in shared-nothing database implementations with local indexing arises when a transaction is issued for indexed access against a partitioned table where the index column is not the column that was used for partitioning the table. In such a case, the database has no way of knowing which partition will contain the desired row or rows -- and so the transaction is broadcast-issued to all partitions simultaneously. Each partition can then use its local index structure to efficiently retrieve matching rows or to determine that no row with the desired qualification exists.

The problem with the broadcast issue of a transaction is that this approach is fundamentally nonscalable for high-volume OLTP in a multi-instance database implementation. Scalability implies that by adding more hardware resources and additional database instances, we should be able to increase transaction volumes and throughput in a high-end OLTP system. However, if all database instances must participate in all transaction executions because of the broadcast issue of transactions, we have lost all semblance of scalability. Even though the execution of a transaction may be efficient using local indices within the domain of each database instance, once any instance in the system reaches its saturation point, there is no way to increase transaction volumes by adding more instances; any increase in transaction volume must be handled by all instances (because all transactions are broadcast to all instances).

Note that if the desired index path happens to coincide with the partitioning column for a table (in either hash or key-range partitioning schemes) then a transaction does not necessarily need to be broadcast to all database instances. If the access path is based upon the partitioning column for a table, then there can be only one table partition in which the desired row(s) would be located. Thus the transaction could be properly executed without necessitating a broadcast issue by routing it only to the database instance corresponding to the table partition to which any matching rows must be present (if any such rows exist). Scalability for transaction executions that make use of the partitioning key for table access is quite good because broadcasts can be avoided completely. However, it is very difficult to come up with a database design that accommodates this approach for complex applications deployments. The problem is that many tables will need to support more than one indexed access path in real-world applications.

For example, an account table may very effectively be partitioned by an account number (typically a primary key) to allow scalable OLTP workloads to be implemented when retrieving and updating accounts based on an account number access with efficient local indexing. On the other hand, accounts will very often be accessed via customer number as a foreign key with very high selectivity when accessed via an index. Unfortunately, however, the customer number is not the partitioning key on the table, so the transaction will need to be broadcast to all instances in a multi-instance shared-nothing database implementation for most of the products currently in the marketplace. It is impossible to partition the account table on both the account number and customer number simultaneously, so one of the access paths will result in a fundamentally nonscalable transaction workload implementation. General-purpose workloads, with many access paths, make it impossible to partition tables so that all indexed access will be implemented with alignment to partitioning columns for a table.

The problem in the scalability of OLTP workloads described above is a direct result of the local indexing implementation undertaken by most shared-nothing database products. The local indexing approach works very well for most decision-support workloads because indices don't have nearly the selectivity in these environments -- and therefore the penalty for broadcasting queries to all instances isn't nearly so great as it is in an OLTP environment. In fact, if most instances are going to return at least one row anyway (as will usually be the case in DSS workloads), then a broadcast strategy is quite appropriate. For OLTP transactions with selective indexed access, the performance cost of broadcasting to all instances (in overall throughput) will be roughly proportional to the total number of instances in the database configuration. It is for this reason that most shared-nothing database implementations are focused on the DSS marketplace rather than on OLTP. Among the multi-instance implementations of a shared-nothing database architecture, only Tandem's Non-Stop SQL is able to demonstrate good scalability with high-volume OLTP workloads.

Intelligent Indexing

The key to scalability for OLTP workloads in a shared-nothing VLDB environment with multiple instances comes from indexing structures that are global in nature rather than local to each database instance. In other words, indices must provide global access to data blocks, no matter which instances own the indices and data blocks. For this to happen, the index structures must allow logical or physical pointers to data blocks under the management of instances other than the instance owning the index to the desired data block. Of course, to ensure scalability and manageability of the indices on very large tables, the indices themselves must be partitioned (typically using a key-range partitioning scheme). But rather than have the index partition aligned with the portion of a table to be indexed, the partitioning of a global index will be performed according to the value of the index columns.

Because the global index will be partitioned according to the value of the index columns, access via the index can be provided with explicit routing of the request to the database instance that manages the partition of the index corresponding to the requested value. The global index structure completely avoids the need to broadcast the transaction request to all database instances in the multi-instance implementation. (Note that in the case of an index range scan, multiple instances may be involved in the index access.) The instance that receives the request can efficiently find all database records matching the desired value through use of its partition of the index structure. The matching database records may then be retrieved from whichever database instances happen to have management ownership of the desired records. The partitioned index structure must possess the capability to manage pointers to database records that cross instance boundaries. These index pointers may be physical data block pointers that specify the instance to be targeted along with a data block address within the instance. Alternatively, a logical pointer can be stored that contains the primary key value of the record to be accessed (this technique works best when table partitioning is required to be on a table's primary key). Vendors must consider a variety of performance and maintenance tradeoffs when selecting implementation approaches to pursue within a database product.

There is, of course, an overhead for accessing an index in one instance and then going to a separate instance for the targeted database record. Yet, despite the incremental overhead, this approach is more scalable. As more transaction throughput is required, the architecture allows for the configuration of additional database instances with partitioning on both tables and indices in such a way that no bottlenecks are introduced, either in the form of broadcast transactions or as single points of contention. Only when accessing a table using indices that are not very selective will the cost of the multiple instance coordination for index and record access outweigh the cost of broadcast transactions in a VLDB implementation that has many database instances. In the case of a lack in primary-key type of selectivity for an index -- as is typically found in DSS workloads -- the use of local indexing will often be desirable.

In the future, we believe that both local and global indices will become available for high-end database products. It is likely that DBAs will initially be required to specify what type of structure (local or global) is to be constructed for the columns to be indexed. In the future, it is likely that the DBMS engine itself could determine the most appropriate structure based on the content of the columns to be indexed. Todd Walter, Chief Technical Officer for the Teradata database at NCR, emphasizes that any global indexing scheme must be partitioned in order to avoid single points of contention. As the Teradata database evolves to support a global indexing capability, Walter says that particular attention will be paid to ensuring efficiency and scalability in index maintenance functions for VLDB tables.

Optimizer Quality

Although there are many technical challenges in implementing a VLDB data warehouse system, a recurring theme that tends to differentiate the successful delivery of performance across sites is the quality of the database optimizer deployed. Although it may be possible to successfully deploy an OLTP VLDB implementation using rule-based or naive cost-based optimization strategies, the complexity and unpredictability of DSS queries makes the quality of the cost-based optimizer of the database engine a critical differentiator among products.

Over the last few years, especially as the market trend for data warehousing has really caught on, database vendors have put a great deal of focus on optimizer development. Oracle introduced its cost-based optimizer in Oracle 7.0 and is phasing out the use of rule-based optimization in its product. Gary Hallmark, consulting member of technical staff who focuses primarily on data warehouse issues at Oracle, sees the cost-based optimizer as a key component of Oracle's data warehouse solution with specific features that allow for exploitation of star schema designs, bitmapped indices, and parallel-processing capabilities available with the Oracle Parallel Query option. Hallmark emphasizes the importance of a number of optimization improvements introduced with Oracle 7.3, including "parallel aware'' optimization strategies that enable the optimizer to use parallelism in its execution plans to trade system throughput for better response times on individual queries. Oracle 7.3 also introduced histogrammed statistics for use by the optimizer for better plan generation when skewed data is present within large database tables. The quality of plans generated by Oracle's cost-based optimizer have increased dramatically between Oracle 7.0 and the current 7.3 version of the product, and even further enhancements are expected with the release of Oracle8 later this year.

IBM expects to integrate its advanced research in cost optimization strategies coming out of the Starburst project (undertaken in IBM's Almaden research laboratories in California) into both the MVS and Unix implementations of the DB2 database product. Gilles Fecteau, architect of DB2/6000 Parallel Edition, indicates that customers for this high-end database focused on DSS workloads should see the benefits of the Starburst research in the Common Server implementation of DB2 as early as the first half of 1997. Because the research prototype for IBM's optimizer technology was developed in C++, it will be a much faster path for integrating the advancements into DB2's Common Server database engine (which is also implemented in C++) than will be the case for DB2 on MVS (which is implemented in assembler and PL/X-390).

The Informix XPS database implementation has set a very high performance bar for its competitors with the efficient implementation of a hash join plan that has been made available to the optimizer with pipelining to facilitate parallel execution along multiple dimensions. Oracle has since introduced hash join in its 7.3 database release, and other vendors who have not already implemented hash joins are likely to do so soon. Although hash joins are not relevant to OLTP workloads, for decision support this strategy can deliver remarkable performance advantage over sort-merge joins. The gold standard that all cost-based optimizers are striving for in executing DSS workloads is relative to the Teradata optimizer. The Teradata database implemented a cost-based optimizer totally focused on DSS workloads from day one; it has over 10 years of maturity under its belt directed at this marketplace.

In the big picture, it is generally much more effective to rely on good optimizer strategies than to use the brute-force approach of throwing hardware at a performance problem. With the widespread use of parallel hardware and database platforms, there is often a temptation to use parallel hardware scalability to "hide" performance problems in a data warehouse implementation. However, this solution path can be very expensive. Doubling a hardware configuration may be able to cut response times in half, but good optimizer choices will often result in order-of-magnitude performance benefits. Red Brick Systems Inc.'s DBMS, for example, completely de-emphasizes parallelism in its deployment. Red Brick uses optimization strategies specifically targeted at star schema database designs that are well suited for certain classes of DSS workloads. By making clever use of indexing technologies and optimizing for specific DSS query structures, Red Brick is able to deliver order-of-magnitude better performance characteristics for many decision-support workloads than more traditional relational implementations. The Sybase IQ product has taken a similar approach, with optimization for DSS workloads using a combination of bit indexing and clever table partitioning to deliver high performance. However, for very large data warehouse implementations, Sybase typically recommends deployment of its Sybase MPP product in order to receive the benefits of scalable parallel processing.

The reality is that most data warehouse implementations have not yet run into the technical challenges related to deployment of appropriate optimization strategies for end-user workloads. The critical issues early on in these projects, in fact, have very little to do with database technology per se. The major obstacles have more to do with understanding business requirements and managing end-user expectations than any particular technology issue (in contrast to OLTP implementations, which are primarily focused on the technical challenges related to high availability and performance). Second, data quality is a huge issue in data warehouse implementations because end users are accessing different parts of the data than is typically focused on in OLTP systems. The VLDB marketplace for data warehouse systems still has a lot of maturation ahead of it.

Taming Terabytes

Understanding why modern databases grow so large and so quickly helps you forecast what can happen in your company. When you find yourself grappling with a VLDB, you must carefully assess and implement the strategies, techniques, and technologies that can make your VLDB more manageable.


Stephen Brobst is a founder and managing partner at Strategic Technologies & Systems, a business technology consultancy based in Boston, Massachusetts, and is chief technology officer at Tanning Technology Corp. (based in Denver, Colorado). Owen Robertson is a senior DBA at Tanning Technology Corp. You can email Stephen at sabrobst@jj.lcs.mit.edu and Owen at oroberts@tanning.com.


* Cray Research Inc. (a Silicon Graphics Company), Eagan, MN; 612-452-6650 or fax 612-683-7199; www.cray.com.
* IBM Corp., Armonk, NY; 800-426-3333 or 914-765-1900; www.ibm.com.
* Informix Software Inc., Menlo Park, CA; 800-331-1763, 415-926-6300, or fax 415-926-6593; www.informix.com.
* NCR Corp., Dayton, OH; 800-225-5627 or 513-445-5000; www.ncr.com.
* Oracle Corp., Redwood Shores, CA; 800-672-2537, 415-506-7000, or fax 415-506-7200; www.oracle.com.
* Sybase Inc., Emeryville, CA; 800-879-2273, 510-922-3500, or fax 510-922-9441; www.sybase.com.
* Tandem Computers Inc., Cupertino, CA; 408- 285 6000 or fax 408-285 8611; www.tandem.com.




Subscribe to DBMS and Internet Systems -- It's free for qualified readers in the United States
March 1997 Table of Contents | Other Contents | Article Index | Search | Site Index | Home

DBMS and Internet Systems (http://www.dbmsmag.com)
Copyright © 1997 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated Tuesday, February 11, 1997.