
Imagine the following scenario. The DBAs at ACME Inc. acknowledge that their database is sluggish. In fact, it is so slow that the users queue at the coffee machine each time they submit a query or a request for a report. Then one fine day IT management cannot take the users' hints, suggestions, and complaints anymore, and voilà, it jumps to action. Two weeks later, a so-called "performance consultant" squeezes a day from his busy schedule and nonchalantly saunters past security into the machine room at 10 a.m., dressed in designer jeans and well-worn Nike sneakers, but nonetheless with a gold and silver Cartier on his wrist and the latest gleaming sports model idling in the car park. He takes a hurried look at the machine's configuration, mutters, "Hmm, thought so," and calmly asks for the passwords of the root and DBA accounts, as if he were asking the way to the men's room. He then effortlessly surfs his way through the online operational database, querying the system catalogs and muttering a few more "hmm"s while scribbling some illegible notes on the back of an old database definition printout. Then - what audacity - he asks, no, he demands a temporary directory on the operational machine. Using vi as if it were the latest and greatest Windows 95 GUI word processor, he quickly creates a few scripts in the temporary directory. Ignoring one DBA's open-mouthed stare, he pulls him closer, saying, "Give this config to your system manager, run this script now against all the databases, run that script tonight when the users sign off (it will run for six-and-a-half hours), and configure your DBMS server with these settings next time you bring it down." He's out of there at 3 p.m., tires screaming out of the car park, off to the nearest ski resort for some late afternoon exercise after a hard day's work. Miraculously, one week later, once the DBA has regained his breath and performed the prescribed tasks, the queue at the coffee machine is empty, the users are smiling, and the database is flying . . . until one day, six months later, when entropy sets in again.
Although slightly tongue-in-cheek, the scenario I just described is not so far-fetched. Perhaps the part about the consultant zooming off to a ski resort is a bit of an exaggeration -he probably zoomed off to his next appointment. Nonetheless, it is no wonder that some IT personnel consider database performance tuning more of an art than a science. (But then, if you think about it, even the great masters of the arts used different tools and techniques.) In last month's column I investigated database performance monitoring and the widely differing tools provided by the various DBMS vendors to monitor their databases' performance. This month I look at the techniques and facilities available in some of the popular DBMSs to improve performance.
There are many more parameters that you can specify and that are derived according to the specified parameters - I would need pages of text to explain them all. For the logging subsystem, you can specify whether dual logging is enabled, where the log file(s) must be situated (on a raw device or a regular file system), how big the log must be, and how often and in what size blocks the memory buffers must be written out to the log file. For the locking subsystem, you can specify how many locks are expected per transaction, when locks must be escalated from page-level locks to table-level locks, and more. The configuration of the DBMS server process and its various subsystems can have a great influence on the system's performance; for example, query-intensive applications such as warehouses perform well with large read-ahead buffers, while online updating OLTP applications perform better with more numerous, but smaller, log buffers and more write-behind threads.
For each database, you can specify where its data files, journal files, checkpoint files, and temporary work files should be placed. In general, a database performs better if its data, log, journal, and checkpoint files are spread over different disks to give a greater degree of parallel I/O.
For each database table, you can specify over which database locations it must be spread, whether the table must be journaled or not, the number of data pages that should be preallocated, and, when full, the number of data pages by which the table should be extended. You can specify the storage structure of each table as B-tree, hash, heap, or ISAM. For each storage structure, you can specify the type of compression (if any) as well as the filling patterns for the internal index pages and the actual data pages. You can specify similar implementation details for each index, except that an index may not have a heap structure. The choice and implementation details of a table's storage structure, as well as its corresponding indexes, can have a big influence on the performance of the queries accessing the table. For example, range queries perform better on B-tree and ISAM structures, and direct-hit keyed queries perform better on hash structures.
CA-OpenIngres is extremely tunable, from the DBMS server to the database implementation and even down to the individual tables and indexes. The default implementation options used by CA-OpenIngres usually give acceptable performance for medium-sized databases and tables. The extensive tuning options enable a skilled DBA (or a skilled performance consultant) to make the database applications perform extremely well. It is possible to tune a database to match exactly the queries performed by the applications. However, it is just as easy for an unskilled DBA to specify the wrong options, resulting in very poor performance. DBAs and performance consultants who have worked extensively with CA-OpenIngres are often frustrated at the limited tuning capabilities of some of the other DBMS products.
The OnLine Dynamic Server's memory configuration is dynamic and lets you configure the memory, CPU, and disk utilization. You can specify a default shared memory size and the size by which it must be incremented when required; you can also allocate and deallocate shared memory on the fly as well. You can configure how many data buffers must be used to cache the database pages in memory, as well as the sizes of the three logical log buffers. You can specify the number of virtual processes the server starts up initially - this number can also be increased, and in some cases decreased, while the system is running. You can also specify the processor affinity, which can bind the virtual processes to specific processors. If you use the PDQ option, you can specify the percentage of CPU and memory that any query can occupy. This feature can be used to limit the impact of large CPU-intensive queries on fast, small OLTP transactions. In addition, you can configure how much read-ahead the asynchronous threads must perform for sequential table or index scans. You can also specify the maximum number of locks that can be held at any one time, up to a limit of 8,000,000.
You can spread a table or an index over one or more disks or over portions of a disk. A table is assigned to a so-called "dbspace" composed of one or more "chunks" that, in turn, correspond to all or part of a disk partition. Tables and indexes may also be fragmented horizontally across different disks, either by expression or on a round-robin basis. The degree of fragmentation that you choose can improve the performance of decision-support queries, and it also reduces the disk I/O contention. You can allocate large text or byte columns to a so-called "blobspace" on a separate device. However, you cannot influence the storage structures used for tables and indexes. Informix uses B+ tree-like structures for indexes, and these structures can be clustered or non-clustered. Informix can store its data on a file system or on a raw disk partition.
The Informix OnLine Server uses a cost-based optimizer, which takes the estimated I/O requests, the estimated amount of CPU processing, the resources required for sorting, and the statistics of the tables and indexes into account. It constructs all possible query plans with all possible join combinations using a bottom-up, breadth-first search strategy, after which it selects the query plan with the lowest estimated cost. You can influence the Informix query optimizer to an extent: Apart from specifying the statistics to be collected for each table, you can specify the preferred method for performing join operations, namely nested-loop joins or table-scan joins.
The Informix OnLine Dynamic Server is very configurable; in particular, you can change its memory utilization on the fly. The allocation of data to storage devices is also very configurable. Informix provides an Informix OnLine Dynamic Server Performance Guide to assist you through the various steps. However, Informix doesn't offer you any means to influence how the data is structured inside the tables and indexes.
An InterBase database consists of a single file per disk location (or directory) in which all of the database objects are stored. InterBase starts by filling the first location, then the second, and so on. For an InterBase database, you can specify the page size to be used for the database, the disk locations where the database files must be placed, and the size (in pages) of each file. Larger page sizes can improve the performance of larger databases, because the indexes will have shallower depths and large rows can be stored on single pages. However, for small OLTP-type databases, smaller page sizes may be more appropriate because less data is retrieved from the database each time. In addition, you can configure the "sweep interval" and the "forced database writes" per database. The sweep interval, if enabled, specifies how frequently InterBase should scan the database to remove old, unused versions of rows in the tables. (Note that InterBase uses a version-based optimistic concurrency control scheme instead of the locking-based concurrency control scheme used by most other products, which I described in my July column in DBMS.) Forced database writes ensure that changed data is immediately written out to disk -this process provides higher resilience to failures but can degrade performance.
For an InterBase table you can only define whether the data is stored in an internal InterBase table or in an external operating system file. A table in an external operating system file does not have all of the concurrency controls, journaling, and backup of an internal InterBase table, but it offers a convenient way to access legacy data or import new data into the database. The only tuning option for an InterBase index is whether it is stored in ascending or descending order.
InterBase is on the opposite side of the "tunability" scale from CA-OpenIngres. It follows a "black box" approach to database implementation and tuning. When I started working with InterBase in the late 1980s, the company seemed to say, "We (the DBMS developers) know much better than you (the DBA) how to tune your databases." In those somewhat early days of relational databases, this approach was highly acceptable. For unskilled DBAs, it was actually quite a relief. By the early 1990s, however, it was considered a limitation. Now, in the late 1990s, when "self-managing databases" are becoming popular, maybe the InterBase approach will become fashionable again, provided, of course, that InterBase's self-managed performance tuning can match the tuning performed by a skilled DBA using the other DBMSs.
Microsoft SQL Server is not very flexible with regard to the number of servers and how they interrelate, because their scheduling is controlled by the Windows NT operating system. However, Microsoft SQL Server still provides an impressive number of tunable server parameters. Some of these parameters can be changed while the server is running, but others only take effect once the server is restarted. The configurable server parameters include:
A table or index can be placed on a particular segment. A segment can span multiple data devices defined for the database. By allocating the tables and indexes to segments, you can optimally distribute the I/O to the database over a number of disks. You can specify a fill factor for each index to let the data fill up the index pages sparsely. You cannot specify the storage structure of a table or an index, but an index can be clustered with its table (in which case the index becomes the underlying storage structure for the whole table, and the table will be relocated to the segment specified for the index). A table can only have a single clustered index.
Tuning an Oracle Server instance entails setting its memory utilization, I/O activity, contention, and various other factors such as sort areas, free lists, and checkpoints. The memory utilization of the Oracle Server can be adjusted by specifying the amount of memory to allocate to the SQL parse areas, the shared pool, and the buffer cache. The I/O overhead of the Oracle Server can be reduced or parallelized by placing data files and redo files on different disks, by striping table data over different disks, and by separating tables and indexes on different disks. These actions are all done by specifying the allocations of so-called tablespaces to specific disk areas and then allocating the database objects to the appropriate tablespaces. Contention occurs when multiple processes require the same resource simultaneously. Contention can be reduced by adding dispatcher processes and/or shared server processes, by reducing the number of parallel query servers, or by adjusting the redo log buffer usage.
For each table, you can specify the percentage of free space to leave for future updates, the minimum percentage of used space to utilize in each data block, the tablespace where it should be allocated, its storage characteristics (its initial size and its growth pattern), its desired degree of parallelism, and the amount of caching to be used. For each index, you can specify the percentage of free space to leave for future updates, the tablespace where it should be allocated, its storage characteristics, and its desired degree of parallelism. Related tables can be stored together as a cluster in a pre-joined form. This configuration can result in excellent performance when the two tables are joined together often, but it's not so advantageous if the tables are frequently accessed separately. The cluster can also be stored in a hashed structure, which gives good performance at a cost of storage space when reasonably stable tables are often joined by exact-match queries.
You can select whether Oracle must optimize your queries using its older rule-based optimizer or its newer cost-based optimizer. The rule-based optimizer chooses an execution plan based on the available access paths and the ranks of these access paths in a published table. The cost-based optimizer chooses an execution plan based on the available access paths as well as on statistics in the data dictionary for the tables, clusters, and indexes. You can also add so-called "hints" (or optimization suggestions) to your queries. The goal of the cost-based optimizer is to increase throughput - that is, to reduce the resource usage necessary to process all of the accessed rows. It should be apparent by now that Oracle is also extremely tunable - so tunable, in fact, that an Oracle7 Server Tuning manual is included when you acquire Oracle!
For a SQLBase database, you can specify the database directory or, alternatively, the storage group to which the database should be allocated. A storage group is a named list of database areas, and a database area is a named area on a disk (such as a directory). In addition, you can specify the number of pages by which the database should be extended when it is full, the size of the log file, the offset of the transactions in the log file, the size by which the log file should be extended when necessary, the number of log files an active transaction may span, the frequency at which checkpoints should be taken, and the time operations may take to complete.
For each table, you can only specify the percentage of free space that should be left on each table page when it is first filled. The tables are automatically allocated to all of the storage areas where the database is implemented. For each index, you can specify whether it has a B-tree structure (default) or a hashed structure (explicitly specified). Note that you cannot update the key values of a hash structured index. You can specify whether the indexed values are stored in ascending or descending order, the percentage of free space to be left in the index when it is filled initially, and the expected logical or physical size of the index. Setting the index size too low can result in numerous overflow pages, which in turn can affect performance adversely.
SQLBase gives you a number of configuration options for the DBMS server, especially for individual databases. For individual tables and indexes, however, the choices are rather limited. The choice between B-tree and hash structures for indexes may be sufficient, but the default B-tree structures for tables seem too restricted. The main drawback for me is that I cannot allocate a table and its indexes to different database storage areas - often you want to enforce that a table and its indexes must be stored on different physical disks to allow parallel disk I/O.
Various shared resources can be tuned for the Sybase SQL Server, such as memory, locks, and CPUs. Through the Logical Memory Manager, you can configure a number of named caches, which are used for data and stored procedures. Each cache can be configured with several buffer pools of different sizes and with a least-recently used or a fetch-and-discard most-recently used paging strategy. This affects which data is cached and for how long. You can then bind specific tables and indexes to specific named caches. The indexes of a table can be bound to a different cache than the table itself. In addition, each user gets a private log cache so that the threads do not have to wait for each other for buffered access to the transaction log file. You can also specify the lock promotion threshold for the server, which specifies when page-level locks are escalated to table-level locks. Table-level locks are more efficient than page-level locks, but they obviously result in a much coarser exclusion granularity.
By using these features, you can configure Sybase SQL Server for mixed workload environments - for example, to run batch, OLTP, and DSS applications. Sybase SQL Server reads its setup from a configuration file. You can change or alternate the server's characteristics easily by using a different configuration file. Most of the configuration options described here for the Sybase SQL Server can also be specified per session, per individual query, or per database object.
A Sybase database can be spread over multiple physical storage devices. Each database has its own transaction log that can be configured with its own specific characteristics. The log should be placed on a separate device from the normal data tables and indexes, not just for improved performance but also for improved reliability.
When a Sybase table is created without indexes, it is stored as a heap structure. You can create one clustered index and many unclustered indexes per table. When you create a clustered index for a table, the table's data is physically stored in the order of the keys of the index. Sybase only supports B-tree structures for unclustered indexes. For each index, you can specify the fill factor and the number of rows it must store per page. You can spread individual tables and indexes over different physical storage devices - in fact, you can place the index pages for an unclustered table on separate physical devices from the data itself. You can also partition a table, which creates multiple "page chains" for the table. Partitioning reduces the access to the last page of the table and allows parallel I/O for bulk table operations.
Sybase has a cost-based query optimizer. It parses each SQL query and determines an optimal query execution plan according to its cache strategies, cache sizes, I/O block sizes, the available indexes, and the statistics it keeps on the table and its indexes. The query plan is a sequence of steps required to perform the query, such as table scans, index accesses, and temporary worktables. You can influence the query plan by specifying the order in which tables are joined, the number of tables that must be evaluated simultaneously in a join, the indexes that must be used to access a table, the I/O block size, and the caching strategy. However, these steps should only be performed by a skilled DBA, because inappropriate use of these techniques can result in very poor performance.
Of the DBMSs reviewed in this month's column, Sybase SQL Server has one of the most tunable servers - its highly configurable memory utilization is extremely powerful. Its server can be tuned to scale well to large systems and to exploit large memory configurations, SMP architectures, MPP systems, and mixed workload environments processed by a single server process per CPU. The extensive Sybase SQL Server Performance and Tuning Guide leads you through all of these options. On a table level, Sybase is tunable with respect to data and index placement, but not (as one would expect) with respect to the storage structures used for tables and indexes.
Remember that the most important factor in achieving good performance is effective, efficient database and application design. No matter how much tuning you apply to the DBMS, it cannot compensate for a bad design. Database performance specialists claim that as much as 85 percent of a database application's performance requirements must be addressed through proper database and application design and implementation.
The task of a DBA, therefore, is to fulfill the users' performance requirements with an efficient design and the correct use of the database tuning tools and techniques. But this is more difficult than it sounds - you've seen the widely diverse tuning facilities provided by the various DBMS products. A good DBA or performance consultant must know how to use the particular product's tuning tools and techniques to eliminate so-called "show stoppers" and ensure acceptable performance over the long term. In this knowledge - which tool or technique to use for each particular problem, without adversely affecting other performance settings - the art of tuning lives on. And we will still see the performance consultant zoom from one problem installation to another (and stopping at a few ski resorts en route) for quite a while.