
In my previous mini-series I investigated the mechanisms you can use to ensure that you have only correct data in your databases. Correct data is not of much use, however, if you cannot get to it in finite time. Your databases will only be useful if your users can get to the data quickly and easily. In this month's column I start a new mini-series focusing on performance. Performance management can be divided into three related disciplines:
performance estimation, performance measurement, and performance improvement. I will cover only the technical topics of performance measurement ("monitoring") and performance improvement ("tuning") and leave the highly theoretical topic of performance estimation for the academic journals. This month I investigate performance monitoring in general, including the performance monitoring facilities provided by five popular DBMSs. Next month I will look at some of the mechanisms available to improve performance.
Most databases - no matter how well-designed - are prone to bad performance at one stage or another. This happens when, as the saying goes, "entropy sets in." Entropy, a well-known concept encountered in physics, is described as the tendency toward chaos. Sometimes in database applications entropy doesn't "set" in - it actually "rushes" in. An example of rushing occurs when you keep inserting linear key values into a non-self-balancing tree structure: One branch of the tree keeps growing while the other branches stay the same or even shrink, resulting in a skewed tree structure and terrible performance. Other examples of rushing occur when hashed or clustered indexes get long overflow chains that increase locking contention; when there is an unplanned change in the transaction rates and volumes; or when there are changes to the information stored, large volumes of new information, new access paths to the stored data, or changes in the typical usage patterns.
Several other factors besides entropy can influence a database's performance significantly. These factors include how and where the DBMS is installed, how the DBMS server is configured, how and where the logging and locking functions are performed, and - most important - the designs of the databases and the database applications.
Although I discuss monitoring in the context of performance management, it is important to note that monitoring is also a good problem-avoidance mechanism. For example, by constantly monitoring disk usage, a wide-awake DBA can avoid the potential problems caused by a full disk by moving some data to an emptier disk before the disk fills completely.
IPM is used to monitor the DBMS server processes and the locking and logging systems. For each server process, IPM shows the user sessions making use of that process, including what queries they are performing, which facility in the DBMS the queries are being processed by, and other related transaction details. For each user, IPM shows how many locks are being used, which tables and resources are locked, and the locks for which the user is waiting. For the logging system, IPM shows how often the log file is being accessed, the transactions for which it is being used, and how it is drained by the journaling system.
Trace points are variables that can be set to print tracing information to the screen or to a log file. Two of the most useful trace points are "set qep" and "set qe90." The first trace point gives you the query execution plan (QEP) selected by the optimizer for each query, while the second compares the actual execution statistics with the estimated statistics on the query execution plan. There are literally hundreds of other trace points that can be used for performance monitoring. Some of these can even influence the way the statistics-based query optimizer makes its decisions.
The QEP used to analyze the performance of the following query is shown in Figure 1:
IMA provides a programmable way to monitor and manage CA-OpenIngres installations. With IMA you can use any SQL- or ODBC-based tool to view or update the monitoring information about the CA-OpenIngres servers in your installation. For example, from a normal application program, you can query which sessions are holding locks required by other sessions. You can monitor specific database metrics and depict the results as a graph to understand specific performance trends.
The InterBase Interactive SQL Interface has various useful settings for performance monitoring. The "set stats on" setting displays the server's memory utilization, the query's elapsed time, the buffer utilization, and the number of physical disk read and write operations. The "set plan on" setting displays how queries are physically executed, specifically showing index utilization. Listing 1 shows a query and its query plan.
Various objects (or so-called instances of objects) can be monitored. Information from various machines in the network can even be displayed simultaneously in one monitor session. This is shown in Figure 2, where the legend depicts which objects are being monitored.
DBAs can also use the SQL Enterprise Manager to monitor and manage any number of SQL Server installations on the network. Among other things, SQL Enterprise Manager can report the size of any database or device on the network. It can also schedule monitoring, management, or tuning tasks via the SQL Executive. For example, the Current Activity option in the SQL Enterprise Manager produces the screen shown in Figure 3. This screen displays a quick overview of the current activity in the system and can identify a problem. In this case, the user "Cor" is being blocked by the user "sa" on the dbo.CargoItems resource. You can inspect a particular job by double-clicking on it. You can use the options on the Current Activity Screen to view the same information from other angles; for example, the Detail Activity option gives the activity per process (thread) in the server in a tabular format.
Developers and DBAs can use the query analysis tools available in ISQL/W, the GUI-interactive SQL environment provided as part of the client SQL Server installation. This lets users test the syntax, semantics, and performance of their SQL queries. This test is performed by activating the Query Plan and Statistics I/O options before running the query. You can also choose to optimize the query without running it, which will let you inspect a large query plan without actually running it.
Dynamic performance views are a set of underlying "tables" maintained by the Oracle server. These tables are continuously updated while a database is open and in use; their contents relate primarily to performance. To access these tables, the DBA must log on with the user name "SYS." Although they appear to be regular database tables, these tables may never be updated or altered; they may only be queried using SQL select statements. They present information such as the objects in the database that are currently locked and the sessions that are accessing them, a list of all the data files making up the database, and a list of all the database objects cached in the library cache.
The SQL trace facility generates performance statistics on the SQL statements executed by the Oracle server. These statistics include parse, execute, and fetch counts, CPU and elapsed times, physical and logical reads, and the number of rows processed. When the SQL trace facility is enabled, the performance statistics for all of the SQL statements executed in a user session or against the instance are placed into a trace file.
The explain plan command displays the execution plan chosen by the Oracle server's optimizer for SQL select, update, insert, and delete statements. The execution plan depicts the sequence of operations on the database objects that Oracle performs. By examining execution plans, you can investigate index utilization. However, before you can issue the explain plan statement, you must create a table of a specific structure to hold its output. The execution plan is then inserted into the table when a corresponding SQL statement is executed. Although the execution plan is not shown graphically, the table format is useful for extracting and summarizing performance data using SQL.
SQLBase also has a large number of so-called "environment options" that can be activated or deactivated. Some of these can be useful for performance monitoring; for example, the planonly environment option can be used to display the execution strategy followed by SQLBase in an easy-to-read table format.
LISTING 1
--A query and its query plan generated by the InterBase Interactive SQL Interface.
Table of Contents - September 1996 | Home Page