DBMS, February 1998
DBMS Online: Server Side By Martin Rennhackkamp

Managing Dynamic Queries

The Teleran System helps keep runaway queries at bay.


The following scene should be familiar ı especially if youıre the DBA of a large organization. There are just over 400 users on the system, response times are subsecond, I/O is well balanced across the disks, CPU usage is at 27 percent, and memory utilization is steady at 55 percent. The pace has slowed down a bit; after all, it is 3:30 on a Friday afternoon and the users are thinking of the weekendıs activities. Your discussion with the new apprentice DBA has turned to the nightıs hockey game, the TV schedule, and his date with the new neighbor across the hall. The system is purring as smoothly as a Rolls Royce. Heck, they donıt call you "The Doc" for nothing.

Then suddenly it happens. Three system alarms go off in unison, CPU usage climbs to a full 100 percent, memory utilization shoots through the ceiling, the system starts swapping, and the disks creak like a rundown bus screaming around the corners of a Caribbean island mountain pass. Both your phones are screeching when the system manager storms into your office. No "Hi!," no "Howıs the system?," just, "The users are up in arms ı the data warehouse system is down!" Pandemonium has struck. "You better grab all your tools, Doc, they need information for Mondayıs board meeting. Besides, Iıve got tickets for the early show." The apprentice DBAıs eyes are as big as saucers.

In an instant your daydream has turned into a DBAıs worst nightmare. The cause ı you recognize it as it happens ı is "the query from Hell." Fighting the system to get resources, you fire up the DBMS monitor. After an hour of slow response times and irate users on the line, you detect that young Jason Lawrence, the new account manager in the client-care section, has issued a Cartesian product query over the two biggest tables in the data warehouse. You run to the next-door office, call Jason, and tell him to abort his query ı which he does ı but the DBMS is still thrashing as it rolls back. You have two options: You can kill the server and wait for it to recover when it starts up again, or you can let it ride out the storm. Experience tells you to go for the second option, but the system manager is at you again: "Do something, man!"

Friday evening at 8:30 p.m., the system has recovered and response is back to normal. The system manager has long left for the early show, and the apprentice DBA has accepted the fact that he had to cancel his date. He asks innocently, "It seems some people just donıt think about their queries. But couldnıt we have stopped Lawrenceıs query before it started? When we used OpenIngres in college, we had to say ıgrant query_io_limit x to username,ı especially when we were experimenting with awkward queries on large tables." "Yeah," you sigh, "They donıt know about queries. They know what data they want but not how to formulate it. And then they complain if it takes too long. If only this brain-dead DBMS we use for our data warehouse had such query management facilities. Very few DBMSs have such controls."

Some DBMSs give you a little control over runaway queries. However, these are only based on actual measurements while the query is running (which is too late) or at best, some estimations made by the query optimizer. Wouldnıt it be useful if the optimizer could be more intelligent, if it could apply some knowledge when analyzing the queries it has to process? It knows all about the table structures and indexes ı it could even give feedback to the user. It must do this, however, without impacting the query analysis and processing throughput.

The Teleran System

Enter Teleran Technologies L.P., a privately owned company based in Roseland, New Jersey. The company has developed an adaptive query management system called the Teleran System. The Teleran System provides an intelligent framework for monitoring and optimizing database queries, which is particularly useful in a data warehouse environment. Its main functions are to monitor the queries accessing the database, schedule and control the queriesı execution, suggest better queries, and in general, control the resource utilization of the queries in real time.

The Teleran System is deployed as a middle-tier pass-through server between the client application and the back-end database server, as illustrated in Figure 1. To the client application, it poses as a database server. You configure your query and reporting package (or your in-house application) to connect to the Teleran server as if it were the database server. The Teleran server accepts ODBC and native Sybase connections and queries. The Teleran server itself is the client of the database server. It connects to the database server, submits the queries, and receives the results from the database. The Teleran server, in turn, returns the results of the queries to the client application exactly as a database server would do.

Using a three-tier architecture is not only fashionable and in line with current network architectures; in this context it serves a useful purpose. By using a middleware architecture, the query management system does not impact the back-end server or the client workstation. If the query management system is placed on the back-end server, it can influence the system whose performance it is trying to monitor. If it is placed on the client workstations, it can add overhead to the clients, it can be difficult to transfer the measured data to a central location, and installation on a few thousand client platforms can be a logistical nightmare.

The Teleran System employs an administration database in which it records details about the queries, the structures and contents of the target databases, and statistics about the result sets returned to the client applications. It stores the database structure definitions in a metadata format independent of the target databaseıs system catalogs. It also uses the administration database to store the rules of its query management expert system. The rules are stored in a textual format in relational tables. The administration database can be located on the same hardware platform as the Teleran server, or it can be located on another platform, such as one of the target database servers, or on its own back-end database server. The Teleran System has an administration application for setting up the administration database, managing the query control rules, and obtaining various statistics on query throughput and resource utilization.

The entire Teleran System is a Java-based server application. As such, it runs on any system with the Java Virtual Machine, with Windows NT and Unix (Sun, HP, DEC) listed as presently available. It can control queries issued to IBMıs DB2 5.0 for MVS; Oracle7 version 7.3 and Oracle8; Sybase 10, 11, and 11.5; and Sybase IQ 11.2 on these platforms, with support for Microsoft SQL Server 6.5 and 7, and Informix-OnLine XPS expected next. It can use any of these databases for its administration database, too. For example, with Teleranıs strong partnership with Sybase, users can control the usage of data warehouses implemented with Sybase IQ 11.2 in the context of Sybaseıs Adaptive Component Architecture. (I reviewed Sybase IQ in my August 1997 column, and I described the Sybase Adaptive Component Architecture briefly in my January 1998 column.)

Note that the Teleran Systemıs architecture supports multiple connections to different back-end databases. For example, you can use one Teleran server to manage queries against a Microsoft SQL Server database on an NT machine and a Sybase IQ database on a Unix platform.

Functions

The Teleran System performs query analysis and control functions. At start-up, it reads the database structures, rules, and statistics from the administration database into memory. During runtime, it unobtrusively monitors the query stream from the applications to the database server, and it intercepts high cost, nuisance, and insensible queries. When it receives a query from a client application, it parses the query, checks the database structures, rules, and statistics; analyzes the queryıs impact; and stores the query in the administration database before passing on the query to the back-end DBMS. It also monitors the data stream from the database back to the application. Upon receiving a result set from the database, it extracts the sizing details and stores them with the query in the administration database before passing the data back to the application.

You may wonder what the impact of the query analysis and control process is. The Teleran Systemıs query analyzer takes one to two seconds to analyze, store, and control a complex query. A complex query in Teleran terms is an extremely complex query ı with extensive joins, unions, where clauses, and particularly nested queries with exists, not exists, in, not in, and similar clauses. In a large data warehouse environment these queries typically take several minutes, if not hours, to execute. Thus the one- or two-second delay is negligible, especially if you consider the control and information you can gain from the process.

The two key components of the Teleran System are its real-time query governor and its adaptive rules engine. The query governor analyzes queries as it receives them in real time. Based on the estimated cost and resource consumption, it can pass a query to the database server for execution, terminate its submission immediately, or submit it at a time when the systemıs resource utilization is low. By performing the cost analysis proactively, the system can terminate or give warning of a harmful query before it is executed. This is very useful for avoiding runaway queries, as I sketched at the beginning of the column.

The self-adaptive rules engine of the Teleran System uses artificial intelligence techniques to regenerate the queries submitted by the users. It applies various rules to transform the usersı queries to an optimal SQL form. The administrator can specify query rules to control resource usage, specific user access, and query scheduling. During installation, and periodically after that, the system analyzes the database structure to obtain details about the tables, indexes, the data content, and its value distributions. An administrator can instruct the Teleran System to refresh its database definition information. From a history it keeps of previous requests and database usage patterns, the system can adjust its own adaptive query transformation rules automatically. This is quite innovative because most expert systems require user intervention, or in some cases even program changes, to modify the rulesı behavior patterns. This dynamic adjustment is very applicable in data warehouse environments, where the usersı data access requirements typically change over time and where some users want to investigate long-term trends, while others want to drill down for details. It also allows each system to evolve its rules according to its own site-specific data access patterns. The rules are stored in the administration database in an easily understood textual format. Administrators can verify and change the rules through a graphical user interface, as illustrated in Figure 2.

Based on its rules, the system analyzes all the usersı queries. Inefficient queries are returned to the user with instructions on how to write a better query. The instruction messages are configurable in the administration database. They are returned to the client application in the same format as a database error or warning message. This can teach unskilled power users how to compose complex but efficient queries.

A resourceful administrator can adapt the rules to perform security control functions. Administrators can prohibit users from viewing certain data based on their user identifications, group memberships, time of the day, or even on the expected number of rows returned by the query. For example, they can apply statistical database access controls to protect the privacy of individuals, or they can make it impossible to perform statistical deductions that could reveal the identity of individuals or groups. The administration tool ships with a so-called blocking wizard, which helps administrators set up the necessary rules.

From the history of requests, the Teleran System can give feedback on the database design. It can suggest indexes that should be created, columns that should be summarized, and derived calculations that should be stored as columns. All these suggestions can reduce resource usage considerably. For a novice DBA, or even for a skilled but busy DBA, these suggestions can be an invaluable source of information. The statistics can also indicate areas of the data warehouse that are seldom used and areas that are never used at all. With the threat of an ever-growing data warehouse, this particular information is useful to indicate parts of the data warehouse that can be archived or stored on cheaper devices with slower access times.

The system captures the details and resource utilization of the queries submitted and executed by the various users. It provides a set of reporting and analysis tools through which this information can be analyzed. The administration utility ships with a starter set of Crystal Reports on tables and columns accessed, database access patterns, and resource utilization. It also has a published set of statistics tables that you can access with any query and reporting tool to extract your own statistics. The available statistics are useful for resource utilization analysis, capacity planning, detecting usage patterns, and identifying and preventing potential bottlenecks.

You can use the resource utilization figures to drive a charge-back system off the Teleran System. The administrator can assign unit costs according to various criteria, such as rows accessed, CPU time consumed, elapsed time, and number of database accesses. The usage figures can be summarized per user, per group, or per scheduled time. The administrator can set up the groupings and schedules, and the groups can be copied from the users and groups in the target database. This information is useful if you want to sell your data warehouse services or if your IT services are treated like billable cost centers, as is the case in many organizations. With the available resource utilization information, the costs of using the data warehouse are no longer hidden in the global data warehouse management costs. The data warehouse resource utilization figures can, for example, be used to justify additional data warehouse extensions, additional hardware for the data warehouse, or proposed changes to the data warehouse structures.

Dead in its Tracks

The Teleran System is an unmatched tool for dynamic query management. It is useful in both the data warehouse and data mart environments. It provides DBAs and system and application administrators with analysis information, structural recommendations, and proactive query controls previously only dreamt of. Its innovative adaptive rules system adjusts itself to changing structures and dynamic usage patterns. Its state-of-the-art implementation is scalable and portable. It even teaches users to write better queries.

You can use the Teleran System to "audit" the queries accessing your database ı not only to give feedback to your DBAs but also to your end users and application developers. If I were an application developer, especially if my organization were developing packages for the open market, I would seriously think about using the Teleran System in-house for development and testing. You can detect and fix inefficient database structures, especially inefficient queries, before shipping the package to the masses. This would ensure that your package is running efficiently and it would save you from the embarrassment of having your clientsı users telling you that your package is generating lousy, resource-squandering queries.

So, a few beers later, the apprentice DBA chirps: "Heck Doc, this Teleran thing you want wouldıve stopped old Jasonıs query dead in its tracks!" You wanted to correct him that it is "Mr. Lawrence" to him, but he continues: "Doc, do you think the Teleran Systemıs rule engine can read the text of Joe Celkoıs SQL for Smarties?"



Figure 1.
The Teleran System is deployed as a middle-tier pass-through between the client application and the back-end database server.


Figure 2.
Administrators can verify and change the rules through a graphical user interface.


Vendor Information

Martin Rennhackkamp is the owner and principal consultant of The Data Base Approach, a corporation specializing in relational and distributed databases, based in Cape Town, South Africa. You can reach Martin via the Internet at mr@dba.co.za.
What did you think of this article? Send a letter to the editor.


Subscribe to DBMS -- It's free for qualified readers in the United States
February 1998 Table of Contents | Other Contents | Article Index | Search | Site Index | Home

DBMS(http://www.dbmsmag.com)
Copyright © 1998 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated January 12, 1998