DBMS, January 1996
DBMS Online: SQL for Smarties By Joe Celko

A Little R & R

Joe Spends His Honeymoon Contemplating the SQL Universe

The SQL Explorer has been traveling a little too much lately and would love to stop and rest. I got married on Labor Day weekend in September, and immediately made my new wife very unhappy with me. My travel schedule is keeping me on the road during the week, and when I am home on the weekends she is at work, pulling two 12-hour shifts.

Immediately after the wedding, I had to go to Macon, Georgia for two weeks to look at a data warehousing project. Macon has no good hotels, and the town is the buckle of the Baptist Bible Belt. Hooters is a family restaurant, complete with crying babies. I got so desperate I even looked up "Escort Services" in the local yellow pages, but the only listing was a company that escorts extra-wide trucks on the interstate highways.

Software Development East '95

When I got back from Macon, my wife and I immediately went to Washington, D.C. I was speaking at Software Development East (October 3-5, 1995) and she wanted to see her father for the first time in more than 10 years. Her father was supposed to give us a ride from Dulles International Airport, but this didn't happen, so I missed my book signing at the Computer Literacy Bookstore at the show. (However, even though I wasn't there, the bookstore sold more than 40 copies of my book, SQL for Smarties.) Software Development is one of the last of the hardcore dweeb conferences, complete with "real" programmers and techies from all over the world. They talk shop, not management.

At the end of the trip, we connected up with my father-in-law and he managed to miss the Dulles International Airport turnoff three times when he gave us a ride back. This was our honeymoon.

DCI Data Warehousing Conference

When we got home from D.C., I had to take care of my CompuServe mail--during the month of October I ran a study group in the DBMS forum, based on my book. Then I got one day at the office to clear up 28 voice mails before heading out to Phoenix for three days for the DCI Data Warehousing Conference. The conference had approximately 900 attendees, more than double last year's attendance.

The META Group announced at the conference that it is involved with the newly formed Metadata Council. This is a vendor consortium composed of Arbor Software, Business Objects, Cognos, Evolutionary Technology Inc., Platinum Technologies, and Texas Instruments Software, with the purpose of standardizing the exchange of metadata with a common API and metadata model. The Metadata Council will oversee the Metadata Coalition, which will be made up of vendors and end users. The Council's membership will rotate. If you are interested in membership, you can contact either Dr. Katherine Hammer (Evolutionary Technologies Inc., 512-327-6994) or Patricia Nghiem (Business Objects, 408-973-9300).

I got back on Friday, October 13, and spoke at the Southeastern Ingres Users Group meeting. Of course, I had picked up a lung infection in Phoenix and was running a fever. The next week I spoke to the Atlanta chapter of Association for Systems Management. I am the darling of local users group programs, known for my generous nature and helpful support of all trade associations. Oh, did I mention that I am shamelessly promoting my two books (SQL for Smarties and Instant SQL) at every user group or trade show I can find?

Shrink-Wrapped Middleware

After I got back from the Data Warehousing Conference, I had enough time to wash my clothes, read my mail, and repack my bags for another trip. This time it was back to Minneapolis to do some follow up on Ensodex (Shoreview, Minn., 612-638-0721, http://www.ensodex.com); see my column, "The World of Software," June 1995) and its product. John Paulson and Harold Slawik picked me up at the airport and took me over to their downtown offices. Since my last visit, the name of the product has become HotSockets, and it is now ready to go to market.

Ensodex provides a set of drivers and middleware that lets you connect from any client to any server via ODBC. The client can be either Unix, Windows NT, Windows 95, or any product that supports an ODBC driver. While we were waiting for the graphic artist to show up with the final packaging from the printer, I got to play with the product using a PC-based copy of Microsoft Query. The connection was from the PC to an Oracle database running on a Sun machine at the Ensodex development office. I connected to the database (located somewhere in New York state) via IBMnet and the Internet, and then back to Minneapolis -- not over local telephone lines. The response time was as good as I normally get from my admittedly old machine with a local database at home. The speed is limited by the modems, the quality of the lines, and the I/O time at the front end.

The bottom line is that Internet ODBC is now real and available at commodity prices. This means that it is practical to use the Internet as your company network. When we got back to the development office, I broke open a shrink-wrapped package (containing diskettes and a manual) and performed an installation first on Windows NT and then on Unix. I encountered no problems, other than some forgotten Unix commands. You simply call a set-up program, then go to a GUI form screen to configure your data sources. HotSockets works with Intersolv, Visigenics, Windows NT, and Windows 95 ODBC drivers.

Also, HotSockets has a block-fetch feature that you can turn on and off during a session. The typical mode is to pass one result row at a time over the wire. However, with a block fetch, the result set is sent in large blocks. This is a faster approach, because it greatly reduces network traffic. The trade-off is that you cannot use SQLGetData(), SQLSetPos(), or positioned UPDATEs and DELETEs. In my June column, I mentioned that HotSockets was running at 200 rows per second. With block fetch, the same data (25,000 + rows of 70 characters) ran at better than 1300 rows per second.

The design philosophy is "save the server"-- that is, if one client fails, the server remains running for the other clients. There are also error and communication logs, so you have some idea of what happened in the event of a failure.

One unexpected feature of the product is that you can use it as a gateway by connecting client A on machine A to a HotSockets driver on machine B, which connects to a HotSockets Driver on machine B that, in turn, connects to machine C. (Imagine extension cords strung together.) The user of machine A is aware only that the machine is running a little slower than it should. The user does not see any of the connections until an error message appears that shows the chain of connections between the client and the database on the back end.

I want to see what other people do with this product. It needs a security package for heavy commercial applications on the Internet, but these types of packages are available off the shelf. Data warehousing applications can get online access to production databases, and replication services are suddenly easy to set up using script languages that can handle ODBC. Right now, Ensodex is working on a data source name service that will provide clients with the names of their ODBC data sources on the Internet. Ideally, a company would distribute a diskette with HotSockets, and the installation would create a file of all the sources to which the users have access.

ANSI X3H2 Meeting in Jersey

I left Minneapolis for Long Branch, N.J. and another ANSI X3H2 Database Standards Committee meeting (October 23-26). Luckily, the Hampton Inn in Minneapolis had guest washing machines.

Long Branch is a town on the Jersey shore that has one really nice resort hotel, where we stayed, an excellent crab house restaurant, and nothing else. The town consists of houses and businesses that are boarded up and covered with For Sale signs. I was told that the beach area burned down in a storm a few years ago and was never built back because of political problems.

The X3H2 Committee is holding at 26 members (Borland dropped out and UniSQL returned in October). This meeting was only two days long, thanks to a short agenda.

Our beloved chairman, Don Deutsch, chided us for not making better progress on advancing SQL3. He stated his fear that it could become a moot point as the market overtakes the standards committees and provides de facto standards for object-oriented databases and the other areas we are trying to include in SQL3.

There is good evidence of this trend. The ANSI Z39 Standards Committee has been invited to speak to us about its Z39.50 text-retrieval language and other textbase problems. The Open Geographical Information Systems (OGIS) vendor consortium has overlapping membership with the ANSI X3L1 Committee, and it might define its own GIS query language shortly. The X/Open consortium is continuing work on its Call Level Interface (CLI), even as ANSI and ISO advance the SQL/CLI to international status (by the time you read this column in January, ISO could have already approved it).

ANSI X3H2 makes good use of email to pass around documents and meeting information, but this is not the case in ISO. Instead, it makes a lot of money selling hard-copy standards documents, so you can understand its lack of desire to publish these documents on the Web. However, this fear extends to internal committee use, too. The July 1995 meeting of ISO/IEC JTC 1/SC24 approved a trial of electronic documents for the period of September 1995 to May 1996. The U.K. is against electronic documents, while Germany and Italy want to study the use of diskettes and "sneaker net" as a first step.

The one interesting new idea I got at the meeting was "logical locking"; that is, instead of putting physical locks on rows or pages of storage, you route all database requests to a referee who decides what you are allowed to update. Then, if one user wants to update all the rows where (age>21), and another user wants to update all the rows where (hair = 'Blonde'), the referee must determine if there is any overlap in the result sets, and exclude one or both transactions from the database. This is another way to free the logical model from the physical model.

Sybase Makes Moves

Phil Shaw announced in an October email broadcast that he has changed his affiliation to Sybase (Emeryville, Calif.). Phil formerly worked for IBM as its point man on ANSI X3H2, and then moved over to do the same job at Oracle when IBM "downsized." Phil is one of the oldest and most technically competent members of X3H2. Sybase has already hired two former DEC representatives as well as Chairman Don Deutsch.

I find Sybase's move to buy up X3H2 people interesting because Sybase is at a critical point. On the technical side, its SQL Server product has been noted for poor scalability, bugs, and a non-standard, non-portable pseudo-SQL language. On the financial side, the company lost $17.4 million the first quarter of 1995, and fell short of its 1994 earnings the rest of year.

To solve some of the technical problems, Sybase is pushing its new SQL Server 11 and a bitmap indexing method called Sybase IQ, and promises to ship products that support data warehousing. (See Client/Server Connection.) Sybase IQ is based on the Expressway technology, which Sybase acquired in early 1995. Other improvements should bring the company up to the performance levels that Informix and Oracle have already achieved in the last few years.

Sybase has also added better technical people. The purchase of Powersoft (and hence Watcom) gave the company some of the best compiler writers and SQL designers in the trade. Sybase is also buying up more individual X3H2 members than Computer Associates did when it was acquiring entire companies.

However, I am worried by Sybase's technical move to rename the next version of Watcom SQL (which should be version 5.0) to Sybase SQL Anywhere. This product will include a switch to run the old T-SQL language. Wouldn't the time have been better spent writing conversion tools to bring the victims of the old Sybase products up to current standards? How much do you have to cripple the perfectly good Watcom SQL engine to include mutant behavior? This is as dumb as producing an object-oriented autocoder to enable you to maintain backward compatibility.

If Sybase cannot get its act together, I will have to agree with Donald Feinberg of the Gartner Group, who said recently, "On a scale of one to 10, 10 being the ultimate RDBMS, Sybase needs a 20 here" (Computerworld, October 16, 1995, page 1). However, I think that right now Sybase could beat out or hold its own against Microsoft in this market niche. Microsoft suffers from being locked into NT as its only operating system at a time when more shops are looking at Unix systems. Microsoft's desktop database products, Access and FoxPro, are not compatible with SQL Server or with each other, they are not near any standard, and they are not designed for client/server applications. A reasonably priced suite of integrated tools from one vendor that can operate on multiple platforms would have a place here.

This Month's Puzzle

Mark Fronteraat of LanSoft Inc. (Miami) posted this month's problem in the CompuServe Watcom forum in September 1995. He has budgeting information that consists of the following three tables: the items to be paid for, the estimated amounts to be spent on them, and the actual amounts spent on them. Some items are covered by more than one check, and sometimes one check covers several items.

Items

item_nodescription
=====================
10'Item 10'
20 'Item 20'
30'Item 30'
40'Item 40'
50'Item 50'

Actuals

item_no actual_amountcheck_no
=================================
10 300.00'1111'
20 325.00'2222'
20 100.00'3333'
30 525.00'1111'

Estimates

item_no estimated_amount
=========================
10 300.00
10 50.00
20 325.00
20 110.00
40 25.00

Your problem is to get this output from a single query :

Results

item_nodescriptiontot_act tot_estcheck_no
===================================================
10'item 10'300.00 350.00 ' 1111'
20'Item 20'425.00 435.00 'Mixed'
30'Item 30'525.00 NULL '1111'
40'Item 40'525.00 NULL NULL

Item 50 from the Items table is not to be shown because there was no record for it in either the Actuals or the Estimates table. The column tot_act is the total of actual amounts for that item; the column tot_est is the total of estimated amounts for that item. (see Puzzle answer.)


Joe Celko is an Atlanta-based guru with Northern Lights Software Ltd. and a member of the ANSI X3H2 Database Standards Committee. He is also the author of two books on SQL: SQL For Smarties (Morgan-Kaufmann, 1995) and Instant SQL Programming (Wrox Press, 1995). You can contact Joe via CompuServe at 71062,1056 or via email at 71062.1056@compuserve.com.

PUZZLE ANSWER

I think that this schema needs some work, but you can do this with scalar subqueries and some tricky code.

SELECT item_no, description,
(SELECT SUM (actual_amount)
FROM Actuals
WHERE Items.item_no = Actuals.item_no) AS tot_act,
(SELECT SUM (estimated_amount)
FROM Estimates
WHERE Items.item_no = Estimates.item_no) AS tot_est,
(SELECT CASE
WHEN COUNT (*) = 1
THEN MAX (check_no)
ELSE 'Mixed' END
FROM Actuals
WHERE Items.item_no = Actuals.item_no
GROUP BY item_no) AS check_no
FROM Items
WHERE tot_act IS NOT NULL
OR tot_est IS NOT NULL;

The trick is in the scalar subqueries. The first two subqueries calculate the total actual amounts and the total estimated amounts as if they were part of a GROUP BY and LEFT OUTER JOIN. I will offer a prize for a solution that uses a GROUP BY and LEFT OUTER JOIN and is still only one query (no VIEWs).

The final subquery is trickier. The query finds all of the Actuals that are associated with the item under consideration in the result table and makes a group from them. If the group is empty (no checks issued), the subquery returns a single NULL, and you display that NULL. If the group has one check in it, the CASE expression will return that single check number. The MAX() function is a safety check to guarantee that you have a scalar result from the subquery and you might not need it in all SQL-92 implementations. If there is more than one check actually issued on the item, then the COUNT(*) is greater than one and you get the string 'Mixed' instead of a string that represents the unique check number.

--Joe Celko


Subscribe to DBMS and Internet Systems -- It's free for qualified readers in the United States
January 1996 Table of Contents | Other Contents | Article Index | Search | Site Index | Home

DBMS and Internet Systems (http://www.dbmsmag.com)
Copyright © 1996 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated Sunday, December 1, 1996