Many readers have asked where I get my brilliant ideas for these columns. Well, I steal them, usually from smart people via email and the Web. For example, in a discussion of cursors and procedural code on CompuServe, I ventured the opinion that some tasks are procedural by nature and that you should use a cursor or stored procedure code to solve them. This is normally heresy for me because there is a formal proof that roughly says that anything you can write in a procedural language can be written declaratively. I do love holding the SQL banner high.
So the question is when to "go procedural" (as opposed to "go postal") with a problem. I would say that you need procedural code when:
There are probably other conditions, but I cannot think of them right now. If you can think of one, let me know.
The traveling salesman problem is probably the best known of my short list. Given a planar graph of cities, you must find the shortest route that makes a complete circuit through all the cities. Usually you get a set of city names and their x and y coordinates, so you can use a simple distance formula.
Knapsack problems are also called bin packing problems. You are given a container (the knapsack) and a set of items to put in it. The items have a value and weight that are not always related; for example, a brick is very heavy, but not much use on a camping trick; a match is very light and very valuable to a camper. The goal is to pack the knapsack to its full capacity with the most valuable subset of items.
The stable marriages problem is harder to explain. You have a set of n men and a set of n women. All the men have a preference scale for all the women that orders them from 1 to n without gaps or ties. The women have the same ranking system for the men.
The goal is to pair off the men and women into n marriages such that there is no pair in your final arrangement where Mr. X and Ms. Y are matched to each other when they both would rather be matched to someone else.
Consider my own stable marriage with Jackie. She wants to be married to Alec Baldwin, but he is happy with Kim Basinger. I want to marry Rachel Hunter, but she is happy with that aging rock star of hers. Therefore, Jackie and I have a stable marriage. (Actually, she says she will never leave me for another man, but if her lottery tickets pay off big, I'm history; being a mathematician, I can live with that.)
Remember that a stable marriage is not always a happy marriage. In fact, in this problem, while there is always at least one arrangement of stable marriages in any set, you most often find many different pairings that produce a set of stable marriages. Each set of marriages will tend to maximize either the happiness of the men or the women.
Let's show a small example in SQL with four couples:
CREATE TABLE Husbands
(man CHAR(5) NOT NULL,
woman CHAR(5) NOT NULL,
ranking INTEGER NOT NULL CHECK (ranking > 0),
PRIMARY KEY (man, woman));
INSERT INTO Husbands VALUES ('Abe', 'Joan', 1);
INSERT INTO Husbands VALUES ('Abe', 'Kathy', 2);
INSERT INTO Husbands VALUES ('Abe', 'Lynn', 3);
INSERT INTO Husbands VALUES ('Abe', 'Molly', 4);
INSERT INTO Husbands VALUES ('Bob', 'Joan', 3);
INSERT INTO Husbands VALUES ('Bob', 'Kathy', 4);
INSERT INTO Husbands VALUES ('Bob', 'Lynn', 2);
INSERT INTO Husbands VALUES ('Bob', 'Molly', 1);
INSERT INTO Husbands VALUES ('Chuck', 'Joan', 3);
INSERT INTO Husbands VALUES ('Chuck', 'Kathy', 4);
INSERT INTO Husbands VALUES ('Chuck', 'Lynn', 2);
INSERT INTO Husbands VALUES ('Chuck', 'Molly', 1);
INSERT INTO Husbands VALUES ('Dave', 'Joan', 2);
INSERT INTO Husbands VALUES ('Dave', 'Kathy', 1);
INSERT INTO Husbands VALUES ('Dave', 'Lynn', 3);
INSERT INTO Husbands VALUES ('Dave', 'Molly', 4);
CREATE TABLE Wives
(woman CHAR(5) NOT NULL,
man CHAR(5) NOT NULL,
ranking INTEGER NOT NULL CHECK (ranking > 0),
PRIMARY KEY (man, woman));
INSERT INTO Wives VALUES ('Joan', 'Abe', 1);
INSERT INTO Wives VALUES ('Joan', 'Bob', 3);
INSERT INTO Wives VALUES ('Joan', 'Chuck', 2);
INSERT INTO Wives VALUES ('Joan', 'Dave', 4);
INSERT INTO Wives VALUES ('Kathy', 'Abe', 4);
INSERT INTO Wives VALUES ('Kathy', 'Bob', 2);
INSERT INTO Wives VALUES ('Kathy', 'Chuck', 3);
INSERT INTO Wives VALUES ('Kathy', 'Dave', 1);
INSERT INTO Wives VALUES ('Lynn', 'Abe', 1);
INSERT INTO Wives VALUES ('Lynn', 'Bob', 3);
INSERT INTO Wives VALUES ('Lynn', 'Chuck', 4);
INSERT INTO Wives VALUES ('Lynn', 'Dave', 2);
INSERT INTO Wives VALUES ('Molly', 'Abe', 3);
INSERT INTO Wives VALUES ('Molly', 'Bob', 4);
INSERT INTO Wives VALUES ('Molly', 'Chuck', 1);
INSERT INTO Wives VALUES ('Molly', 'Dave', 2);
The pairing of:
('Abe', 'Lynn')
('Bob', 'Joan')
('Chuck', 'Molly')
('Dave', 'Kathy')
does not work. There is a "blocking pair" in ('Abe', 'Joan'). Abe is Joan's first choice and he is her first choice, as shown by the rows:
Wives ('Joan', 'Abe', 1)
Husbands ('Abe', 'Joan', 1)
but they are matched to others. A simple swap will give us a stable situation:
('Abe', 'Joan')
('Bob', 'Lynn')
('Chuck', 'Molly')
('Dave', 'Kathy')
If you use a backtracking algorithm, you do not have to generate all possible marriage sets. Once you found a blocking pair, you would never have to create it again. This is considerably faster than the n! rows that SQL must generate and filter. The only advantage with SQL -- and it is weak -- is that the algorithms for this problem will usually stop at the first success. They do not generate the full solution set, as SQL does.
If you are interested in this problem, you can get an example of a backtracking algorithm from Niklaus Wirth's Algorithms + Data Structures = Programs (Prentice-Hall, 1976) to compare to the SQL.
The American Mathematical Society just published Stable Marriage and Its Relation to Other Combinatorial Problems by Donald Knuth (translated from the original French by Martin Goldstein) as volume 10 of the series entitled CRM Proceedings and Lecture Notes (American Mathematical Society, 1996).
This booklet, which reproduces seven expository lectures given by the author in November 1975, is a gentle introduction to the analysis of algorithms using the beautiful theory of stable marriages as a vehicle to explain the basic paradigms of that subject.
This answer is from Richard Romley. Simply explained, it generates all possible marriages and filters out the failures. But there are some neat little optimizing tricks in the code. If you were paying attention in my May 1998 column, I gave you a neat way to generate permutations using an auxiliary table:
DROP TABLE Wife_perms;
CREATE TABLE Wife_perms
(wife CHAR(5) NOT NULL PRIMARY KEY,
tally INTEGER NOT NULL);
INSERT INTO Wife_perms VALUES ('Joan', 1);
INSERT INTO Wife_perms VALUES ('Kathy', 2);
INSERT INTO Wife_perms VALUES ('Lynn', 4);
INSERT INTO Wife_perms VALUES ('Molly', 8);
The query for finding stable marriages is:
SELECT W1.wife AS abe_wife, W2.wife AS bob_wife,
W3.wife AS check_wife, W4.wife AS dave_wife
FROM Wife_perms AS W1, Wife_perms AS W2,
Wife_perms AS W3, Wife_perms AS W4
WHERE (W1.tally + W2.tally + W3.tally + W4.tally) = 15
AND NOT EXISTS
(SELECT *>
FROM Husbands AS H1, Husbands AS H2,
Wives AS W1, Wives AS W2
WHERE H1.man = H2.man
AND H1.ranking > H2.ranking
AND (H1.man || H1.woman) IN
('Abe' || W1.wife, 'Bob' || W2.wife,
'Chuck' || W3.wife, 'Dave' || W4.wife)
AND H2.woman = W1.woman
AND W1.woman = W2.woman
AND W1.ranking > W2.ranking
AND (W1.man || W1.woman) IN
('Abe' || W1.wife, 'Bob' || W2.wife,
'Chuck' || W3.wife, 'Dave' || W4.wife));
The first predicate generates all the permutations of wives in the husband columns and the exists() checks for "blocking pairs" in the row. This query will take some time to run, especially on a small machine, and it will break down when you have a value of n too large for the permutation trick.
The other optimization trick is the list of concatenated strings to see that the blocking pairs is in the row that was just constructed. A shorter version of this trick is replacing:
SELECT * FROM Foo as F1, Bar as B1 WHERE F1.city = B1.city AND F1.state = B1.state;
with:
SELECT * FROM Foo as F1, Bar as B1 WHERE F1.city || F1.state = B1.city || B1.state;
to speed up a query.
There were only 4! (24) possible marriage collections, so this ran pretty fast, even on a small machine. Now extend the problem to a set of couples where n = 8; you now have 8! (40,320) possible marriage collections. And only a small number of the rows will be in the final answer set.
Your puzzle for this month is to do a better job than this. I have put the code for creating the tables for n = 8 and an answer on the DBMS Web site so you can check your solution.
DROP TABLE Husbands;
CREATE TABLE Husbands
(man CHAR(2) NOT NULL,
woman CHAR(2) NOT NULL,
ranking INTEGER NOT NULL CHECK (ranking > 0), PRIMARY KEY (man, woman));
DROP TABLE Wives;
CREATE TABLE Wives
(woman CHAR(2) NOT NULL,
man CHAR(2) NOT NULL,
ranking INTEGER NOT NULL CHECK (ranking > 0), PRIMARY KEY (woman, man));
INSERT INTO Husbands VALUES ('h1', 'w1', 5);
INSERT INTO Husbands VALUES ('h1', 'w2', 2);
INSERT INTO Husbands VALUES ('h1', 'w3', 6);
INSERT INTO Husbands VALUES ('h1', 'w4', 8);
INSERT INTO Husbands VALUES ('h1', 'w5', 4);
INSERT INTO Husbands VALUES ('h1', 'w6', 3);
INSERT INTO Husbands VALUES ('h1', 'w7', 1);
INSERT INTO Husbands VALUES ('h1', 'w8', 7);
INSERT INTO Husbands VALUES ('h2', 'w1', 6);
INSERT INTO Husbands VALUES ('h2', 'w2', 3);
INSERT INTO Husbands VALUES ('h2', 'w3', 2);
INSERT INTO Husbands VALUES ('h2', 'w4', 1);
INSERT INTO Husbands VALUES ('h2', 'w5', 8);
INSERT INTO Husbands VALUES ('h2', 'w6', 4);
INSERT INTO Husbands VALUES ('h2', 'w7', 7);
INSERT INTO Husbands VALUES ('h2', 'w8', 5);
INSERT INTO Husbands VALUES ('h3', 'w1', 4);
INSERT INTO Husbands VALUES ('h3', 'w2', 2);
INSERT INTO Husbands VALUES ('h3', 'w3', 1);
INSERT INTO Husbands VALUES ('h3', 'w4', 3);
INSERT INTO Husbands VALUES ('h3', 'w5', 6);
INSERT INTO Husbands VALUES ('h3', 'w6', 8);
INSERT INTO Husbands VALUES ('h3', 'w7', 7);
INSERT INTO Husbands VALUES ('h3', 'w8', 5);
INSERT INTO Husbands VALUES ('h4', 'w1', 8);
INSERT INTO Husbands VALUES ('h4', 'w2', 4);
INSERT INTO Husbands VALUES ('h4', 'w3', 1);
INSERT INTO Husbands VALUES ('h4', 'w4', 3);
INSERT INTO Husbands VALUES ('h4', 'w5', 5);
INSERT INTO Husbands VALUES ('h4', 'w6', 6);
INSERT INTO Husbands VALUES ('h4', 'w7', 7);
INSERT INTO Husbands VALUES ('h4', 'w8', 2);
INSERT INTO Husbands VALUES ('h5', 'w1', 6);
INSERT INTO Husbands VALUES ('h5', 'w2', 8);
INSERT INTO Husbands VALUES ('h5', 'w3', 2);
INSERT INTO Husbands VALUES ('h5', 'w4', 3);
INSERT INTO Husbands VALUES ('h5', 'w5', 4);
INSERT INTO Husbands VALUES ('h5', 'w6', 5);
INSERT INTO Husbands VALUES ('h5', 'w7', 7);
INSERT INTO Husbands VALUES ('h5', 'w8', 1);
INSERT INTO Husbands VALUES ('h6', 'w1', 7);
INSERT INTO Husbands VALUES ('h6', 'w2', 4);
INSERT INTO Husbands VALUES ('h6', 'w3', 6);
INSERT INTO Husbands VALUES ('h6', 'w4', 5);
INSERT INTO Husbands VALUES ('h6', 'w5', 3);
INSERT INTO Husbands VALUES ('h6', 'w6', 8);
INSERT INTO Husbands VALUES ('h6', 'w7', 2);
INSERT INTO Husbands VALUES ('h6', 'w8', 1);
INSERT INTO Husbands VALUES ('h7', 'w1', 5);
INSERT INTO Husbands VALUES ('h7', 'w2', 1);
INSERT INTO Husbands VALUES ('h7', 'w3', 4);
INSERT INTO Husbands VALUES ('h7', 'w4', 2);
INSERT INTO Husbands VALUES ('h7', 'w5', 7);
INSERT INTO Husbands VALUES ('h7', 'w6', 3);
INSERT INTO Husbands VALUES ('h7', 'w7', 6);
INSERT INTO Husbands VALUES ('h7', 'w8', 8);
INSERT INTO Husbands VALUES ('h8', 'w1', 2);
INSERT INTO Husbands VALUES ('h8', 'w2', 4);
INSERT INTO Husbands VALUES ('h8', 'w3', 7);
INSERT INTO Husbands VALUES ('h8', 'w4', 3);
INSERT INTO Husbands VALUES ('h8', 'w5', 6);
INSERT INTO Husbands VALUES ('h8', 'w6', 1);
INSERT INTO Husbands VALUES ('h8', 'w7', 5);
INSERT INTO Husbands VALUES ('h8', 'w8', 8);
INSERT INTO Wives VALUES ('w1', 'h1', 6);
INSERT INTO Wives VALUES ('w1', 'h2', 3);
INSERT INTO Wives VALUES ('w1', 'h3', 7);
INSERT INTO Wives VALUES ('w1', 'h4', 1);
INSERT INTO Wives VALUES ('w1', 'h5', 4);
INSERT INTO Wives VALUES ('w1', 'h6', 2);
INSERT INTO Wives VALUES ('w1', 'h7', 8);
INSERT INTO Wives VALUES ('w1', 'h8', 5);
INSERT INTO Wives VALUES ('w2', 'h1', 4);
INSERT INTO Wives VALUES ('w2', 'h2', 8);
INSERT INTO Wives VALUES ('w2', 'h3', 3);
INSERT INTO Wives VALUES ('w2', 'h4', 7);
INSERT INTO Wives VALUES ('w2', 'h5', 2);
INSERT INTO Wives VALUES ('w2', 'h6', 5);
INSERT INTO Wives VALUES ('w2', 'h7', 6);
INSERT INTO Wives VALUES ('w2', 'h8', 1);
INSERT INTO Wives VALUES ('w3', 'h1', 3);
INSERT INTO Wives VALUES ('w3', 'h2', 4);
INSERT INTO Wives VALUES ('w3', 'h3', 5);
INSERT INTO Wives VALUES ('w3', 'h4', 6);
INSERT INTO Wives VALUES ('w3', 'h5', 8);
INSERT INTO Wives VALUES ('w3', 'h6', 1);
INSERT INTO Wives VALUES ('w3', 'h7', 7);
INSERT INTO Wives VALUES ('w3', 'h8', 2);
INSERT INTO Wives VALUES ('w4', 'h1', 8);
INSERT INTO Wives VALUES ('w4', 'h2', 2);
INSERT INTO Wives VALUES ('w4', 'h3', 1);
INSERT INTO Wives VALUES ('w4', 'h4', 3);
INSERT INTO Wives VALUES ('w4', 'h5', 7);
INSERT INTO Wives VALUES ('w4', 'h6', 5);
INSERT INTO Wives VALUES ('w4', 'h7', 4);
INSERT INTO Wives VALUES ('w4', 'h8', 6);
INSERT INTO Wives VALUES ('w5', 'h1', 3);
INSERT INTO Wives VALUES ('w5', 'h2', 7);
INSERT INTO Wives VALUES ('w5', 'h3', 2);
INSERT INTO Wives VALUES ('w5', 'h4', 4);
INSERT INTO Wives VALUES ('w5', 'h5', 5);
INSERT INTO Wives VALUES ('w5', 'h6', 1);
INSERT INTO Wives VALUES ('w5', 'h7', 6);
INSERT INTO Wives VALUES ('w5', 'h8', 8);
INSERT INTO Wives VALUES ('w6', 'h1', 2);
INSERT INTO Wives VALUES ('w6', 'h2', 1);
INSERT INTO Wives VALUES ('w6', 'h3', 3);
INSERT INTO Wives VALUES ('w6', 'h4', 6);
INSERT INTO Wives VALUES ('w6', 'h5', 8);
INSERT INTO Wives VALUES ('w6', 'h6', 7);
INSERT INTO Wives VALUES ('w6', 'h7', 5);
INSERT INTO Wives VALUES ('w6', 'h8', 4);
INSERT INTO Wives VALUES ('w7', 'h1', 6);
INSERT INTO Wives VALUES ('w7', 'h2', 4);
INSERT INTO Wives VALUES ('w7', 'h3', 1);
INSERT INTO Wives VALUES ('w7', 'h4', 5);
INSERT INTO Wives VALUES ('w7', 'h5', 2);
INSERT INTO Wives VALUES ('w7', 'h6', 8);
INSERT INTO Wives VALUES ('w7', 'h7', 3);
INSERT INTO Wives VALUES ('w7', 'h8', 7);
INSERT INTO Wives VALUES ('w8', 'h1', 8);
INSERT INTO Wives VALUES ('w8', 'h2', 2);
INSERT INTO Wives VALUES ('w8', 'h3', 7);
INSERT INTO Wives VALUES ('w8', 'h4', 4);
INSERT INTO Wives VALUES ('w8', 'h5', 5);
INSERT INTO Wives VALUES ('w8', 'h6', 6);
INSERT INTO Wives VALUES ('w8', 'h7', 1);
INSERT INTO Wives VALUES ('w8', 'h8', 3);
DROP TABLE Wife_perms;
CREATE TABLE Wife_perms
(wife CHAR(2) NOT NULL PRIMARY KEY,
tally INTEGER NOT NULL);
INSERT INTO Wife_perms VALUES ('w1', 1); INSERT INTO Wife_perms VALUES ('w2', 2); INSERT INTO Wife_perms VALUES ('w3', 3); INSERT INTO Wife_perms VALUES ('w4', 4); INSERT INTO Wife_perms VALUES ('w5', 5); INSERT INTO Wife_perms VALUES ('w6', 6); INSERT INTO Wife_perms VALUES ('w7', 7); INSERT INTO Wife_perms VALUES ('w8', 8);
This query produces the correct results:
SELECT W1.wife AS h1, W2.wife AS h2, W3.wife AS h3,
W4.wife AS h4, W5.wife AS h5, W6.wife AS h6, W7.wife AS h7, W8.wife AS h8
FROM Wife_perms AS W1, Wife_perms AS W2, Wife_perms AS W3,
Wife_perms AS W4, Wife_perms AS W5, Wife_perms AS W6, Wife_perms AS W7 , Wife_perms AS W8
WHERE (W1.tally + W2.tally + W3.tally + W4.tally +
W5.tally + W6.tally + W7.tally + W8.tally) = 255
AND NOT EXISTS
(SELECT *
FROM Husbands AS H1, Husbands AS H2,
Wives AS W1, Wives AS W2
WHERE H1.man = H2.man
AND H1.ranking > H2.ranking
AND (H1.man || H1.woman) IN
('h1' || H1.wife, 'h2' || H2.wife,
'h3' || H3.wife, 'h4' || H4.wife,
'h5' || H5.wife, 'h6' || H6.wife,
'h7' || H7.wife, 'h8' || H8.wife)
AND H2.woman = W1.woman
AND W1.woman = W2.woman
AND W1.ranking > W2.ranking
AND (W1.man || W1.woman) IN
('h1' || H1.wife, 'h2' || H2.wife,
'h3' || H3.wife, 'h4' || H4.wife,
'h5' || H5.wife, 'h6' || H6.wife,
'h7' || H7.wife, 'h8' || H8.wife));
The final results are:
h1 h2 h3 h4 h5 h6 h7 h8 ================================ w2 w4 w3 w1 w7 w5 w8 w6 w2 w4 w3 w8 w1 w5 w7 w6 w3 w6 w4 w1 w7 w5 w8 w2 w3 w6 w4 w8 w1 w5 w7 w2 w6 w3 w4 w1 w7 w5 w8 w2 w6 w3 w4 w8 w1 w5 w7 w2 w6 w4 w3 w1 w7 w5 w8 w2 w6 w4 w3 w8 w1 w5 w7 w2 w7 w4 w3 w8 w1 w5 w2 w6This example was taken from Niklaus Wirth's Algorithms + Data Structures = Programs (Prentice-Hall; 1976), and you can also get a small booklet by Donald Knuth on the problem.