As we're all aware, no single technology is a panacea for all of our information technology problems and issues. We've been through the relational revolution, the client/server coup, and the object-oriented offensive, and now we see the same thing happening with the Internet insurgency. However, each of these technology waves introduces some very useful concepts and techniques that can often be engineered back into the other technologies. We can see this phenomenon in action in the way the object-oriented approaches are influencing the more traditional relational systems.
All of the leading relational DBMS vendors are extending their products with some object-oriented capabilities. The result is the uprising of the so-called "Universal Server." One of the features of this new type of DBMS server is its extensibility. You can extend the functionality of the DBMS server by adding your own user-defined datatypes (UDTs) and your own user-defined functions (UDFs).
But why would you want to do this? The answer is quite simple (and quite object-oriented): for reuse and sharing. You want to reuse the functionality you have developed over and over again, and you don't want to recode it in every application. You want to code it once in the DBMS server and share it among all of the applications accessing the database. The spatial datatypes offered by each DBMS vendor as practically a give-away UDT are an excellent example. The distance between two points (in the 2D or 3D plane) is a single-dimension numeric figure. When you work with spatial datatypes, you want to apply the distance function to two coordinate points and obtain a distance as a result, and you don't want to code Phythagoras's theorems in every application. You simply want to apply the function. However, this implies that someone had to implement 2D and 3D spatial datatypes, as well as the distance function, in your DBMS server. The problem is that implementing a new datatype in an extendible database server is not trivial.
The facilities provided by the major DBMSs to implement your own datatypes differ quite substantially. They range from letting you refine the definitions of the existing standard datatypes (almost like defining a domain) to the facilities to letting you "invent" and code your own datatypes and link them into the DBMS server.
In the Informix Universal Server (release 9.00 currently in beta testing) you can use so-called DataBlades to extend the capabilities of the database server. A DataBlade (a software module that you plug into the database) lets you do anything you care to code with any data. Informix provides a DataBlade Developers Kit (DBDK), a GUI tool for automating the creation of DataBlades that provides useful utilities for maintaining any existing DataBlades as well. The DBDK generates most of the SQL and C code necessary to create a DataBlade.
In IBM's DB2 (release 2.1.1) you can define UDFs to extend the set of SQL built-in functions. You can use these UDFs wherever a built-in SQL function can be used. You can also define UDTs that are derived from the existing datatypes. The UDTs can therefore refine the existing datatypes, but they cannot be used to extend or combine the standard datatypes. The UDTs can only be used in the system-provided or built-in functions if you have created UDFs for the UDT.
You can define your own UDFs in Oracle7 release 7.3 using Oracle's PL/SQL language. You can use the UDFs wherever you can use a standard SQL function. The UDFs must be created as top-level PL/SQL functions, or they must be declared with a package specification before being used in a SQL statement. The only correspondence to a UDT in Oracle is a user-defined type equivalence used in embedded SQL programs with the Oracle7 precompilers.
Sybase SQL Server 11 is rather lean on UDTs and UDFs. Although it allows user-defined datatypes, they must always be a subset of one of the standard datatypes. It is not possible to create complex datatypes as a combination of other datatypes. Sybase's extensive and extendable functional processing is based solely on its very powerful stored procedures. However, the stored procedures cannot be called like standard SQL functions can -- you can only activate them through explicit procedure calls.
Microsoft SQL Server 6.5 lets you define stored procedures as dynamic links to C library routines. This means that you can develop an API as a set of C functions that can perform complex operations on the standard datatypes. Although this feature does let you extend the server's functionality, it does not cover the definition of new UDFs, because you cannot call these functions from SQL statements. You can define new UDTs using Transact-SQL, but these are only subsets of the existing scalar and character datatypes. You cannot, however, define any functions or operators on the new datatype that can be used in SQL SELECT or WHERE clauses. This functionality is therefore more suited for defining domains on existing base datatypes than for defining new complex UDTs.
To implement a UDT in CA-OpenIngres, you define its internal representation, code all of the required functions, operators, and coercion routines, and then link them with the DBMS server. For a typical UDT, you must implement 16 functions. These routines are, for example, to compare two data items, to convert a data item to a representable datatype, to get an empty value, to calculate a hash key, to check that the data item satisfies its length specification, and many more -- that is, in addition to any special UDFs that you want to implement for your new datatype. Once you have coded the methods for storing and manipulating your new datatypes and have added them to your server, all of the end users and applications have SQL-based access to it. You can adapt the standard operators to have a special meaning in the context of your own datatypes. For example, where the "1" operator means concatenation for the string datatypes and addition for the numeric datatypes, you can implement it to mean vector addition for your spatial datatypes or component-by-component addition for complex datatypes such as complex numbers or matrices. In this way, it takes care of full operator overloading, dependent on the context of the datatype.
When you create a DataBlade in Informix, you must implement its UDTs, its user-defined routines (UDRs), and its user-defined access methods (UDAMs). There are three varieties of UDTs in Informix, namely:
CREATE DISTINCT TYPE yen AS money;
CREATE ROW TYPE myCustomer ( cust_num INTEGER, company CHAR(40), phone CHAR(12));
CREATE OPAQUE TYPE reallyLongInteger(internalLength = 8)
To create a UDF in CA-OpenIngres, you must specify its name, type, and internal ID in a predefined storage structure, implement the function's code, and link the function's definition to its implementation in another predefined storage structure. Finally, you must link the defined structures and the function's code with the DBMS server. After this, the function is automatically invoked at run time when the function call is encountered in a SQL statement.
An Informix UDR is a function written in SQL, Stored Procedure Language, C, or C++. A UDR is called from any SQL statement, just like a standard SQL function--for example:
SELECT myCfunction(18900, 12) FROM currencyTable.
You use Informix UDAMs to create new types of indexes and other ways to access your data. These are coded in C or C++ and are added to the Informix server. Once the UDAM has been added to the server, you can create an index using the defined access method--for example:
CREATE INDEX myIndex ON customers(cust_info.company) USING myBtree;
In IBM's DB2 (release 2.1.1) you can implement two types of UDFs: external functions written in a standard programming language, and sourced functions whose implementations are inherited from existing functions. DB2 provides a rich library of UDFs, including mathematical and scientific functions.
A DB2 UDF can be built using C++ commands, Visual Age, Java, or other language stored in a .bat or .exe file, but it is the easiest to use C++. The UDFs are compiled like stored procedures, but they do not require a connection to the database or a precompile phase. A UDF cannot contain any embedded SQL statements, but the application using the UDF does. The application may call the UDF just like it calls any standard SQL function. To define a UDF, you need to open a DB2 Command Line Processor (CLP) window and type: UDFname udf. After that, any application can use the function. UDFs can operate over all database types, including LOBs and distinct datatypes. A UDF can operate on data managed within or outside the DB2 database. For example, you can code a UDF to send an electronic message, update a flat file, or perform some other operation.
You create an Oracle UDF, called a user function or a stored function, as a top-level PL/SQL function by using the CREATE FUNCTION statement. You must be running Oracle7 with PL/SQL to be able to create Oracle UDFs. As part of the function's definition, you specify its name, its input and output parameters, its return value, and the body of PL/SQL code that must be executed. Alternatively, you can create an Oracle UDF as part of a package using the CREATE PACKAGE statement. Before you can create any UDFs, the user SYS must run the SQL script DBMSSTDX.SQL. To create a function in your own schema, you must have the CREATE PROCEDURE system privilege. To create a function in another user's schema, you must have the CREATE ANY PROCEDURE system privilege.
It is also interesting to note that CA-OpenIngres (and before that CA-Ingres) used the same mechanisms internally to implement the standard datatypes as are now available to its users to define their own user-defined datatypes in the CA-OpenIngres/Object Management Extension. By using a generic mechanism for all of the datatypes, whether they're standard or user-defined, all datatypes are defined and processed in the same way. This makes it just as easy for the CA-OpenIngres developers to implement new datatypes as for the users. This same philosophy of generic extendible datatypes probably led to the development of the DataBlades that were used in the Illustra database server and are now making their appearance in the Informix Universal Server release 9.00.
Third-party vendors are making use of this extensibility to the database servers to code libraries of datatypes that are applicable to their particular businesses. For example, a partnership between ASK (former owners of Ingres) and ESL (Ingres development partners) in 1992 led to the development of the Spatial Object Libraries currently available with CA-OpenIngres. This library gives you all of the spatial datatypes, such as points, lines, segments, circles, boxes, and polygons, with all of their associated functions, such as inside, intersect, area, length, perimeter, and distance. I foresee that many more datatype libraries, or DataBlades in Informix terminology, will become available in the near future, for a wide variety of business areas. These types of libraries enable developers to implement intelligent datatypes -- call them business objects if you want -- that encapsulate knowledge of the business-specific features and functions required from such a datatype.