
|
Just a short time ago, the hottest topic in the information systems world was enterprise data warehousing. When a few data warehouse success stories began to trickle out, it seemed almost every company began purchasing monolithic hardware platforms that were destined to contain gigantic databases that held every scrap of relevant information. Once constructed, data would be mined out of the warehouse by a companyęs various business units, and information would be unlocked and used like never before. Well, the honeymoon is over with respect to implementing enterprise data warehouses, and reality has set in. While warehouse success stories are still readily available, companies have come to realize that the process of analyzing, collecting, cleansing, and making data available to decision makers isnęt a cakewalk. And though the vendors who sell data warehouse hardware and software continue to plow full-steam ahead, a number of companies are still waiting patiently for their data warehouses to pay off.
Enter the data mart. Usually smaller and more manageable than warehouses, data marts make sense for a lot of companies where the amount of data analysis work is confined to specialized departments such as marketing or finance. In many cases data marts can also be constructed and put in place faster than enterprisewide warehouses and offer significant cost savings, both in terms of the hardware and software required and in the time needed for implementation. Because of these benefits, a number of vendors are offering complete data mart solutions; the craze has even caught the eye of some of the major RDBMS players, who have since decided to enter the fray. Broadbase Information Systems Inc. is a startup company that is square in the middle of the data mart arena. It has put together an impressive package to help administrators and developers alike build marts that work.
The most current version of Broadbase that I was able to review was 1.1. I tested the data mart suite on a Compaq Windows NT 4.0 server that was equipped with a single Pentium Pro 200MHz processor and 64MB of RAM. The installation process consisted of several parts, the first being the creation of the Broadbase server. The setup requires that a Windows NT administrative ID perform the install due to Broadbaseęs integration with the NT operating system and the creation of NT services. After the Broadbase server installation was complete, I installed all the ODBC drivers (supplied by Intersolv Inc.) that Broadbase uses for database connectivity. Finally, I installed the BrioQuery for Broadbase product, which is an extremely robust OLAP query toolset marketed by Brio Technology Inc. (www.brio.com) and comes bundled with Broadbase. When all was said and done, the installation consumed a little more than 50MB on the server.
Once Broadbase is installed, an initialization routine must be invoked to load Java classes to the server and create the data dictionary objects for the database. Java is used throughout Broadbase as both a development language and the foundation of many of the productęs components, including the GUI utilities that are used to interact with the database. And speaking of databases, thatęs exactly what the core Broadbase offering is ę a database engine. Because I had no exposure to Broadbase prior to version 1.1, I had thought the product was a collection of data movement, scrubbing, and development tools like many other vendors have and nothing more, but I was wrong. Although Broadbase includes all these utilities, they revolve around the Broadbase server, which is a relational/OLAP database server that is used as the actual data mart. Upon discovering this, I thought that the Broadbase folk had done some clever marketing. By labeling their product as a data mart solution instead of just another database, they can compete much better against Goliaths like Oracle. Once I dug deeper into the tool, though, I discovered that Broadbase was actually designed from the ground up as a data mart package, and the fact that the server engine competes quite nicely against all of its analytical database rivals is an additional bonus.
Users of DBMSs such as Microsoftęs SQL Server will feel right at home with Broadbase. Although I wondered if administrators and developers would want to learn the ins and outs of another database in addition to their existing enterprise engines, I found that Broadbase functions very much like the SQL Servers and Oracles of the world, which makes the learning curve more or less inconsequential. Starting the server requires nothing more than the right mouse click in the Broadbase administrator followed by entering an administrator ID and password (just like the SA ID in SQL Server). Once started, Broadbase is ready to build a data mart.
Like many other databases, Broadbase partitions objects within the server through the use of schemas, which act as containers for tables, and other database related objects. For example, you may have one schema called "Marketing" and another called "Finance," each of which would naturally contain data specific to those business units. Data is stored within Broadbase tablespaces, which are exactly like Oracle tablespaces ę logical named devices that use physical storage from the server. Defining tablespaces is easy through the administration tool, although the GUI was the slowest in terms of operation of any in the toolset.
Because nearly all data marts are constructed from existing operational or legacy data, Broadbase contains specific components that simplify the connection and capture of remote data. An administrator first creates what the product terms Link objects to remote databases or files (Broadbase can work with text files as well as typical RDBMS data, which is useful for those working with mainframe flat files or text unloads from other databases). I was able to create ODBC links to my Oracle 7.3 databases quickly on Windows NT and Unix as well as several Microsoft SQL Server 6.5 databases. After defining links, the next step in the construction of a data mart is the creation of sources, which are metadata views of remote data objects (tables or flat files). Sources are easily defined by connecting to a remote RDBMS via one of the links and browsing through the database in an explorer-style interface. The desired data objects can then be dragged into Broadbase and entered as metadata sources. Sources can then be used as templates for the creation of Broadbase tables. In keeping with Broadbaseęs drag-and-drop interface, sources may be dragged into the table area of the administration tool, which then invokes the table editor. One thing to keep an eye on when building tables is that Broadbaseęs own datatype limitations will be enforced. For example, an Oracle field defined as a number(28) had to be truncated to 19 digits in Broadbase.
Once the Broadbase data links, tablespaces, and object structures have been created, an administrator is ready to load and massage data from the defined sources. Loading data can be performed interactively or, for large intensive loads, in batch with the Express Loader utility. The loader can be invoked graphically by dragging one of the previously defined source views over a Broadbase table. Express Loader contains many functions and routines that assist in the selection and transformation of data into the Broadbase server; for example, a function exists to convert a columnęs values to uppercase. If DBAs donęt find a function to fit their specific situation, they may build a custom solution using Java. Even when performed interactively, the loader utility moves data along at a brisk pace. I was able to load a 21,000-row Oracle table into Broadbase in only 15 seconds.
The data is now ready to be queried and analyzed. I was impressed with the speed at which the Broadbase database (whose architecture I donęt have space to discuss) operates. Several test queries I tossed at the database through Broadbaseęs GUI SQL tool produced response times that were every bit as fast as an Oracle 7.3 database that contained the identical data and sits on the same server.
After information has been successfully transferred to Broadbase, database designers and developers can begin to use the strong set of utilities and functions contained within the product in order to produce the required data mart results. One of the key components within the toolset to assist with this is the SQL Workbench (see Figure 1), which is a drag-and-drop graphical interface that can be used to perform many Broadbase tasks, including the creation of any database object.
Data mart developers can use the workbench to perform complex analysis and data transformations by invoking any of the built-in, Java-based data routines. For example, I was easily able to execute historical trend analysis on my data by selecting the time dimension function. By dragging and dropping the data columns that I wanted to see and specifying date and time parameters, I quickly built query results that could then be stored as view objects within the database. Users wishing to see similar data could simply select the view columns without ever knowing about the Java routines that created the result. Again, if developers donęt find functions that meet their requirements, they can build custom routines using Java and stored within Broadbase for future use.
One thing I learned quickly about development in Broadbase: The product is quite an advanced toolset, and although a lot can be done from a visual standpoint by dragging and dropping objects, a seasoned development staff will be needed to construct the routines and Broadbase objects to meet the demands of data mart users. I got the feeling, though, that the learning curve of the tool would not be that bad once the DBA had experimented with a small project or two.
The good thing about Broadbase is, regardless of its OLAP abilities, it is still a standard RDBMS adhering to SQL 92 standards. The multidimensional features are performed through preaggregations of data (cubes) and are managed through Broadbaseęs cost-based optimizer. This being the case, most database developers should get the hang of the data mart toolset quickly.
As far as end-user access goes, the robust BrioQuery analysis tool that is bundled with Broadbase provides an excellent ad hoc query interface that can be used to slice and dice data. BrioQuery is packaged with Broadbase to expedite end-user activity; however, the database is accessible with any SQL/ODBC-enabled query product. Using the Broadbase ODBC driver, users may rely on any of their favorite, tried-and-true query tools to gain access to information contained within Broadbase
Even though I was using version 1.1, the Broadbase product suite was quite solid ę unusual for a first offering and especially one that handles the complex operations that are required for data marts. I did not experience any protection faults or any other abnormal terminations when performing data movements, conversions, or other database tasks.
The only flaws I found in my tour through Broadbase were in the areas of documentation and error handling. When using the manuals that accompanied the toolset, the index referred me many times to pages that either didnęt exist or were incorrect for the topics I was trying to look up. In terms of error handling, I encountered several SQL errors from within the interactive query tool, which I could not locate in any documentation. Broadbase technical support confirmed that they currently do not have an error message manual or online help for errors encountered inside the tool. This surprised me because new users of such a sophisticated tool will likely be left stranded when they hit roadblocks like I did. A product of this nature should offer much more in the way of online help. Sources at the company told me that error documentation is in the works.
Currently supporting only Windows NT at the server level, Broadbase has a couple of different methods for pricing its data mart suite. The Broadbase Workgroup Server starts at $49,950 and includes 32 named and five concurrent users. For larger installations, the Broadbase Department Server can be purchased at a starting price of $99,950, which includes 65 named and 20 concurrent users.
Although I initially wondered whether data mart builders would want to gamble on a database that wasnęt offered by one of the big RDBMS heavyweights, I have to say that I am impressed with how well Broadbase is designed and how solid it is, especially for a first version. If youęre casting around for a data mart solution, Ięd recommend that you get an evaluation copy of Broadbase and put it through its paces to see if it can handle your data mart requirements.
