DBMS

Sagent Data Mart Solution

By Tom Couchman
DBMS, July 1998
  • Sagent Technology Inc., 750 Menlo Ave., Suite 300, Menlo Park, CA 94025; 650-833-6800; www.sagenttech.com.
  • Pricing: Varies, depending on modules included, from $40,000 to $150,000.
  • System Requirements: Client: Microsoft Windows 95, Windows NT 3.51 or higher, Pentium processor, 16MB of RAM (32MB recommended), 35MB of available disk space, TCP/IP or Named Pipes client software, database middleware client libraries (such as the appropriate ODBC driver for the server database). Server: Microsoft Windows NT 3.51 or higher, Service Pack 4, Pentium processor. 32MB of RAM (128MB recommended), 38MB of available disk space, at least 40MB of disk space allocated for the Repository database (100MB recommended), at least 20MB of disk space allocated for the Repository database log (50MB recommended). Web Link Server: Microsoft Windows NT version 4.0, a Web server that supports 32-bit COM and ASP (Microsoft IIS version 3.0 or later, Netscape FastTrack Server, or Netscape Enterprise Server recommended).
Sagent Data Mart Solution is a comprehensive decision-support environment designed to allow rapid construction, population, and maintenance of data marts as well as basic reporting from those data marts. Sagent WebLink facilitates use of a Web browser as a front end to the Sagent reporting tools.

Two trends currently in vogue in the information systems business explain the direction many decision-support vendors are taking their products. First, there is abundant evidence that top-down, enterprisewide, swallow-the-elephant-in-one-gulp data warehousing projects almost never work. Data marts, tightly focused on a single subject area, are by far the most likely to make customers and accountants happy. Second, most companies have found that they either do not have the architecture and integration expertise required to blend off-the-shelf niche tools into a coherent and workable solution to a business-IS problem, or they do not want to expend the effort. Consequently, purchasing integrated suites of tools is now in favor.

Sagent Data Mart Solution, Sagent Technologyıs answer to the product demands generated by these two trends, includes the following components:

A companion product, Sagent Web Link, is a Web browser front end to Sagent Information Studio that lets Web clients generate and access reports from the Sagent tools.

The Sagent tool suite ı with the exception of Sagent Analysis, which is new in release 3.0 ı and Sagent Web Link were reviewed in the February 1997 DBMS. For the sake of completeness I will describe the entire toolset, trying to avoid mere repetition of the contents of that previous article. I will provide more detail on Information Studio and Sagent Analysis, which was not available last year.

Planning and Installation

Architecturally, the Sagent product is very flexible. The three principal server components are the repository; the data mart database, which is built in one of the supported commercial DBMSs; and the DC agent. There can be multiple instances of each of these components installed on a single server. Based on the number of instances of each component, anticipated database size, network considerations, number of initial users, OLTP data sources, reporting requirements, and anticipated rate of growth, an administrator must decide whether to put all three components on the same box or how to distribute them to multiple servers. Only the DC agent must reside on the NT server; OLTP sources, data marts, and the repository can sit on any platform accessible via TCP/IP or named pipes.

The repository contains metadata describing data sources and sinks, data transformations, and the identity and rights of users and user groups. Repository data can be stored in Microsoft SQL Server, Oracle, Sybase, or Informix databases.

Databases hold the results of data extraction from OLTP sources. In addition to the four DBMSs listed in the preceding paragraph, data marts may be stored in DB2 or Red Brick. For either a repository or a data mart, the appropriate client middleware libraries must, of course, be present.

DC agents are dual-function demons. They move data around the data mart environment, and they manage client services, such as connections and report viewing. Because Sagentıs DC agents are designed to take advantage of the multitasking and multithreading capabilities of Windows NT, multiple agents can run in parallel.

I installed all three components of the server software ı repository, data mart, and DC agent ı on a single Windows NT 4.0 server with Service Pack 3 installed running Microsoft SQL Server 6.5. The machine was a 166MHz Pentium with 128MB of memory. To eliminate potential network problems, I installed the client software on the same machine. The installation went quite smoothly with one exception: Some of the string declarations in the vendor-provided DDL for tutorial databases were inconsistent with the data actually provided, so one large set of databases did not load. This failure produced a minor inconvenience when I was trying to follow the tutorials.

Installing Web Link was not so easy, mainly because the installer needs to know how the Web server is configured to make the installation successful. Getting Web Link to work is not a job for a novice! The task was complicated by the fact that the configuration instructions referenced an outdated version of the Internet server. It took some expert guessing to set the virtual directory for Active Server Pages correctly.

Kudos to the Tech Writers

In evaluating the components, I took my usual two-pronged approach. I navigated, with a fair amount of success, most of the functions and tool windows. If I got a bit lost, I went back to the manuals and followed them closely for a while before venturing on my own again. I found the documentation first rate. The manuals combine technical how-to, data warehouse advice, and tutorials into a single stream that flows remarkably well, given the scope of the toolset and the effort involved in juggling the three objectives.

Scores for Unsung Heroes

Like the offensive line on a football team, IS tool administrators only get noticed when something goes wrong. Sagent Admin provides them with the tools to remain blessedly anonymous. An administrator can establish, tune, and configure environment services (for example, broadcasting, multiple licensing, DC agents, cache, and swap files), manage the elements of a repository (more on this in next paragraph), and provide the appropriate level of access to information users.

There are five types of elements in the repository:

I found Sagent Admin to be relatively powerful and easy to use. My only complaint was that the Scheduler does not actually let the administrator do any scheduling or rescheduling, but only cancel execution of a scheduled Plan. Scheduling to take into account changes in workloads at various times of the day and week is a role for which I think the administrator ought to have responsibility.

A Portrait of the Artist in the Design Studio

Most creative work within the Sagent environment is done by technicians or savvy end users in Design Studio, which provides the tools for implementing Views and Plans.

To extract data from a source, you create a BaseView for a source database or fit a Transform for the file type to the file layout for a simple flat file. You can also create a BaseView of the target data mart then use the Data Flow Editor to formulate a Plan consisting of as many Transforms as necessary to prepare the data for insertion into the target data mart. The rich library of Transforms provided with the tool includes such functions as join, key generation, key lookup, sorts, and database loaders. Step by step, you adapt one after another of these Transforms to the task at hand by entering the required parameters, then link them into a Plan with the Data Flow Editor. I found this graphically oriented creation and management of Plans to be the most attractive feature of Design Studio and one of the most powerful elements of the Sagent tool set. (See Figure 1.)

Once the data is safely in the data mart, you build MetaViews to let different users or groups see various subsets or combinations of the data. Some of these MetaViews might involve fairly complex queries, aggregations, joins, and manipulations of the data, so there is a Plan associated with the transformation of BaseView data into a MetaView. For an information user, the Snap created by execution of the Plan that pulls data into the MetaView is the report.

From Data to Information

DC agents manage client connections to the Sagent environment. At the client, the user runs Information Studio to access authorized MetaViews and run Snaps. Power users may be given authority to develop their own MetaViews, Plans, and Snaps. Anyone who is authorized can share those developed by others. You can use a MetaView to develop other MetaViews and a Snap to develop other Snaps, either for one-time or continued use. Execution of a Snap generates a report in the stolid, unexciting grid format. Adding a Crystal Reports Transform will send the same output to a more attractive package.

More pulse-pounding excitement is available with the use of the newly released Sagent Analysis. Starting with a Plan, you can add on-the-fly dimensionalizing, converting the Snap into a "data cube" such as you would see in a classic OLAP environment. You can select rows or columns for aggregation, creating a "crosstab" report in which cells containing numeric data are totaled down and across. Adding more dimensions to the mix creates a drill-down report. While the analysis tool itself is limited to the aggregate functions of sum, count, min, max, average, and graphical comparison, you can create a Plan with complex mathematical functions to process the data before display.

But slicing and dicing is only part of the pleasure. Sagent Analysis contains charting capabilities that allow construction of any one of a score of graphical charts ı with the requisite bright colors, naturally. (See Figure 2.) You can create these charts, as with the other features of the Analysis tool, either on demand by the interested end user, or as part of the Plan so the report user never sees the bland grid. While the concept of a data cube is a little difficult for some people to grasp at first, it will be easy for anyone who has seen a data cube using familiar data to become adept at creating effective, informative crosstabs and colorful charts.

Untangling the Tangled Web

In a way, the most complimentary thing I could say about Sagent WebLink is that it makes accessing the Sagent reporting and analysis tools via a Web server look easy ı almost the same, in fact, as reports run from the Analysis client environment. Sagent WebLink adds another logical tier to the architecture. In this case, the tier is the intranet or Internet server, which can be completely independent of both the agent servers and the data mart database servers.

Sagent WebLink connects the userıs Web browser to the DC agent, the Sagent Repository, and the data against which reports are run. You click a hypertext link or button to run a report. WebLink fetches the controlling data from the Repository, calls the appropriate DC agents, marshals the results of the execution of the Plan or Snap, and sends the result back to the userıs browser as an HTML page. What the user sees, in other words, is the Snap from the actual execution of the Plan at that moment, not merely a prebuilt report. WebLink generates ActiveX controls for the browser, which means the user can perform some interactive analysis without the request having to be passed back to the server; for example, you can dynamically convert the display of a Snap to a grid, a crosstab, or a chart. (If the browser does not support ActiveX, WebLink generates non-dynamic HTML pages.) Sagent also provides a design-time control library, which can be downloaded from the companyıs Web site to allow customization of the appearance of Web-delivered reports. Using the Sagent tools and a Web page design package, you could, for example, create Web-enabled reports that conform to a company standard and do not show that they were generated by the Sagent tool.

To Sagent or Not to Sagent?

No tool, no matter how good, will work for everybody. But a product like this one can make a lot of folks happy. Could some of those happy people be at your place of business?

It is becoming harder and harder to ignore the rising mountain of facts in favor of data mart development. Well-directed data mart projects, firmly focused on one or two critical business questions, usually pay for themselves before the check to the vendor even hits the expense side of the ledger. The political victory can be just as quick and dramatic; on production day one, an executive VP sees the first real information reporting the IS department has ever produced and makes a paydirt decision based on something other than a gut feeling. Eliminating most of the tool integration work ı which Sagent has done ı makes a data mart project a relatively low risk. When you add the potential for giving nontechnicians who are willing to learn the data environment some of the simpler extraction, loading, and reporting tasks, the Sagent Solution begins to look like a viable way of dissolving some of that stubborn development backlog.

A few points, however, deserve sober consideration. Sagent is strictly a Windows NT-based tool. The company is working toward a Unix-hosted product, but it has not set a delivery date. Your IS organization ought to think hard before taking the NT leap if it has no other reason than the Sagent tools. A second point is scalability. I was not able to test performance on a multimillion-row data cube, but I would want to do so before bringing Sagent on board. Finally, for those of us who have not yet given up on the holy grail of enterprise data integration, products like this one are not going to help. Itıs not Sagentıs fault; no company can be responsible for how buyers use its product. But the Sagent Solution neither forces nor even encourages development of conformed dimensions across data marts. I suspect that the effect of distributing several copies of this toolset throughout an organization will be injection into the OLAP world of the application stovepipes that have bedeviled interprocess data sharing in the OLTP environment. In fact, with a lot of demitechnicians carving analysis niches for themselves on behalf of competitive executives, OLAP reporting could be more disintegrated than the mainframe world ever was!

That last point, however, is just another illustration of the fact that tools cannot manage themselves. What the Sagent Data Mart Solution can do is enough to make it worth a second look.



Figure 1. Sagent Data Mart Design Studio.


Figure 2. Sagent Data Mart Design Studio.


Tom Couchman is a data warehouse technical architect at J.D. Edwards in Denver. His special areas of expertise are data analysis and architecture. You can reach Tom at thomas_couchman@jdedwards.com.
What did you think of this article? Send a letter to the editor.


Subscribe to DBMS -- It's free for qualified readers in the United States
July 1998 Table of Contents | Other Contents | Article Index | Search | Site Index | Home

DBMS (http://www.dbmsmag.com)
Copyright © 1998 Miller Freeman, Inc. ALL RIGHTS RESERVED
Redistribution without permission is prohibited.
Please send questions or comments to dbms@mfi.com
Updated June 5, 1998