
During the 1957 World Series, Yankee catcher Yogi Berra noticed that Hank Aaron was holding the bat the wrong way. "Turn it around," whispered Berra, "so that you can see the trademark!" Aaron kept his eye on the mound and replied, "Didn't come up here to read. Came up here to hit."
Those of us who have developed data marts can relate to getting bogged down in how we're doing things rather than what the end result should be. And we've all worked on projects where there's someone in the dugout poring over strategy and suggesting revisions to the game. Trouble is, by then it's usually too late; the ball's in motion and the bullpen's empty. Scoping data marts correctly up front and using the results as input into project planning and staff recruiting represent an essential step toward successful data mart delivery.
We've all seen the signs of a project in trouble. The project manager skulking glumly around the hallways, avoiding eye contact. The executive sponsor burning the midnight oil as she recalculates the budget for the sixth time. The development staff wearing smug, I-warned-them-but-they-didn't-listen smirks. And angry end users huddled in cubicles, muttering threats about outsourcing. Runaway projects and cost overruns are no longer the exclusive domain of enterprise data warehouses. As more companies adopt data marts to solve specific business intelligence problems, estimating timeframes, determining the necessary skill sets, and ensuring that the right activities make their way into the project plan have become guessing games, even for experienced practitioners.
Data marts have the same complexities as enterprise data warehouses. They are more targeted, involving more homogeneous data and fewer end users, but the implementation steps, their order, and their execution are every bit as critical. Scoping a data mart project guards against the same type of catastrophe that can befall an enterprise data warehouse or other large technology delivery project, and failures to plan these projects are the stuff of corporate folklore. Because the repercussions can be career-limiting, you can't afford not to plan a data mart scoping activity prior to beginning development.
Scoping is more than just "vision and rationale," two terms commonly used as placemarkers for the activity. I once heard a data warehouse consultant from a "Big 6" firm confidently proclaim, "A few interviews with managers and end-users, a vision statement, and we're off!" If the output of a scoping exercise is only a vision statement, then your data mart is already in jeopardy.
One of the most rampant misconceptions about scoping is that it is synonymous with requirements gathering. Scoping aims to determine the resources and timeframes necessary to complete the implementation. Requirements gathering, on the other hand, is a staple in any data warehouse project, large or small, and is part and parcel of development. You should be sure to separate scoping from the activities in the project plan. If scoping is done correctly, you'll have a good idea of how long the requirements gathering step should take, how long the overall development project will last, and what the optimal development skills are.
Indeed, scoping and project planning are also contiguous. While you cannot create a robust project plan without some amount of scoping, the inverse is not true. Scoping should precede the more tactical planning steps that determine explicit activities and skills, and should be the primary input in determining the overall data mart development schedule. (See Figure 1.)
If you know the business justification for your data mart but do not yet have the necessary technology infrastructure in place, scoping is crucial to deciding both the technology architecture and the specific products you will use to build the data mart. As with project planning, data mart technology selection should follow scoping. In short, scoping, if done correctly, will feed these and other data mart implementation processes.
Data marts address specific business problems. Unlike a data warehouse, a data mart has a business case that is solid before development begins, a user community that is known beforehand and usually concentrated in one or two departments, and business managers who have an application and often a specific front-end tool already in mind. In cases where a data warehouse or another data mart already exists, a data mart development team may be established and ready to go.
With requirements well understood, IT organizations are often tempted to consider scoping a mere formality. They retain outside consultants to conduct five days of canned interviews with key stakeholders, document their findings, and subsequently shelve the resulting report, declaring scoping a fait accompli.
However, scoping prior to development is actually more critical with data marts than with data warehouses. Where a data warehouse may cost millions or even tens of millions of dollars, thus permitting months of analysis and iterative requirements gathering, data marts are often expected to be up and running quickly. In fact, time is money when developing data marts. Despite this need for rapid deployment, neither the development team nor the end users can afford quick assumptions about functionality or requirement boundaries.
Likewise, many of the earliest and largest enterprise data warehouses involved months of data loading from various source systems, populating a centralized platform with heterogeneous data from multiple transaction systems, and then finding various applications to leverage that data. (Remember the well-worn aphorism of a few years ago, "If we build it, they will come?") With data marts, the inverse is true: The intended application drives the data requirements.
Quick delivery expectations mandate a discrete scoping activity, since misplaced expectations at the outset of a data mart development result in rework, or worse, a visit back in time to the original business case. Unlike an enterprise data warehouse, a half-completed data mart project is easy for an executive to dismiss and defund. Getting it right the first time takes on new meaning for data marts: The first time may be the only time.
Finally, as the data mart becomes successful, it can become a cross-functional system, assuming more atomic data and consequently evolving into a data warehouse. The more you use the data mart, the greater its value proposition to the business. Consequently, the initial scope should be sound because it will serve as the foundation for more complex decision support over time. Scoping a new data mart project supplies business knowledge and functional boundaries that you can supplement later, after the first data mart application has been rolled out to users. Unless you are redefining the overall intent of your data mart, subsequent scoping exercises should ideally serve as "building blocks" atop the initial scope, which establishes purpose, and lay out adjunct functionality. Iterative scoping helps answer questions (for example, "Should we include compensation analysis as part of the Sales Reporting data mart?") that you might not have considered during the initial scoping. Hence, the first scoping activity is the most important in your data mart's lifecycle.
No matter what methodology you use to implement your data mart, be sure to scope the five areas shown in Figure 1 separately to ascertain the amount of time, degree of skills, and delivery expectations necessary for each. Scoping each distinct step ensures that the activities within it are well defined, that there is no confusion between tasks, and that the project plan contains specifics.
Tactically, scoping mainly comprises interviews with key stakeholders for the data mart but can also include facilitated group sessions. Meet with the executive sponsor of the data mart project to identify these stakeholders. Before interviews or meetings, consider distributing to all stakeholders a general questionnaire that contains some core scoping questions. (See the sidebar.) The responses will become the basis of your initial scoping assumptions. They may also pinpoint potential disagreements about the intent of the data mart and suggest separate development phases or application prioritization.
In addition to managers and key business stakeholders, be sure you have access to experienced developers. DBAs, data administrators, and programmers are often forgotten during scoping, but they are your best sources for realistic opinions about data mart implementation time frames and functionality. A separate or addendum questionnaire given to technical staff can illuminate possible political or technical roadblocks.
Note that when performing scoping, it's easy to slide into beginning the project. For instance, when interviewing end users about their goals for the data mart, many business analysts make the common mistake of gathering actual functional requirements. A colleague of mine once conducted a series of scoping interviews with field salespeople at a telecommunications company eager for a new Sales Reporting data mart. Transcribing his notes after a full day of interviews, he realized these excited future users had spent all their time parading existing revenue reports past him rather than answering his questions about timeframe expectations and business drivers.
Remember the Golden Rule of Scoping: Scoping is a high-level evaluation of what you will need to implement a successful data mart, not the implementation itself.
Scoping Requirements Gathering: Scoping requirements gathering means understanding what it will take to collect and evaluate business requirements once the data mart project is launched. Requirements gathering involves not only talking to end users about desired functionality (the "low-hanging fruit" of requirements gathering) but also interviewing management and IT staff, and comparing the expectations of all three.
At its simplest, gathering requirements for a data mart can mean interviewing key users and following up with a facilitated session where everyone agrees on the requirements. More complex requirements gathering may involve several requirements sessions, cultural change initiatives, or data analysis. In extreme cases, requirements definition may even involve strategic alignment work or best practices research.
Whether simple or complex, consensus is key in requirements gathering. During scoping, the level of consistency in your interviews should clue you in on how long requirements gathering should last once the project begins.
A data mart mission statement can provide immediate insight into its original intent and is a good indicator of the complexity and duration of requirements gathering. If a mission statement exists, use it as a springboard for estimating the business complexity. If not, formulate one based on the interviews, and factor time into requirements gathering for the creation of a formal data mart mission statement. A good technique for scoping requirements is to gauge discrepancies in answers to the question: "What is the need or problem this data mart is intended to solve?"
Although the question itself is simple, the degree of variance in answers is key. Scoping requirements gathering should also involve a rough assessment of the number of data subject areas involved in the initial implementation. Most data mart product vendors recommend implementing only one subject area at a time; however, this is often unrealistic. For instance, the Sales Reporting data mart mentioned above involved two major subject areas: Sales Activity and Product Revenues.
Scoping Database Design: The most frequent and hazardous mistake made in scoping data mart projects is lumping together requirements and design. (Many an ambitious young data modeler has donned the hat of a business analyst with disastrous consequences!) This mistake inevitably shortchanges requirements gathering, because practitioners often regard database design as the more crucial step and prematurely begin modeling data before requirements are complete. In fact, requirements gathering and design are two separate steps needing distinctly different skill sets and timeframes. Because they are delivered separately, you should scope them separately.
The principal question in scoping data mart database design is: What (and who) will it take to translate business requirements into a physical database design? Because design methods differ, consider the following questions when estimating database design resources. Clearly, you will not be sure of the answers until well into implementation, but try factoring your degree of certainty into your estimates:
Beware: A "yes" answer to even a few of the preceding questions could add weeks to a seemingly straightforward design activity.
The best rule of thumb for estimating the duration of data mart design is simply to estimate the number of tables and gauge the design's projected complexity. Play it safe and involve an experienced data modeler in scoping the design.
Scoping Data Sourcing: Data sourcing is by far the most underestimated activity in data mart development projects and, once begun, is usually fraught with problems. Various factors determine what you will need to extract, transform, and load data into a data mart. The trouble is that most of these factors are unknown at the time of scoping, so it's up to you and a few experts to make some educated guesses. Using the following factors as guidelines during scoping can save time and go a long way to prevent, or at least mitigate, those unpleasant sourcing surprises. Some questions to consider when scoping data sourcing include:
If you are implementing an independent data mart that is not connected to an enterprise data warehouse, data sourcing will be much more complex and is likely to involve additional time. The participation of an astute DBA, data administrator, or transaction system authority can make all the difference when scoping data sourcing.
Scoping Data Delivery: Many of the best-known data warehouse implementation methodologies stop once the data is loaded onto the target platform. This means that most analysts performing scoping either forget about or severely underestimate front-end application effort. Be sure to include time for the design, development, and deployment of end-user applications based on the following factors:
If you have not yet designated a front-end query or reporting tool, factor in at least two additional weeks for application technology selection. Remember to add extra time for every additional organization involved in the decision.
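The allowance described above is simple arithmetic, and a minimal Python sketch may make it easier to carry into a pro forma timeline. The two-week baseline comes from the text; the article does not say how much extra time each additional organization costs, so `extra_weeks_per_org` is a hypothetical parameter (with a placeholder default) that you would set from your own experience.

```python
def tool_selection_weeks(organizations_involved, base_weeks=2.0, extra_weeks_per_org=0.5):
    """Estimate weeks to budget for front-end tool selection.

    base_weeks: the two-week minimum suggested in the text.
    extra_weeks_per_org: ASSUMPTION, padding added for each organization
        beyond the first; the value 0.5 is only a placeholder.
    """
    if organizations_involved < 1:
        raise ValueError("at least one organization must be involved")
    additional_orgs = organizations_involved - 1
    return base_weeks + additional_orgs * extra_weeks_per_org


# Example: three organizations share the tool decision.
weeks = tool_selection_weeks(3)
```

Treat the result as a floor rather than a forecast, since the text says "at least" two weeks.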
Scoping Administration and Support: If you've been to a data warehousing conference lately, you've heard the new battle cry: "A data warehouse is not just a technology, it's a process." The folks waving the process banner are the same people forgetting to scope data mart administration and support activities. While you will only know specific formulae for these activities after requirements and design are complete, you can register the questions during your scoping exercise to get an idea of the complexity of the activities as well as which maintenance processes are already in place. Consider asking:
You can add your own questions here, as they largely depend on your environment's technology infrastructure, organization, and existing support procedures.
Scoping a data mart should include developing staffing requirements. While this does not mean choosing the actual people to perform the work, it does mean understanding the skills that will be required and determining whether those skills are available inside the organization.
You should know the data mart project sponsor before selecting development staff. Ideally, the sponsor works on the business side, is in charge of funding decisions, has the power to hire and fire, acts as a tie-breaker, and can play the role of subject matter expert or participate in requirements definition when needed. The project sponsor is the interface between the development team, represented by the project leader, and upper management. She can even establish the cultural climate for the development team. The sooner she is appointed, the more solid the project scope will be.
Likewise, you may already have an idea of the business analysts, data modeler, and DBA for the data mart. Knowing actual resources is helpful, but resist assigning names to each activity until project planning takes place. What is important during data mart scoping is the identification of skill sets, not people.
As you interview subject matter experts during each phase of scoping, make a list of the ideal development skills for that phase. For example, as you pose questions about data sourcing, consider whether the sourcing activity will require expertise about a complex transactional system, whether the sourcing mandates custom extracts, and where the sourcing activity will most likely take place. Make a list of your assumptions, for example:
In this example, the foreseen qualifications to perform data sourcing include an understanding of detail data in the primary transaction system, knowledge of COBOL programming, ability to work either locally or remotely, and adequate writing skills.
At this early stage, most of your answers will only be guesses. The point is that getting them in writing forces the project manager and the project sponsor to consider skills before acquiring resources, to assess whether the skills are available, and to decide whether it's practical or cost-effective to retain external consultants rather than assigning permanent resources to short-term tasks.
In addition to estimating development skills, don't forget project management skills. Although the project manager is most likely already in place, note whether he should be more adept at administration and project planning or whether technical skills and development expertise are more important. Ask if he will work with corporate IT staff or with a business management steering committee. Given an estimated number of developers, can he alone track all the work? Answers to these questions can help to supplement the skills of the data mart project manager, one of the most important resources of all.
The primary deliverable of your scoping activity will be a Data Mart Scope document. As I mentioned before, this document will serve as a touchstone for other data mart development activities. In addition to being consulted by multiple functions, the document will serve as the unit of "sign-off," meaning that each of the subject matter experts and managers interviewed will sign the scoping document, signifying approval of its contents. This ensures everyone's buy-in and prevents quarrels from erupting during requirements gathering. Be sure to include a signature page at the end of the document, with plenty of space for written remarks or modifications.
At a minimum, a good data mart scoping document contains the following information:
Data Mart Mission Statement. It's not the mission statement that matters as much as the consensus required to generate one. If the political climate is mild and there is general agreement about the data mart's overall objectives, draft a mission statement and ask everyone to review it during the interviews. If people do disagree, facilitate a session in which a mission statement is drafted, reviewed, edited, and reworked by everyone. (Make sure the executive sponsor attends.)
Data Mart Success Metrics. This is a list of factors that answers the question, "How will you determine whether the data mart is successful?" The list should contain specific functionality, such as, "The data mart will provide online weekly revenues by district, region, and metropolitan area." It might also contain performance requirements ("The data mart should provide answers as fast or faster than our legacy system"), usage expectations ("The data mart should provide remote access capabilities to support Web-based queries"), or concurrency expectations ("As many as 10 business users should be able to access sales data with no discernible response time impact").
List of Specific Development Tasks. Record the key activities within the five phases described here by referring to your development methodology or past project plans. This list will be supplemented with detailed tasks during project planning, but it should be thorough enough to drive skills requirements.
List of Required Development Skills. List projected job roles and corresponding skills that will be confirmed during project planning. At minimum, the job roles should include:
Note that, in reality, one person may fill more than one role. Each role should be associated with a list of related skills, serving as a guide for initial staff selection. Pay special attention to unique needs that might not pertain to current skill sets, for example, "The business analyst should ideally have a sound knowledge of activity-based costing."
Pro Forma Timeline. Don't create a project plan or Gantt chart, but do estimate how long each phase might take, and add justification for your estimates. For example:
Be as thorough as possible while realizing that not until business requirements have been analyzed can you cement the project schedule.
Identified End Users. List the names and job functions of each end user interviewed in the scoping document. Alongside each name, note that user's top business need for the data mart. This will be valuable information during development, when the inevitable "who said what" questions come up.
Next Steps. Be sure to list next steps and exceptions in the scoping document. These can range anywhere from "begin interviewing data modelers" in the best case to "conduct additional interviews with field sales" in cases where disagreements remain over the boundaries or ultimate functionality of the data mart. The Next Steps section lists the specific tactics, so nothing is overlooked before development begins. Consider documenting the next steps in tabular format, including possible resolutions that may hinge on company policy or individual decisions. A Next Steps table (see Table 1) provides a quick look at high-impact "to do" items and can be used as a unit of discussion with the project manager and key stakeholders.
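The sections listed above can be pictured as a simple record, sketched here in Python. This is only an illustration of the document's shape, not a prescribed format: the class names, field names, and the `missing_sections` helper are all hypothetical, and the sample values are taken from examples elsewhere in this article.

```python
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class NextStep:
    """One row of a Next Steps table (names are illustrative)."""
    description: str
    possible_resolutions: List[str] = field(default_factory=list)


@dataclass
class DataMartScope:
    """Hypothetical container mirroring the scoping document sections."""
    mission_statement: str = ""
    success_metrics: List[str] = field(default_factory=list)
    development_tasks: List[str] = field(default_factory=list)
    required_skills: Dict[str, List[str]] = field(default_factory=dict)  # role -> skills
    pro_forma_timeline: Dict[str, str] = field(default_factory=dict)     # phase -> estimate and justification
    end_users: Dict[str, str] = field(default_factory=dict)              # name -> top business need
    next_steps: List[NextStep] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Return the names of sections that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


scope = DataMartScope(
    success_metrics=[
        "The data mart will provide online weekly revenues by district, "
        "region, and metropolitan area."
    ],
    next_steps=[NextStep("Research corporate policy on remote-access security.")],
)
unfinished = scope.missing_sections()  # sections to complete before sign-off
```

A check like `missing_sections` is one way to confirm that nothing has been left blank before the document goes out for signatures.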
A word about technology selection here: Be sure to document technology requirements once scoping has been completed, ideally before the project begins. Now that you know what business users expect, you'll have a much better idea of the tools and products needed to deliver the required functionality. It's a mistake to wait until staff has been hired to question the existing network protocol or to weigh Web browsing against network access. The scoping document should be the primary source for deciding which technologies best meet the data mart's overall delivery objectives.
In the lingo of quality assurance experts, scoping your data mart is a step closer to conformance to requirements. Scoping establishes stakeholders at the outset of the project, registers expectations, and represents a tangible collection of the data mart's business objectives. When questions about basic functionality, business drivers, and success criteria arise during development (and they will), the scoping document will provide the answers. A good scoping document rarely sits on the shelf; it accompanies managers to meetings, provides data modelers with business input for design sessions, and lies open on developers' desks. In short, a scoping document does not exist simply to be read but rather to be used as a tool again and again. And, like Hank Aaron's bat, scoping will guarantee you cover all the bases.

Figure 1. Scoping is a separate step.
Sidebar: The top 10 questions to ask when scoping your data mart.

1. What is the need or problem the data mart is intended to solve?
2. Are you getting this information in some other form (paper reports, standalone workstation, and so on) today?
3. What will be the first question you ask when the data mart is up and running?
4. What will you do with the information once you've acquired it?
5. What is more important to you: the information you retrieve or the way you see that information?
6. Are other users in your area or department interested in the same reports that you are?
7. How will you know whether the data mart has been successful?
8. What do you see as potential barriers to the success criteria?
9. Once the data mart meets the success criteria, what other business problems might it solve?
10. How will the data mart benefit your company as a whole?

Table 1. Next steps and possible resolutions.

| Next Step | Possible Resolutions |
|---|---|
| Decide whether finance will have access to the reports in phase I. | Provide finance with access to all reports. Provide finance with a subset of canned reports. Delay access until phase II of development. |
| Confirm current budget and company policy for retaining outside consultants. | Budget has been approved for outside resources. Signing of up to $25,000 for external consultants. Current freeze on external resources applies. |
| Research corporate policy on remote-access security. | Corporate policy mandates specific technology/firewall for remote access. Corporate policy denies remote access by certain job grades. Other? |
| Determine intraregional reporting regulations. | Discuss intraregional reporting restrictions with Sales VP. Determine whether data subsets or all data can be viewed by all regional sales staff. |