August 10, 2021

The process of building a successful Datawarehouse

There are so many ways to building your Data warehouse, choosing the right way is always a difficult task. Here is your end to end solution to building a data warehouse.

Written by

Harsh Gupta

The process for building an enterprise data warehouse begins with identifying its business drivers, or causes that motivate the compilation of information in a repository. Critical success factors (CSFs) are derived from these drivers and represent the areas within an organization’s environment where improvements can be made to enhance performance, profitability, and growth.

To begin building an enterprise data warehouse and achieve its goals, you need to know what information sources are available within your organization’s environment. These sources may include corporate databases, online transaction processing (OLTP) modules, management information systems (MIS), electronic data interchange (EDI) transmissions and legacy systems; they also may include external data sources such as financial market intelligence, demographic information and even employee sentiment.

To determine what types of information sources are available within your organization’s environment, you need to select a conceptual model for your enterprise data warehouse based on the goals you have set for it. Selecting the correct organizational model is an important decision that requires careful analysis. There are two basic models from which to choose: transactional or hierarchical. These models can be extended with additional functionality, resulting in more hybrid EDW implementations.

The major differences between the two models relate to how the information in them is organized. In a hierarchical model, data is organized according to a target or source business entity. The hierarchical relationships are defined by the hierarchy of the enterprise’s key business processes (e.g., selling -> order entry -> billing), which will typically define parent/child relationships between key business entities.

The transactional model contains information related to interrelated transactions in real time and can easily be identified as the blueprint for continuous business processes such as manufacturing automation, accounting and human resources management. In this environment every transaction that occurs within an organization has an associated unique identifier used to “tie” data from different sources together for analysis and reporting purposes. This enables all events pertaining to one object to be linked through time using these identifiers, regardless of their location or data format.

The next step is to develop a business case for your EDW, which defines the project scope and describes the implementation steps for success. The business case should describe why you are building an enterprise data warehouse in general and within your organization specifically.

A good place to start a discussion about implementing an EDW is by using existing organizational needs/goals as the key criteria in building a business case for it. Understanding your business’s goals will help you determine what information sources are available to support its progress toward achieving those goals and how they can be used in an EDW environment. For example, if one of your organization’s performance indicators is “expanding into new markets”, there may be data sources that will impact this goal: call center logs, retail sales records or even weather-related data from local government agencies.

After you have gathered and analyzed all available information, it is time to develop a data-warehouse strategy that reflects your organization’s goals and objectives for implementing an EDW. An organization’s needs should be modeled as broadly as possible to ensure that the results of the analysis will support ongoing business initiatives. The following steps can help you to determine which model provides the best fit for your company’s requirements:

* Determine what types of information are available within your environment (e.g., call center logs) and how they are interrelated;

* Determine how those data sources will be represented in your EDW (e.g., a relational database);

* Determine the key entities and relationships that represent them (e.g., account, transaction, order);

* Build a logical model reflecting the information requirements for your business;

* Verify the elements of each diagram against actual business needs; and

* Perform any necessary modifications for compliance with corporate standards. The logical model is usually defined at a high level, showing only major data objects and their relationships:

This design reflects only part of the organization’s complete view of data while focusing on the specific subject areas most relevant to its needs. These concepts are then used to begin building an enterprise-wide repository containing all the company’s important data, regardless of its format or location. The model should be conceptually sound and traceable to the organization’s detailed data-management requirements in terms of:

* Data content (e.g., format);

* Data sources;

* Business rules and constraints; and

* Implementation options for each element of the logical design. A variety of modeling techniques can be used to build an EDW. Depending on your organization’s goals, you may consider various levels of complexity when deciding which one is best: basic, implemented or complete. Each level reflects a different focus on information models that will ensure successful implementation projects within your business environment:

Basic model — This type conveys major information areas and relationships, but does not go into great detail about any particular element. It becomes the basis for determining what information needs to be documented in greater detail, or for building a later version of the model with more refined information:

Implemented model — This type uses more thorough documentation that describes all data entities and elements within an entity as well as their relationships:

Complete model — This type details every aspect of your business’s structures and processes, while also providing additional contexts such as individuals involved. The models can be used at various stages in the implementation process of an EDW to establish best-practices guidelines and objectives appropriate for each step. Here is one example how these requirements can be tailored to specific project phases throughout its lifecycle:

  • Step 1 — Goals elicitation;
  • Step 2 — Conceptualization and platform selection;
  • Step 3 — Business case and project roadmap;
  • Step 4 — System analysis and data warehouse architecture design;
  • Step 5 — Development and stabilization;
  • Step 6 — Launch, including the transition to production support for ongoing operations. This methodology ensures that you fully understand all of your organization’s information requirements before attempting a full-featured implementation plan. It allows for more accurate planning during each phase of development from concept to completion.

Arrow left icon.
Return to Blog