RDA Corporation

Enterprise Data Warehouse Provides Effective Platform for BI Analysis

Enterprise Data Warehouse Provides Effective Platform for BI Analysis

Project at a Glance

RDA developed a phased-in enterprise data warehouse (EDW) for a global membership organization that integrated several sales and marketing data sources into a single data warehouse, allowing for more efficient business intelligence analysis and reporting functions across the organization.

About Our Client

RDA’s client provides best practices research, decision support tools, and executive education to many of the world’s leading corporations and not-for-profit institutions. The organization’s research addresses issues related to corporate strategy, operations, and general management.

Background

Prior to this project, our client performed aspects of their sales and marketing business intelligence analysis using a diverse set of data sources (Siebel CRM, web portals, Excel, and flattened data from the organization’s existing operational data store (ODS)). The data was then merged as well as accessed directly to meet business reporting needs. The processing was very time-consuming, did not provide for historic trending, and did not yield an effective platform for business intelligence analysis.

Solution Detail

Our client embarked on a project to build an enterprise data warehouse (EDW) based on the industry standard dimensional modeling approach. Additionally, the EDW project was segmented into specific phases in order to prioritize the effort and deliver benefits at the earliest opportunity as it relates to their business cycle. The key project segments include incorporating:

  • Phase 1: Siebel CRM data into the EDW (opportunities, contacts, memberships, event registrations).
  • Phase 2: Web portal data into the EDW (downloads, searches, logins).
  • Phase 3: Human Resource data into the EDW (personnel actions, compensation).
  • Phase 4: Employee recruiting efforts into the EDW (job openings, requisitions, applicants, interview rounds).

The EDW processes the above business data into a set of conformed facts and dimensions providing a consistent, efficient, and robust data store. For each phase, both transactional and aggregated facts are created to support trending and KPI reporting. For reporting, our client leverages the Business Object (BO) platform to create various BO universes for the diverse set of internal business customers.

To implement the above EDW phases, ourclient engaged RDA to develop a new business intelligence framework to support extracting business data in a more timely fashion, reduce manual processing, and provide an expanded analysis capability. RDA leveraged SQL Server 2005 Integration Services (SSIS) to build a robust and flexible extraction, transformation, and load (ETL) framework for processing business data into facts and dimensions.

RDA’s expertise using SQL Server 2005 proved that its new capabilities provide numerous opportunities to meet business problems experienced by our clients. Integration Services’ improved programming paradigm for ETL processing allowed RDA to rapidly create an infrastructure for building and populating an enterprise data warehouse.

Benefits

EDW has provided our client with several business analysis capabilities, including:

  • Creating an historic data store of key business data events for business reporting.
  • Presenting aggregated summaries for critical business stakeholders.
  • Providing periodic comparables (snapshots) of data to support trending analysis.
  • Providing a centralized and standardized set of business data to support core business processes from sales to human resources.

Technically Speaking

The EDW effort follows the industry standard Kimball dimensional model methodology for enterprise data warehousing and includes the following design principles and approaches:

Extraction

  • Source data will be incrementally extracted based upon the last update date for each respective source system.

Facts

  • All records within a fact must be the same grain.
  • Textual fact values may be moved into their own dimension for efficiency purposes.
  • Miscellaneous facts may be moved to a “junk”. dimension for efficiency purposes.
  • Business keys may remain within a respective fact table (i.e. degenerate dimension).
  • Surrogate keys will substitute elements identified as dimensional values. Surrogate keys will be non-intelligent integers.
  • Bridge tables will be used when multiple dimension values relate to one fact.
  • Fact elements will retain their source system precision.

Dimensions

  • Most dimensions are Type 2.
  • Type 2 and Type 1 changes may exist within a dimension.
  • Snowflaking will be minimized.

With regard to SSIS, it provides a robust visual designer for orchestrating the various tasks involved with extracting, transforming, and loading data. The following picture illustrates SSIS’s orchestration capabilities (in this case, a type 2 dimension processing):

type 2 dimension processing
Step Sequence

  • Step 1: Retrieve the inserted or changed dimension rows from the staging table, compare the CHECKSUMS, and set the appropriate indicator for each row: IsNewRow, IsType1Change, IsType2Change, IsInferredMember.
  • Step 2: Route Rows: Directs rows to the appropriate output based on the indicators for new rows, type 1 changes, type 2 changes and inferred members.
  • Step 3: Row Counts.
  • Step 4a: Update Type 1 Changes.
  • Step 4b: Update Type 2 Changes.
  • Step 4c: Update Inferred Rows.
  • Step 5: Insert New Dimension Rows (both Type 2 and New).