Tuesday, March 16, 2010

Company Acquisition and Data Integration / Migration for EDW

Scenario

Company A purchases Company B. Company A needs to integrate Company B's operational systems into its Enterprise Data Warehouse and possibly many of A's data marts Company B's EDW will be retired. Company A and B have different application systems that do the same jobs: General Ledger, Order Management, Human Resources, etc.

Enterprise Data Warehouse

This blog will focus on the Enterprise Data Warehouse (EDW). There will be many more data stores that will need to be considered, but we'll start with this.

Planning

What needs to be planned for? What order do tasks need to be done in? What risk factors need to be considered?

  1. Understand Company A's source data architecture and metadata.
  2. Understand Company B's source data architecture and metadata.
  3. Company B may use a different database system (application software and database management system) than Company B does for the same business function. All businesses need to track their financial picture. Most will have some General Ledger (GL) application to do that. If A and B do not use the same GL application, we need to understand how B's system works and stores data and how that relates to A's system. We need to understand the metadata.

    To do this, we will need to work with Company B's subject matter experts (SME). We will need B's source system's entity relationship diagram(s) (ERD) and data dictionaries. We also need to know what data items are brought into the acquisition's EDW (assuming they have one). Just because they bring something into their EDW doesn't tell us why they bring it in. So we will need to understand their business requirements for bringing data into their EDW.

    So, the list of documents / information we have to get for the source system is:

    1. Database ERD (Physical Data Model (PDM))
    2. Data dictionary
    3. Source-to-target mappings from source database to their EDW
    4. Business rules for data brought to EDW. Why does the business need it?

  4. What data from A's source goes into A's EDW
    1. Why?
    2. How used?
    3. Transformations?
  5. What data from B's source goes into B's EDW
    1. Why?
    2. How used?
    3. Transformations?
  6. Gap analysis
    1. What data from A's source to EDW is missing from B's source to EDW?
    2. What data from B's source to EDW is missing from A's source to EDW
    3. Why does B need "extra" data?
    4. Why doesn't B use A's "extra" data?
    5. Why doesn't A use B's "extra" data?
  7. Data profiling
    1. Which tables and columns in B's source will be coming to A's EDW?
    2. Which tables and columns in A's source hold the same data as in B's source?
    3. Which column's in B's source have different data types, lengths, values, etc. than found in A's source.
  8. Data mart analysis (assume data marts are sourced from EDW)
    1. What data marts does B have that A doesn't have?
    2. What data marts does A have that B doesn't have?
    3. Will existing data marts in A support B's needs?
    4. What new data marts will A need to support B?
  9. Report analysis

Expand on these or add more.

No comments:

Post a Comment