In data warehouse design, data profiling the data source is the first and the most important step to understand the data source. This post will show you how to get the SAP ERP metadata information and how to manually integrate this metadata information, such as Business Name, Key Flag, into the Informatica IDQ data profiling result.
Data Profiling with Informatica IDQ
Informatica Data Quality (IDQ) is a great tool you can use for the data profiling. After creating the connection to SAP ERP database (Oracle, SQl Server or others), you will able to import the SAP tables and create the Data Profiling for he tables that you are interested (see IDQ user guide for the detail). The default data profiling setting and the result will fulfill all or at least most of the data warehouse design needs. Below is an example of the data profiling result for the table T001 (Company).
You can export the the data profiling result into an excel file. The first worksheet of the excel file is named “Column Profile” and it looks like this:
Everything looks good there, except a big problem: it is not convenient to alignment up with the business names for the data analysis during the data warehouse design phrase. The SAP ERP physical table and column names is always a big headache for me to remember and mapping to their corresponding business names. We need to find a efficient way to get the SAP ERP tables and columns metadata and integrate such metadata information such as Business Name and Key Flag, into the the IDQ data profiling result. That will help us quickly understand the SAP source tables and make better decisions for data warehouse design.
SAP ERP Metadat
SAP ERP system stores the metadata information in several metadata tables. For the data profiling purpose data for warehouse design, the following metadata table will provide us all the information we need:
- DD02L — SAP Tables
- DD02T — SAP Table Texts
- DD03L — Table Fields
- DD03T — Texts for fields (language dependent)
- DD04L — Data elements
- DD04T — Data element texts
The following SQL queries will help us to find the table and column metadata information we need:
Table Matadata Query:
SELECT DD02L.TABNAME , DD02T.DDTEXT TAB_DESC
FROM DD02L, DD02T
WHERE DD02L.TABNAME = DD02T.TABNAME
AND DD02T.DDLANGUAGE = ‘E’
AND DD02L.TABNAME =’&&Table_Name’;
Column Matadata Query:
SELECT DD03L.TABNAME, DD02T.DDTEXT TABDESC, DD03L.FIELDNAME, DD04T.DDTEXT AS FIELDDESC, DD03L.KEYFLAG, DD03L.ROLLNAME, DD03L.POSITION, DD03L.CHECKTABLE , DD03L.NOTNULL, DD03L.DATATYPE, DD03L.LENG, DD03L.DECIMALS
FROM DD03L, DD02T, DD04T
WHERE DD04T.DDLANGUAGE = ‘E’
AND DD02T.DDLANGUAGE = ‘E’
AND DD03L.TABNAME = DD02T.TABNAME
AND DD03L.ROLLNAME = DD04T.ROLLNAME
AND DD03L.TABNAME = ‘&&Table_name’
ORDER BY DD03L.TABNAME, DD03L.POSITION;
Of course, you need to substitute the binding variable &&Table_name with the actual table name that you are interested.
Integrate the IDQ Data Profiling Result with the SAP Metadata
Since that we have the IDQ data profiling result (excel file) and the SAP metadata info from the SAP Column metadata query. We can manually integrate those two data sets together to provide a convenient way to analysis the SAP data source and make the design decision.
Metadata Integration Steps:
- Open the saved IDQ data profiling result excel file.
- Run the Column Matadata Query and add the query result data set (Export or Copy/Paste) into a new worksheet named “Metadata”.
- In the “Metadata” worksheet, select columns FIELDNAME, FIELDDESC, KEYFLAG, CHECKTABLE, and create a new range name called “COL_METADATA” using the Excel Name Manager (see below).
- Move to the first worksheet named “Column Profile”, add three (or more more if you want to add more metadata info) new columns in beginning (new columns becomes A, B and C). and add the column headers as “Business Name”, “Key Flag”, “Lookup Table” for the cells A8, B8 and C8, respectively.
- Enter the following excel formula in the cells A9, B9, C9, respectively.
- Verify they return the expected values. Then select the three cells A9, B9, C9 (and hold the mouse) and drag down to all the applicable rows. you will get the desired metadata information.