This blog is all about topics related to ETL , Data Warehousing,Informatica ,ETL Tools ,SAS and other topics which will be helpful to ETL developers.
Wednesday, 31 August 2011
Informatica Installation...
Informatica Power center 8.x Architecture...
Very good presentation on Informatica Powercenter 8.0....
Powercenter Informatica 8.0 (Datawarehousing and ETL Tool)
Powercenter Informatica 8.0 (Datawarehousing and ETL Tool)
Sunday, 28 August 2011
Creating User Defined Function in Informatica....
• In Informatica go to Tools ->User-defined Functions..
• Click on New… which is on right hand side of window.
• Fill the fields as shown below.
Public User Defined functions can be called from any transformation expression.
Private User Defined functions can be called only from other User defined functions.
• You can call this UDF using expression editor…
• Click on New… which is on right hand side of window.
• Fill the fields as shown below.
Public User Defined functions can be called from any transformation expression.
Private User Defined functions can be called only from other User defined functions.
• You can call this UDF using expression editor…
ETL Overview.......
ETL is very important process in Data Warehousing which involves following three major steps:-
Extraction :-
This is the first step in the ETL Process which involves extracting the data from the source systems. Source systems here refer to Relational Databases , Flat Files and many other systems like ERP, SAP, CRM etc. Thus extraction involves fetching the data from different sources and converting it to single consistent format so that it is ready for next stage which is transform.
Transform:-
This is second step in the ETL in which number of Business rules are applied to the extracted data so that data can be loaded in the required format in the target database. Transform is the most time consuming stage in ETL which requires a lot of processing and calculations need to be done. Some of the transformations normally done in transform stage are listed below:
• Joining data from multiple sources
• Aggregation
• Calculating new values from the source data values.
• Splitting a column in multiple columns
• Normalization and De- Normalization
• Cleaning the source data
• Translating source data as per business rules and requirements
Load:-
Load step takes the data from the transform step and loads it into the target which is mostly the Data Warehouse. Depending upon the business requirements this process may load overwrite the existing data or may maintain the historic data for analysis purpose. Load period also varies from daily ,weekly to monthly as per the requirements.
Subscribe to:
Posts (Atom)