In Informatica , many times we face scenario where we have to concatenate values from two or more rows into a single row.
Consider a source which contains the below records:
ID Name
1 A
2 R
3 B
1 D
2 E
2 O
Suppose we want output like this:
ID Name
1 D,A
2 O,E,R
3 B
Above scenario can be implemented in Informatica as depicted below:
Steps:
1) Create source definition which contains the columns named ID and NAME
2) After source qualifier place a sorter and sort the data in ascending order . Here sort key will be ID and sort order will be ascending order.
3) After sorter put an expression transformation . In expression crate following fields in addition to the fields coming from sorter
v_NAME(variable) IIF(ID=v_ID,NAME||','||v_NAME,NAME)
v_ID(variable):ID
out_NAME(output field)=v_NAME
4) After expression place an aggregator . In aggregator group by will be on ID.
5) Then take the ID and NAME field to the target.
This blog is all about topics related to ETL , Data Warehousing,Informatica ,ETL Tools ,SAS and other topics which will be helpful to ETL developers.
Sunday, 4 September 2011
Saturday, 3 September 2011
Wednesday, 31 August 2011
Informatica Installation...
Informatica Power center 8.x Architecture...
Very good presentation on Informatica Powercenter 8.0....
Powercenter Informatica 8.0 (Datawarehousing and ETL Tool)
Powercenter Informatica 8.0 (Datawarehousing and ETL Tool)
Sunday, 28 August 2011
Creating User Defined Function in Informatica....
• In Informatica go to Tools ->User-defined Functions..
• Click on New… which is on right hand side of window.
• Fill the fields as shown below.
Public User Defined functions can be called from any transformation expression.
Private User Defined functions can be called only from other User defined functions.
• You can call this UDF using expression editor…
• Click on New… which is on right hand side of window.
• Fill the fields as shown below.
Public User Defined functions can be called from any transformation expression.
Private User Defined functions can be called only from other User defined functions.
• You can call this UDF using expression editor…
ETL Overview.......
ETL is very important process in Data Warehousing which involves following three major steps:-
Extraction :-
This is the first step in the ETL Process which involves extracting the data from the source systems. Source systems here refer to Relational Databases , Flat Files and many other systems like ERP, SAP, CRM etc. Thus extraction involves fetching the data from different sources and converting it to single consistent format so that it is ready for next stage which is transform.
Transform:-
This is second step in the ETL in which number of Business rules are applied to the extracted data so that data can be loaded in the required format in the target database. Transform is the most time consuming stage in ETL which requires a lot of processing and calculations need to be done. Some of the transformations normally done in transform stage are listed below:
• Joining data from multiple sources
• Aggregation
• Calculating new values from the source data values.
• Splitting a column in multiple columns
• Normalization and De- Normalization
• Cleaning the source data
• Translating source data as per business rules and requirements
Load:-
Load step takes the data from the transform step and loads it into the target which is mostly the Data Warehouse. Depending upon the business requirements this process may load overwrite the existing data or may maintain the historic data for analysis purpose. Load period also varies from daily ,weekly to monthly as per the requirements.
Subscribe to:
Posts (Atom)