What do ETL Informatica and ETL Teamworker have in common?

Sometimes it's not easy to persuade a recruiter that you are the best candidate and possess all the main skills required by the customer, especially when the recruiter doesn't know much about your field. So your guess is right. The answer to the title question is simple: they have recruiters in common!

The call I had recently confirmed once again that some agents have no idea about the skills they are selling to their clients. Now I understand why we sometimes got people with a very different skill set from the one we asked for.

But back to the interview. The agent was polite and talkative (no surprise) and we spent quite a while talking about the potential project. One of the nice-to-haves was ETL Informatica. I tried to explain to her several times that although I don't have any real hands-on experience with Informatica, I have completed several training courses on it and had some time to play with it. Moreover, I have experience with other ETL tools. But she somehow didn't get it. At first I thought it was because of my English. As I'm not a native speaker, I sometimes have very creative wording. But then she asked me the question that explained everything: "So you have never done ETL Informatica?" The lady didn't differentiate between ETL and Informatica! She thought ETL Informatica was some kind of process or methodology that was to be used at her client.

I explained to her that ETL stands for Extract, Transform and Load and that Informatica is a vendor that sells software for ETL. I compared it to car manufacturers. You have Ford and Mercedes. Both make cars. In general, the cars are the same. If you can drive a car, you can drive a Ford and a Mercedes. (I apologise for the simplification. I know there is a difference between Ford and Mercedes.) The situation in the ETL software market is similar. Informatica is one ETL software vendor. Then there is IBM with DataStage, Oracle with Warehouse Builder and Data Integrator (this probably confused her slightly) and several others. If I have experience with three or more ETL tools, the chance that I would understand another ETL tool is very high. The lady seemed satisfied, so hopefully she got the main difference between ETL and Informatica.

Informatica 7.1: Points to ponder with code pages

The problem:

The source data is stored as ISO-8859-1 and we use Informatica for ETL. The data actually comes from 11 character sets. Because ISO-8859-1 is a single-byte encoding, the same stored code can stand for different text depending on the character set it came from. For example, a code x may mean "man" to a Japanese reader and "woman" to a Korean reader, which makes the Japanese and Korean data in the database ambiguous.

At present the target is also ISO-8859-1, so the same x is passed through to the target unchanged. Users there have their own language set on their systems, so a Korean user who looks at x reads "woman" and a Japanese user reads "man".

Now we are going to make the target database UTF-8 and run Informatica in Unicode mode. The source system is still ISO-8859-1 and, as noted above, may hold the same code for text from different character sets. UTF-8, however, has a unique code for each character, so when we load into the target we need to identify what the end user expects and deliver exactly that data.

If the end user is Korean, the value was x in the earlier setup, but UTF-8 assigns a unique code to each character. So before loading we need to tell Informatica which character set the data is in; the target itself supports all the character sets and gives each character a unique code.
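The ambiguity can be seen in a few lines of Python. This is only an illustrative sketch: the two sample bytes and the codec names (`euc_kr`, `shift_jis`, `latin-1`) are assumptions picked for the demonstration, not values taken from the system described here.

```python
# Illustrative only: the sample bytes and the list of code pages are assumptions.
SAMPLE = b"\xb0\xa1"  # two bytes as they might sit in a single-byte database column


def decode_under(raw: bytes, code_pages: list[str]) -> dict[str, str]:
    """Decode the same raw bytes under each code page and return the results."""
    # errors="replace" keeps the demo from raising on byte sequences
    # that are invalid in a given code page.
    return {cp: raw.decode(cp, errors="replace") for cp in code_pages}


readings = decode_under(SAMPLE, ["euc_kr", "shift_jis", "latin-1"])
for code_page, text in readings.items():
    print(f"{code_page:>10}: {text!r}")
```

The stored bytes never change; only the reader's code page decides which text appears. That is why the character set has to be known before the data is loaded into a UTF-8 target.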

My intention:
We need to decide whether to do the conversion while the sessions run, or to do a one-time conversion to a UTF-8 flat file and then load that into the target.
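For the one-time flat file route, the conversion itself could look roughly like the Python sketch below. It is not an Informatica feature; the function name `convert_flat_file`, the file names and the Shift_JIS sample data are all hypothetical, and it assumes the whole file is in one known character set.

```python
import tempfile
from pathlib import Path


def convert_flat_file(src: Path, dst: Path, source_encoding: str) -> None:
    """Re-encode a text file from source_encoding to UTF-8, line by line."""
    # newline="" leaves the original line endings untouched.
    with open(src, "r", encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)


# Small demonstration with a made-up Shift_JIS extract in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src_path = Path(tmp) / "extract_sjis.txt"  # hypothetical file name
    dst_path = Path(tmp) / "extract_utf8.txt"  # hypothetical file name
    src_path.write_bytes("1|\u7537\n2|\u5973\n".encode("shift_jis"))
    convert_flat_file(src_path, dst_path, "shift_jis")
    converted = dst_path.read_bytes()

print(converted.decode("utf-8"))
```

A file that mixes several character sets would need to be split by character set first, since a single `source_encoding` is applied to every line.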

I know the problem may seem hazy, so let's make it a little clearer before giving the solution.

A database named ABCD is defined to support only one character set (ISO-8859-1), but it is being populated with data from multiple character sets such as SJIS, Big5, GB2312, etc. We accept that the data is ordered according to ISO-8859-1.

As time passes, ABCD will accumulate text in multiple languages and multiple character sets, and it becomes hard to identify which language and character set a given piece of text belongs to. The UTF-8 encoding of Unicode keeps any current text in US-ASCII unchanged (the vast majority of our text data) but stores data from other character sets in 2-, 3-, or 4-byte units.
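A quick Python check illustrates the variable width of UTF-8. The sample characters are arbitrary choices for the illustration.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


samples = [
    "A",           # US-ASCII letter
    "\u00e9",      # e with acute accent (Latin-1 range)
    "\u7537",      # CJK ideograph
    "\U0001F600",  # character outside the Basic Multilingual Plane
]
lengths = {ch: utf8_length(ch) for ch in samples}

# Pure ASCII text produces exactly the same bytes in UTF-8 as in ASCII.
ascii_unchanged = "plain ASCII text".encode("utf-8") == "plain ASCII text".encode("ascii")

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X}: {n} byte(s)")
print("ASCII unchanged:", ascii_unchanged)
```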

Now there is a requirement to transfer data from ABCD to another database named EFGH, which is in Unicode. So we need to be able to identify the character set of every text string. Let's assume we have done that as well.
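Once each string's real character set is known, the recovery step can be sketched as below. This is plain Python, not what Informatica does internally. It assumes the client hands back each stored byte as one ISO-8859-1 character; `to_unicode` and `as_stored` are hypothetical helper names and the sample rows are made up.

```python
def to_unicode(stored: str, real_charset: str) -> str:
    """Recover text whose bytes were stored as if they were ISO-8859-1.

    Encoding back to latin-1 restores the original bytes; decoding those
    bytes with the identified character set yields the intended text.
    """
    return stored.encode("latin-1").decode(real_charset)


def as_stored(text: str, charset: str) -> str:
    """Simulate how text in `charset` looks when read back as ISO-8859-1."""
    return text.encode(charset).decode("latin-1")


# Each row pairs the stored string with the character set identified for it.
rows = [
    (as_stored("\u7537", "shift_jis"), "shift_jis"),
    (as_stored("\u4e2d", "big5"), "big5"),
    (as_stored("\u4e2d", "gb2312"), "gb2312"),
]

repaired = [to_unicode(stored, charset) for stored, charset in rows]
utf8_payloads = [text.encode("utf-8") for text in repaired]  # bytes for the Unicode target
print(repaired)
```

Note that the second and third rows hold different bytes in ABCD but end up as the same character in the Unicode target, which is the point of moving to one encoding.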

The question is how to perform that data transfer with INFA 7.1.

That can be done with INFA. Just keep the following things in mind.

1/ Check the character set of your source database ( select * from nls_database_parameters )
2/ Check the character set of your target database ( select * from nls_database_parameters )
3/ Check which data movement mode has been set for the Informatica Server that you assign in your workflow.
( Go to the config file you pass when starting the Informatica Server on UNIX )

# Determine one of the two server data movement modes: UNICODE or ASCII.
# If not specified, ASCII data movement mode is assumed.
# ASCII: PowerServer processes single-byte characters and does not perform code page conversion.
# UNICODE: Processes 2 bytes for a character. Enforces code page validation.

Set it to Unicode; only then will end users see the full data, otherwise the data will be corrupted.
If you change the setting, restart the Informatica Server service afterwards.
If you have set all of these correctly, there is nothing to worry about. Users should see data according to their locale.
You may face some LM_ errors while loading data through INFA. In that case, get back to me with the relevant portion of the error log, such as:
MAPPING> CMN_1569 Server Mode: [UNICODE] CMN_1570 Server Codepage: [ISO
MAPPING> 8859-1 Western European]
If needed, disable code page validation.

