Friday, September 6, 2013

Informatica User Management using LDAP


LDAP Capabilities
·         LDAP: Lightweight Directory Access Protocol
·         Repository Server passes a user login to the external directory for authentication
·         The repository maintains an association between repository user names and external login names
·         Does not maintain user passwords in the repository
·         Helps in security & user management

Implementation

·         Built in plug-in in Informatica software
·         Set up a connection to an LDAP directory service and specify the users and groups that can have access to the PowerCenter domain
·         Import the user account information from the LDAP directory service into an LDAP security domain
·         Set a filter to specify the user accounts to be included in an LDAP security domain.
·         When a user logs in, the Service Manager authenticates the user name and password against the LDAP directory service.

Set up the connection to the LDAP server
  1. In the LDAP Configuration dialog box, click the LDAP Connectivity tab.
  2. Configure the LDAP server properties.



Configure Security Domains
Create New Security group and provide attributes for below options
·         User Search Base
·         User Filter
·         Group Search Base
·         Group Filter

Schedule the Synchronization Times
1.       On the LDAP Configuration dialog box, click the Schedule tab.
2.       Click the Add button (+) to add a new time.
3.       The synchronization schedule uses a 24-hour time format.
4.       You can add as many synchronization times in the day as you require. If the list of users and groups in the LDAP directory service changes often, you can schedule the Service Manager to synchronize several times a day.
5.       To immediately synchronize the users and groups in the security domains with the users and groups in the LDAP directory service, click Synchronize Now.
6.       Click OK to save the synchronization schedule.

Operational Steps-Adding user and configuring
1.       Sync LDAP using Admin console
2.       Grant permissions
3.       Create Informatica Folder if required
4.       Add user as owner of Informatica Folder
5.       Add group (from native security domain) and give RWX
6.       If user will be removed anytime, folder owner will be automatically set to Administrator. Other users can still see folder as group permission has given.

How to Encrypt Password in SQLPLUS


As you know, sometimes it’s needed to do some oracle operations via UNIX script pre/post Informatica load operations. At the same time, security is important too. If you just do ‘sqlplus user/password@database’, it’s a major security flaw. Anyone can see password in script. Also run ‘ps’ command on UNIX linux and find out database credentials.
In order to make sure password is encrypted, below is the right way to do this:
Step 1: Create hidden encrypted password file  (assuming password is ‘fx=120’)
echo fx=120 | crypt `hostname` > .encry.connect
Step 2: Use in script as below (sample script)
export ORACLE_SID=DATABASE NAME
export ORACLE_USER=USER
export ORACLE_PW=$(crypt `hostname` < encry.connect)
sqlplus -S /nolog <<-SQLCMDS
connect ${ORACLE_USER}/${ORACLE_PW}@${ORACLE_SID}
select * from dual ; <-- Give commands here
exit
SQLCMDS

Informatica - An Overview


Informatica:

Informatica is used as an ETL tool for extracting the source data and loading it into the target after applying the necessary transformations. It has a simple visual interface like forms in visual basic. Most of the work in Informatica includes just drag and drop with the mouse in the Designer.

It is self-documenting in the sense that it follows a graphical approach. Almost all the major databases can communicate with Informatica to transform data of any size. Informatica comes in two forms:

Informatica can do joins on tables of different databases on different servers, using Informatica Server and Server Manager is the client application which interacts with the server.

We need to design process flows for data extraction, transformation and load. These flows are known as mappings and they can be kept and scheduled to run anytime in the future as per the requirements.

 


 
ETL stands for:

         E-extract

        T-transform

        L- load

These are the three database functions that are combined into one tool to pull data out of one database and place it into another database.

  • Extract -- the process of reading data from a database.
  • Transform -- the process of converting the extracted data from its previous form into the form it needs to be in so that it can be placed into another database. Transformation occurs by using rules or lookup tables or by combining the data with other data.
  • Load -- the process of writing the data into the target database.

ETL is used to migrate data from one database to another, to form data marts and data warehouses and also to convert databases from one format or type to another.

 

Informatica tool has 3 basic components:

 

  1. Informatica PowerCenter Client Tools:

                 These are the development tools that are installed at the developer end. These are:

    • Used to design processes and is a client application (Designer)
    • Define run-time properties for a mapping, known as sessions (Workflow Manager)
    • Monitor execution of sessions (Workflow Monitor)
    • Used to work with the repository (Repository Manager)
    • Report Metadata (Metadata Reporter)

 

  1. Informatica PowerCenter Repository:

 

 The Repository contains all the metadata for the application. It actually is the database of Informatica and contains all the data and mappings etc.

 

 

  1. Informatica PowerCenter Server:

 

All the executions take place at the Server. It connects to the source, extracts the data, performs transformations and mappings and then loads the data to the target system.

I. Designer:

It consists of several tools:

Source Analyzer - (choose Sources menu) to import or create source definitions for flat file, COBOL, ERP, and Relational Databases. Note - double-click on the title-bar opens a pop-up to edit definitions.

  • Warehouse Designer - to import or create target definitions (choose Targets - Generate/Execute SQL, or Targets-Create to create manually).
  • Transformation Developer - to create reusable transformations.
  • Mapplet Designer - to create mapplets
  • Mapping Designer - to create mappings.

II. Server Manager:

Server Manager is a client application used to create and manage sessions and batches, and to configure session connections. You can monitor multiple Informatica Servers, navigate through folders and repositories. Here is what you can do in Server Manager:

  • Monitor, add, edit, delete Informatica Server info in the repository
  • Stop the Informatica Server
  • Configure database, external loader, and FTP connections
  • Manage sessions and batches - create, edit, delete, copy/move within a folder, start/stop, abort sessions, view session logs, details, and session performance details.


To make a mapping:

  • Switch to the Mapping Designer
  • Choose Mapping-Create - and enter a new name
  • Drag a source table from the navigator to the work space. Note, that the designer will also automatically create and show a Source Qualifier transformation.
  • Drag a target table to the work space
  • Drag one-by-one fields from source to target - thus creating graphical connections. Note - you can also delete connection by selecting it - and pressing DEL button.
  • Choose Layout-Arrange

 

Let’s see the explanation some of the major components:

 

  • Repository Manager: It has the following functions-

       Manage repository:

           Connections

           Folders

           Objects

           Users and groups

 

  • Repository: This is where all the metadata information is stored in the Informatica suite. The Power Center Client and the Repository Server would access this repository to retrieve, store and manage metadata.
  • Power Center Client: Informatica client is used for managing users, identifying source and target systems definitions, creating mapping and mapplets, creating sessions and run workflows etc.
  • Repository Server: This repository server takes care of all the connections between the repository and the Power Center Client.
  • Power Center Server: Power Center server does the extraction from source and then loading data into targets.
  • Designer: Source Analyzer, Mapping Designer and Warehouse Designer are tools reside within the Designer wizard. Source Analyzer is used for extracting metadata from source systems.
    Mapping Designer is used to create mapping between sources and targets. Mapping is a pictorial representation about the flow of data from source to target.
    Warehouse Designer is used for extracting metadata from target systems or metadata can be created in the Designer itself.
  • Data Cleansing: The Power Center's data cleansing technology improves data quality by validating, correctly naming and standardization of address data. A person's address may not be same in all source systems because of typos and postal code, city name may not match with address. These errors can be corrected by using data cleansing process and standardized data can be loaded in target systems (data warehouse).
  • Transformation: Transformations help to transform the source data according to the requirements of target system. Sorting, Filtering, Aggregation, Joining are some of the examples of transformation. Transformations ensure the quality of the data being loaded into target and this is done during the mapping process from source to target.
  • Workflow Manager: Workflow helps to load the data from source to target in a sequential manner. For example, if the fact tables are loaded before the lookup tables, then the target system will pop up an error message since the fact table is violating the foreign key validation. To avoid this, workflows can be created to ensure the correct flow of data from source to target.
  • Workflow Monitor: This monitor is helpful in monitoring and tracking the workflows created in each Power Center Server.
  • Power Center Connect: This component helps to extract data and metadata from ERP systems like IBM's MQSeries, PeopleSoft, SAP, Siebel etc. and other third party applications.
  • Power Center Exchange: This component helps to extract data and metadata from ERP systems like IBM's MQSeries, PeopleSoft, SAP, Siebel etc. and other third party applications.
  • Rep Server Administration Console: It is used to-

             Administer repositories on a Repository Server:

           Create/upgrade/delete

           Configuration

           Start/stop

           Backup/restore

 

1.    Code page: Informatica has a code page that contains encoding used to specify characters in a set of one or more languages. The code page is selected based on source of the data. For example if source contains Chinese text then the code page should be selected to support Chinese text. This influences the way that application stores, receives, and sends character data.

2.     Power Center Server on Windows can connect to following databases:

  • IBM DB2
  • Informix
  • Microsoft Access
  • Microsoft Excel
  • Microsoft SQL Server
  • Oracle
  • Sybase
  • Teradata
Mapplet: A mapplet is a reusable object that is created using mapplet designer. The mapplet contains set of transformations and it allows us to reuse that transformation logic in multiple mappings.