Software Technology FAQs and Books

Here you can find information on information technology, software techology related faqs and books, documentation

2.10.05

tuning mapping-3

Optimize Joiner Transformations
Joiner transformations can slow performance because they need additional space in memory at run time to hold intermediate results.
Define the rows from the smaller set of data in the joiner as the Master rows. The Master rows are cached to memory and the detail records are then compared to rows in the cache of the Master rows. In order to minimize memory requirements, the smaller set of data should be cached and thus set as Master.
Use Normal joins whenever possible. Normal joins are faster than outer joins and the resulting set of data is also smaller.
Use the database to do the join when sourcing data from the same database schema. Database systems usually can perform the join more quickly than the Informatica Server, so a SQL override or a join condition should be used when joining multiple tables from the same database schema.
Optimize Sequence Generator Transformations
Sequence Generator transformations need to determine the next available sequence number, thus increasing the Number of Cached Values property can increase performance. This property determines the number of values the Informatica Server caches at one time. If it is set to cache no values then the Informatica Server must query the Informatica repository each time to determine what is the next number which can be used. Configuring the Number of Cached Values to a value greater than 1000 should be considered. It should be noted any cached values not used in the course of a session are lost since the sequence generator value in the repository is set, when it is called next time, to give the next set of cache values.
Avoid External Procedure Transformations
For the most part, making calls to external procedures slows down a session. If possible, avoid the use of these Transformations, which include Stored Procedures, External Procedures and Advanced External Procedures.
Field Level Transformation Optimization
As a final step in the tuning process, expressions used in transformations can be tuned. When examining expressions, focus on complex expressions for possible simplification.
To help isolate slow expressions, do the following:
1. Time the session with the original expression.
2. Copy the mapping and replace half the complex expressions with a constant.
3. Run and time the edited session.
4. Make another copy of the mapping and replace the other half of the complex expressions with a constant.
5. Run and time the edited session.
Processing field level transformations takes time. If the transformation expressions are complex, then processing will be slower. Its often possible to get a 10- 20% performance improvement by optimizing complex field level transformations. Use the target table mapping reports or the Metadata Reporter to examine the transformations. Likely candidates for optimization are the fields with the most complex expressions. Keep in mind that there may be more than one field causing performance problems.
Factoring out Common Logic
This can reduce the number of times a mapping performs the same logic. If a mapping performs the same logic multiple times in a mapping, moving the task upstream in the mapping may allow the logic to be done just once. For example, a mapping has five target tables. Each target requires a Social Security Number lookup. Instead of performing the lookup right before each target, move the lookup to a position before the data flow splits.
Minimize Function Calls
Anytime a function is called it takes resources to process. There are several common examples where function calls can be reduced or eliminated.
Aggregate function calls can sometime be reduced. In the case of each aggregate function call, the Informatica Server must search and group the data.
Thus the following expression:
SUM(Column A) + SUM(Column B)
Can be optimized to:
SUM(Column A + Column B)
In general, operators are faster than functions, so operators should be used whenever possible.
For example if you have an expression which involves a CONCAT function such as:
CONCAT(CONCAT(FIRST_NAME, ), LAST_NAME)
It can be optimized to:
FIRST_NAME || || LAST_NAME
Remember that IIF() is a function that returns a value, not just a logical test. This allows many logical statements to be written in a more compact fashion.
For example:
IIF(FLG_A=Y and FLG_B=Y and FLG_C=Y, VAL_A+VAL_B+VAL_C,
IIF(FLG_A=Y and FLG_B=Y and FLG_C=N, VAL_A+VAL_B,
IIF(FLG_A=Y and FLG_B=N and FLG_C=Y, VAL_A+VAL_C,
IIF(FLG_A=Y and FLG_B=N and FLG_C=N, VAL_A,
IIF(FLG_A=N and FLG_B=Y and FLG_C=Y, VAL_B+VAL_C,
IIF(FLG_A=N and FLG_B=Y and FLG_C=N, VAL_B,
IIF(FLG_A=N and FLG_B=N and FLG_C=Y, VAL_C,
IIF(FLG_A=N and FLG_B=N and FLG_C=N, 0.0))))))))
Can be optimized to:
IIF(FLG_A=Y, VAL_A, 0.0) + IIF(FLG_B=Y, VAL_B, 0.0) + IIF(FLG_C=Y, VAL_C, 0.0)
The original expression had 8 IIFs, 16 ANDs and 24 comparisons. The optimized expression results in 3 IIFs, 3 comparisons and two additions.
Be creative in making expressions more efficient. The following is an example of rework of an expression which eliminates three comparisons down to one:
For example:
IIF(X=1 OR X=5 OR X=9, 'yes', 'no')
Can be optimized to:
IIF(MOD(X, 4) = 1, 'yes', 'no')
Calculate Once, Use Many Times
Avoid calculating or testing the same value multiple times. If the same sub-expression is used several times in a transformation, consider making the sub-expression a local variable. The local variable can be used only within the transformation but by calculating the variable only once can speed performance.
Choose Numeric versus String Operations
The Informatica Server processes numeric operations faster than string operations. For example, if a lookup is done on a large amount of data on two columns, EMPLOYEE_NAME and EMPLOYEE_ID, configuring the lookup around EMPLOYEE_ID improves performance.
Optimizing Char-Char and Char-Varchar Comparisons
When the Informatica Server performs comparisons between CHAR and VARCHAR columns, it slows each time it finds trailing blank spaces in the row. The Treat CHAR as CHAR On Read option can be set in the Informatica Server setup so that the Informatica Server does not trim trailing spaces from the end of CHAR source fields.
Use DECODE instead of LOOKUP
When a LOOKUP function is used, the Informatica Server must lookup a table in the database. When a DECODE function is used, the lookup values are incorporated into the expression itself so the Informatica Server does not need to lookup a separate table. Thus, when looking up a small set of unchanging values, using DECODE may improve performance.
Reduce the Number of Transformations in a Mapping
Whenever possible the number of transformations should be reduced. As there is always overhead involved in moving data between transformations. Along the same lines, unnecessary links between transformations should be removed to minimize the amount of data moved. This is especially important with data being pulled from the Source Qualifier Transformation.
Use Pre- and Post-Session SQL Commands
You can specify pre- and post-session SQL commands in the Properties tab of the Source Qualifier transformation and in the Properties tab of the target instance in a mapping. To increase the load speed, use these commands to drop indexes on the target before the session runs, then recreate them when the session completes.
Apply the following guidelines when using the SQL statements:
• You can use any command that is valid for the database type. However, the PowerCenter Server does not allow nested comments, even though the database might.
• You can use mapping parameters and variables in SQL executed against the source, but not against the target.
• Use a semi-colon (;) to separate multiple statements.
• The PowerCenter Server ignores semi-colons within single quotes, double quotes, or within /* ...*/.
• If you need to use a semi-colon outside of quotes or comments, you can escape it with a back slash (\).
• The Workflow Manager does not validate the SQL.
Use Environmental SQL
For relational databases, you can execute SQL commands in the database environment when connecting to the database. You can use this for source, target, lookup, and stored procedure connection. For instance, you can set isolation levels on the source and target systems to avoid deadlocks. Follow the guidelines mentioned above for using the SQL statements.