Informatica FAQ, Part 3
Q: Assuming a workflow failure, does PowerCenter allow restart from the point of failure?
No. When a workflow fails, you can choose to start a workflow from a particular task but not from the point of failure. It is possible, however, to create tasks and flows based on error handling assumptions.
Q: What guidelines exist regarding the execution of multiple concurrent sessions / workflows within or across applications?
Workflow execution needs to be planned around two main constraints:
• Available system resources
• Memory and processors
The number of sessions that can run at one time depends on the number of processors available on the server. The Load Manager is always running as a process. As a general rule, a session is compute-bound, meaning its throughput is limited by the availability of CPU cycles. Most sessions are transformation-intensive, so the DTM always runs. Some sessions require more I/O, so they use less processor time. Generally, a session needs about 120 percent of a processor for the DTM, reader, and writer in total.
For concurrent sessions:
• One session per processor is about right; you can run more, but that requires a "trial and error" approach to determine what number of sessions starts to affect session performance and possibly adversely affect other executing tasks on the server.
• The sessions should run at "off-peak" hours to have as many available resources as possible.
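The rule of thumb above can be turned into a starting estimate. The following is a minimal sketch, assuming the 120 percent figure applies uniformly to every session; the function name estimate_session_slots is hypothetical, and the result is only a first guess to refine by trial and error.

```python
import math


def estimate_session_slots(processors: int, cpu_per_session: float = 1.2) -> int:
    """Return a starting estimate of how many sessions to run at once.

    cpu_per_session defaults to 1.2, i.e. the "about 120 percent of a
    processor" rule of thumb. This is an assumption, not a measured value.
    """
    if processors < 1:
        raise ValueError("processors must be at least 1")
    if cpu_per_session <= 0:
        raise ValueError("cpu_per_session must be positive")
    # Always allow at least one session, even on a single-processor server.
    return max(1, math.floor(processors / cpu_per_session))


if __name__ == "__main__":
    for cpus in (1, 4, 8):
        print(cpus, "processors ->", estimate_session_slots(cpus), "sessions")
```

Treat the number as a lower bound to start from, then add sessions one at a time while watching session performance and other tasks on the server.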
Even after the number of available processors is determined, it is necessary to look at overall system resource usage. Determining memory usage is more difficult than the processor calculation; it tends to vary according to system load and the number of Informatica sessions running.
The first step is to estimate memory usage, accounting for:
• Operating system kernel and miscellaneous processes
• Database engine
• Informatica Load Manager
• The DTM process, which creates threads to initialize the session, read, write and transform data, and handle pre- and post-session operations
• Additional memory allocated for lookups, aggregates, ranks, sorters and heterogeneous joins, in addition to the shared memory segment
At this point, you should have a good idea of what is left for concurrent sessions. It is important to arrange the production run to maximize use of this memory. Remember to account for sessions with large memory requirements; you may be able to run only one large session, or several small sessions concurrently.
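The budgeting described above can be sketched as follows. All memory figures and session names are hypothetical placeholders, not measurements, and the grouping strategy (largest sessions first, each placed in the first group with room) is one simple choice rather than anything PowerCenter prescribes.

```python
def memory_left_for_sessions(total_mb: int, reserved_mb: dict) -> int:
    """Subtract the fixed consumers (OS, database, Load Manager) from total memory."""
    return total_mb - sum(reserved_mb.values())


def plan_concurrent_groups(session_mb: dict, budget_mb: int) -> list:
    """Group sessions so that each group fits within the memory budget.

    Sessions are considered largest first; each goes into the first
    group that still has room, otherwise a new group is started.
    """
    groups = []  # each entry: {"free": remaining MB, "sessions": [names]}
    ordered = sorted(session_mb.items(), key=lambda item: item[1], reverse=True)
    for name, need in ordered:
        if need > budget_mb:
            raise ValueError(f"{name} needs {need} MB, more than the budget")
        for group in groups:
            if group["free"] >= need:
                group["free"] -= need
                group["sessions"].append(name)
                break
        else:
            groups.append({"free": budget_mb - need, "sessions": [name]})
    return [group["sessions"] for group in groups]


if __name__ == "__main__":
    # Hypothetical server: 8 GB total, with estimated fixed consumers.
    budget = memory_left_for_sessions(
        8192, {"os": 1024, "database": 2048, "load_manager": 128}
    )
    # Hypothetical per-session estimates (DTM plus caches), in MB.
    sessions = {
        "s_big_agg": 4000,
        "s_lookup_a": 1500,
        "s_lookup_b": 1500,
        "s_small": 500,
    }
    print(budget)
    print(plan_concurrent_groups(sessions, budget))
```

With these placeholder numbers the one large session dominates its group, which matches the point above: a single large session may leave room for only a small companion.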
Load Order Dependencies are also an important consideration because they often create additional constraints. For example, load the dimensions first, then facts. Also, some sources may only be available at specific times, some network links may become saturated if overloaded, and some target tables may need to be available to end users earlier than others.
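A dimensions-before-facts constraint is a dependency graph, so a valid run order can be derived with a topological sort. The sketch below uses Python's standard graphlib module (Python 3.9 or later); the session names are hypothetical.

```python
from graphlib import TopologicalSorter


def load_order(dependencies: dict) -> list:
    """Return one valid run order in which every session follows its prerequisites.

    dependencies maps each session name to the set of sessions that
    must finish before it starts.
    """
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    # Hypothetical sessions: two dimension loads, one fact load, one aggregate.
    deps = {
        "s_load_dim_customer": set(),
        "s_load_dim_product": set(),
        "s_load_fact_sales": {"s_load_dim_customer", "s_load_dim_product"},
        "s_load_agg_sales": {"s_load_fact_sales"},
    }
    print(load_order(deps))
```

Sessions with no path between them in the graph (here, the two dimension loads) are the candidates for concurrent execution, subject to the processor and memory limits discussed earlier.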
Q: Is it possible to perform two "levels" of event notification: at the application level, and at the Informatica Server level to notify the Server Administrator?
The application level of event notification can be accomplished through post-session e-mail. Post-session e-mail allows you to create two different messages, one to be sent upon successful completion of the session, the other to be sent if the session fails. Messages can be a simple notification of session completion or failure, or a more complex notification containing specifics about the session. You can use the following variables in the text of your post-session e-mail:
E-mail Variable    Description
%s                 Session name
%l                 Total records loaded
%r                 Total records rejected
%e                 Session status
%t                 Table details, including read throughput in bytes/second and write throughput in rows/second
%b                 Session start time
%c                 Session completion time
%i                 Session elapsed time (session completion time minus session start time)
%g                 Attaches the session log to the message
%m                 Name and version of the mapping used in the session
%d                 Name of the folder containing the session
%n                 Name of the repository containing the session
%a<filename>       Attaches the named file. The file must be local to the Informatica Server. Give the filename in angle brackets immediately after %a.
On Windows NT, you can attach a file of any type.
On UNIX, you can only attach text files. If you attach a non-text file, the send might fail.
Note: The filename cannot include the Greater Than character (>) or a line break.
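PowerCenter expands these variables itself when it sends the message. To preview how a message template will read, the substitution can be imitated locally. This is a minimal sketch: the helper preview_email is hypothetical, it handles only the text variables, and attachment variables such as %g and %a are left untouched.

```python
import re


def preview_email(template: str, values: dict) -> str:
    """Replace %-variables in a template with sample values.

    Variables that have no sample value are left as written, so
    attachment markers such as %g stay visible in the preview.
    """
    def substitute(match):
        token = match.group(0)
        return str(values.get(token, token))

    return re.sub(r"%[a-z]", substitute, template)


if __name__ == "__main__":
    template = (
        "Session %s finished with status %e.\n"
        "Rows loaded: %l, rows rejected: %r\n"
        "Elapsed: %i"
    )
    # Sample values only, not taken from a real run.
    sample = {"%s": "sInstrTest", "%e": "Completed", "%l": 1, "%r": 0, "%i": "0:00:10"}
    print(preview_email(template, sample))
```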
The PowerCenter Server on UNIX uses rmail to send post-session e-mail. The repository user who starts the PowerCenter server must have the rmail tool installed in the path in order to send e-mail.
To verify the rmail tool is accessible:
1. Log in to the UNIX system as the PowerCenter user who starts the PowerCenter Server.
2. Type rmail at the prompt and press Enter.
3. Type '.' to indicate the end of the message and press Enter.
4. You should receive a blank e-mail from the PowerCenter user's e-mail account. If not, locate the directory where rmail resides and add that directory to the path.
5. When you have verified that rmail is installed correctly, you are ready to send post-session e-mail.
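Before sending a test message, the path check in the steps above can also be scripted. This is a small sketch using Python's standard shutil.which; it inspects the PATH of whoever runs it, so it should be run as the user who starts the PowerCenter Server. The helper name find_tool is hypothetical.

```python
import shutil


def find_tool(name: str, path: str = None):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_tool("rmail")
    if location:
        print("rmail found at", location)
    else:
        print("rmail is not on the PATH; locate its directory and add it")
```

This confirms only that the tool is reachable, not that mail is actually delivered, so the manual test message is still worth sending.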
A post-session e-mail should look like the following:
Session complete.
Session name: sInstrTest
Total Rows Loaded = 1
Total Rows Rejected = 0
Completed
Rows Loaded   Rows Rejected   ReadThroughput (bytes/sec)   WriteThroughput (rows/sec)   Table Name   Status
1             0               30                           1                            t_Q3_sales
No errors encountered.
Start Time: Tue Sep 14 12:26:31 1999
Completion Time: Tue Sep 14 12:26:41 1999
Elapsed time: 0:00:10 (h:m:s)
This information, or a subset, can also be sent to any text pager that accepts e-mail.
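The elapsed time in the sample message is the completion time minus the start time. As a quick check, the two timestamps can be parsed and subtracted; the sketch below assumes the timestamp layout shown in the sample and an English (default) locale for day and month names.

```python
from datetime import datetime, timedelta

# Layout of the timestamps in the sample message, e.g. "Tue Sep 14 12:26:31 1999".
TIMESTAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def elapsed(start: str, completion: str) -> timedelta:
    """Return completion time minus start time for two message timestamps."""
    started = datetime.strptime(start, TIMESTAMP_FORMAT)
    finished = datetime.strptime(completion, TIMESTAMP_FORMAT)
    return finished - started


if __name__ == "__main__":
    print(elapsed("Tue Sep 14 12:26:31 1999", "Tue Sep 14 12:26:41 1999"))  # 0:00:10
```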
Backup Strategy Recommendation
