data warehpusing interview qa-2
10. What are the modules/tools in Business Objects? Explain theier purpose briefly?
Ans: BO Designer, Business Query for Excel, BO Reporter, Infoview,Explorer,WEBI, BO Publisher, and Broadcast Agent, BO
ZABO).
InfoView: IT portal entry into WebIntelligence & Business Objects.
Base module required for all options to view and refresh reports.
Reporter: Upgrade to create/modify reports on LAN or Web.
Explorer: Upgrade to perform OLAP processing on LAN or Web.
Designer: Creates semantic layer between user and database.
Supervisor: Administer and control access for group of users.
WebIntelligence: Integrated query, reporting, and OLAP analysis over the Web.
Broadcast Agent: Used to schedule, run, publish, push, and broadcast pre-built reports and spreadsheets, including event
notification and response capabilities, event filtering, and calendar based notification, over the LAN, e-
mail, pager,Fax, Personal Digital Assistant( PDA), Short Messaging Service(SMS), etc.
Set Analyzer - Applies set-based analysis to perform functions such as execlusion, intersections, unions, and overlaps
visually.
Developer Suite – Build packaged, analytical, or customized apps.
11.What are the Ad hoc quries, Canned Quries/Reports? and How do u create them?
(Plz check this page……C\:BObjects\Quries\Data Warehouse - About Queries.htm)
Ans: The data warehouse will contain two types of query. There will be fixed queries that are clearly defined and well understood, such as regular reports, canned queries (standard reports) and common aggregations. There will also be ad hoc queries that are unpredictable, both in quantity and frequency.
Ad Hoc Query: Ad hoc queries are the starting point for any analysis into a database. Any business analyst wants to know what is inside the database. He then proceeds by calculating totals, averages, maximum and minimum values for most attributes within the database. These are unpredictable element of a data warehouse. It is exactly that ability to run any query when desired and expect a reasonable response that makes the data warhouse worthwhile, and makes the design such a significant challenge.
The end-user access tools are capable of automatically generating the database query that answers any Question posed by the user. The user will typically pose questions in terms that they are familier with (for example, sales by store last week); this is converted into the database query by the access tool, which is aware of the structure of information within the data warehouse.
Canned queries: Canned queries are predefined queries. In most instances, canned queries contain prompts that allow you to customize the query for your specific needs. For example, a prompt may ask you for a School, department, term, or section ID. In this instance you would enter the name of the School, department or term, and the query will retrieve the specified data from the Warehouse.You can measure resource requirements of these queries, and the results can be used for capacity palnning and for database design.
The main reason for using a canned query or report rather than creating your own is that your chances of misinterpreting data or getting the wrong answer are reduced. You are assured of getting the right data and the right answer.
12. How many Fact tables and how many dimension tables u did? Which table precedes what?
Ans: http://www.ciobriefings.com/whitepapers/StarSchema.asp
13. What is the difference between STAR SCHEMA & SNOW FLAKE SCHEMA?
Ans: http://www.ciobriefings.com/whitepapers/StarSchema.asp
14. Why did u choose STAR SCHEMA only? What are the benefits of STAR SCHEMA?
Ans: Because it’s denormalized structure , i.e., Dimension Tables are denormalized. Why to denormalize means the first (and often
only) answer is : speed. OLTP structure is designed for data inserts, updates, and deletes, but not data retrieval. Therefore,
we can often squeeze some speed out of it by denormalizing some of the tables and having queries go against fewer tables.
These queries are faster because they perform fewer joins to retrieve the same recordset. Joins are also confusing to many
End users. By denormalizing, we can present the user with a view of the data that is far easier for them to understand.
Benefits of STAR SCHEMA:
• Far fewer Tables.
• Designed for analysis across time.
• Simplifies joins.
• Less database space.
• Supports “drilling” in reports.
• Flexibility to meet business and technical needs.
15. How do u load the data using Informatica?
Ans: Using session.
16. (i) What is FTP? (ii) How do u connect to remote? (iii) Is there another way to use FTP without a special utility?
Ans: (i): The FTP (File Transfer Protocol) utility program is commonly used for copying files to and from other computers. These
computers may be at the same site or at different sites thousands of miles apart. FTP is general protocol that works on UNIX
systems as well as other non- UNIX systems.
(ii): Remote connect commands:
ftp machinename
ex: ftp 129.82.45.181 or ftp iesg
If the remote machine has been reached successfully, FTP responds by asking for a loginname and password. When u enter
ur own loginname and password for the remote machine, it returns the prompt like below
ftp>
and permits u access to ur own home directory on the remote machine. U should be able to move around in ur own directory
and to copy files to and from ur local machine using the FTP interface commands.
Note: U can set the mode of file transfer to ASCII ( default and transmits seven bits per character).
Use the ASCII mode with any of the following:
- Raw Data (e.g. *.dat or *.txt, codebooks, or other plain text documents)
- SPSS Portable files.
- HTML files.
If u set mode of file transfer to Binary (the binary mode transmits all eight bits per byte and thus provides less chance of
a transmission error and must be used to transmit files other than ASCII files).
For example use binary mode for the following types of files:
- SPSS System files
- SAS Dataset
- Graphic files (eg., *.gif, *.jpg, *.bmp, etc.)
- Microsoft Office documents (*.doc, *.xls, etc.)
(iii): Yes. If u r using Windows, u can access a text-based FTP utility from a DOS prompt.
To do this, perform the following steps:
1. From the Start à Programs àMS-Dos Prompt
2. Enter “ftp ftp.geocities.com.” A prompt will appear
(or)
Enter ftp to get ftp prompt à ftp> àopen hostname ex. ftp>open ftp.geocities.com (It connect to the specified host).
3. Enter ur yahoo! GeoCities member name.
4. enter your yahoo! GeoCities pwd.
You can now use standard FTP commands to manage the files in your Yahoo! GeoCities directory.
17.What cmd is used to transfer multiple files at a time using FTP?
Ans: mget ==> To copy multiple files from the remote machine to the local machine. You will be prompted for a y/n answer before
transferring each file mget * ( copies all files in the current remote directory to ur current local directory,
using the same file names).
mput ==> To copy multiple files from the local machine to the remote machine.
18. What is an Filter Transformation? or what options u have in Filter Transformation?
Ans: The Filter transformation provides the means for filtering records in a mapping. You pass all the rows from a source
transformation through the Filter transformation, then enter a filter condition for the transformation. All ports in a Filter
transformation are input/output, and only records that meet the condition pass through the Filter transformation.
Note: Discarded rows do not appear in the session log or reject files
To maximize session performance, include the Filter transformation as close to the sources in the mapping as possible.
Rather than passing records you plan to discard through the mapping, you then filter out unwanted data early in the
flow of data from sources to targets.
You cannot concatenate ports from more than one transformation into the Filter transformation; the input ports for the filter
must come from a single transformation. Filter transformations exist within the flow of the mapping and cannot be
unconnected. The Filter transformation does not allow setting output default values.
19.What are default sources which will supported by Informatica Powermart ?
Ans :
• Relational tables, views, and synonyms.
• Fixed-width and delimited flat files that do not contain binary data.
• COBOL files.
Ans: BO Designer, Business Query for Excel, BO Reporter, Infoview,Explorer,WEBI, BO Publisher, and Broadcast Agent, BO
ZABO).
InfoView: IT portal entry into WebIntelligence & Business Objects.
Base module required for all options to view and refresh reports.
Reporter: Upgrade to create/modify reports on LAN or Web.
Explorer: Upgrade to perform OLAP processing on LAN or Web.
Designer: Creates semantic layer between user and database.
Supervisor: Administer and control access for group of users.
WebIntelligence: Integrated query, reporting, and OLAP analysis over the Web.
Broadcast Agent: Used to schedule, run, publish, push, and broadcast pre-built reports and spreadsheets, including event
notification and response capabilities, event filtering, and calendar based notification, over the LAN, e-
mail, pager,Fax, Personal Digital Assistant( PDA), Short Messaging Service(SMS), etc.
Set Analyzer - Applies set-based analysis to perform functions such as execlusion, intersections, unions, and overlaps
visually.
Developer Suite – Build packaged, analytical, or customized apps.
11.What are the Ad hoc quries, Canned Quries/Reports? and How do u create them?
(Plz check this page……C\:BObjects\Quries\Data Warehouse - About Queries.htm)
Ans: The data warehouse will contain two types of query. There will be fixed queries that are clearly defined and well understood, such as regular reports, canned queries (standard reports) and common aggregations. There will also be ad hoc queries that are unpredictable, both in quantity and frequency.
Ad Hoc Query: Ad hoc queries are the starting point for any analysis into a database. Any business analyst wants to know what is inside the database. He then proceeds by calculating totals, averages, maximum and minimum values for most attributes within the database. These are unpredictable element of a data warehouse. It is exactly that ability to run any query when desired and expect a reasonable response that makes the data warhouse worthwhile, and makes the design such a significant challenge.
The end-user access tools are capable of automatically generating the database query that answers any Question posed by the user. The user will typically pose questions in terms that they are familier with (for example, sales by store last week); this is converted into the database query by the access tool, which is aware of the structure of information within the data warehouse.
Canned queries: Canned queries are predefined queries. In most instances, canned queries contain prompts that allow you to customize the query for your specific needs. For example, a prompt may ask you for a School, department, term, or section ID. In this instance you would enter the name of the School, department or term, and the query will retrieve the specified data from the Warehouse.You can measure resource requirements of these queries, and the results can be used for capacity palnning and for database design.
The main reason for using a canned query or report rather than creating your own is that your chances of misinterpreting data or getting the wrong answer are reduced. You are assured of getting the right data and the right answer.
12. How many Fact tables and how many dimension tables u did? Which table precedes what?
Ans: http://www.ciobriefings.com/whitepapers/StarSchema.asp
13. What is the difference between STAR SCHEMA & SNOW FLAKE SCHEMA?
Ans: http://www.ciobriefings.com/whitepapers/StarSchema.asp
14. Why did u choose STAR SCHEMA only? What are the benefits of STAR SCHEMA?
Ans: Because it’s denormalized structure , i.e., Dimension Tables are denormalized. Why to denormalize means the first (and often
only) answer is : speed. OLTP structure is designed for data inserts, updates, and deletes, but not data retrieval. Therefore,
we can often squeeze some speed out of it by denormalizing some of the tables and having queries go against fewer tables.
These queries are faster because they perform fewer joins to retrieve the same recordset. Joins are also confusing to many
End users. By denormalizing, we can present the user with a view of the data that is far easier for them to understand.
Benefits of STAR SCHEMA:
• Far fewer Tables.
• Designed for analysis across time.
• Simplifies joins.
• Less database space.
• Supports “drilling” in reports.
• Flexibility to meet business and technical needs.
15. How do u load the data using Informatica?
Ans: Using session.
16. (i) What is FTP? (ii) How do u connect to remote? (iii) Is there another way to use FTP without a special utility?
Ans: (i): The FTP (File Transfer Protocol) utility program is commonly used for copying files to and from other computers. These
computers may be at the same site or at different sites thousands of miles apart. FTP is general protocol that works on UNIX
systems as well as other non- UNIX systems.
(ii): Remote connect commands:
ftp machinename
ex: ftp 129.82.45.181 or ftp iesg
If the remote machine has been reached successfully, FTP responds by asking for a loginname and password. When u enter
ur own loginname and password for the remote machine, it returns the prompt like below
ftp>
and permits u access to ur own home directory on the remote machine. U should be able to move around in ur own directory
and to copy files to and from ur local machine using the FTP interface commands.
Note: U can set the mode of file transfer to ASCII ( default and transmits seven bits per character).
Use the ASCII mode with any of the following:
- Raw Data (e.g. *.dat or *.txt, codebooks, or other plain text documents)
- SPSS Portable files.
- HTML files.
If u set mode of file transfer to Binary (the binary mode transmits all eight bits per byte and thus provides less chance of
a transmission error and must be used to transmit files other than ASCII files).
For example use binary mode for the following types of files:
- SPSS System files
- SAS Dataset
- Graphic files (eg., *.gif, *.jpg, *.bmp, etc.)
- Microsoft Office documents (*.doc, *.xls, etc.)
(iii): Yes. If u r using Windows, u can access a text-based FTP utility from a DOS prompt.
To do this, perform the following steps:
1. From the Start à Programs àMS-Dos Prompt
2. Enter “ftp ftp.geocities.com.” A prompt will appear
(or)
Enter ftp to get ftp prompt à ftp> àopen hostname ex. ftp>open ftp.geocities.com (It connect to the specified host).
3. Enter ur yahoo! GeoCities member name.
4. enter your yahoo! GeoCities pwd.
You can now use standard FTP commands to manage the files in your Yahoo! GeoCities directory.
17.What cmd is used to transfer multiple files at a time using FTP?
Ans: mget ==> To copy multiple files from the remote machine to the local machine. You will be prompted for a y/n answer before
transferring each file mget * ( copies all files in the current remote directory to ur current local directory,
using the same file names).
mput ==> To copy multiple files from the local machine to the remote machine.
18. What is an Filter Transformation? or what options u have in Filter Transformation?
Ans: The Filter transformation provides the means for filtering records in a mapping. You pass all the rows from a source
transformation through the Filter transformation, then enter a filter condition for the transformation. All ports in a Filter
transformation are input/output, and only records that meet the condition pass through the Filter transformation.
Note: Discarded rows do not appear in the session log or reject files
To maximize session performance, include the Filter transformation as close to the sources in the mapping as possible.
Rather than passing records you plan to discard through the mapping, you then filter out unwanted data early in the
flow of data from sources to targets.
You cannot concatenate ports from more than one transformation into the Filter transformation; the input ports for the filter
must come from a single transformation. Filter transformations exist within the flow of the mapping and cannot be
unconnected. The Filter transformation does not allow setting output default values.
19.What are default sources which will supported by Informatica Powermart ?
Ans :
• Relational tables, views, and synonyms.
• Fixed-width and delimited flat files that do not contain binary data.
• COBOL files.

0 Comments:
Post a Comment
<< Home