Data Warehouse Interview Questions – Part 2

Post Info: 5,370 views 0 Comments Post a comment

Q. What are the commonly used indexes in Data warehouse systems?

A. B-Tree and Bit Map indexes.

Q. When Bit Map indexes are used?

A. The main advantage of using bitmap index is for the columns in which the ratio of the number of distinct values to the number of rows in the table is very low. This ratio referred as degree of cardinality.  Example: A gender column, which has only two distinct values (male and female), which is ideal for a bitmap index.

Q. Which scenarios are better suited for using B-Tree indexes?

A. B-tree indexes can be used when a typical query refers to the indexed column and retrieves a few rows. In these queries, it is faster to find the rows by looking at the index.

What is a dimension table?
A dimensional table is a collection of hierarchies and categories along which the user can drill down and drill up. it contains only the textual attributes.
What is Fact table?
Fact Table contains the measurements or metrics or facts of business process.
If your business process is “Sales” , then a measurement of this business process such as “monthly sales number” is captured in the Fact table.
Fact table also contains the foriegn keys for the dimension tables.
What is conformed fact?
Conformed dimensions are the dimensions which can be used across multiple Data Marts in combination with multiple facts tables accordingly
What is ODS?
1. ODS means Operational Data Store.
2. A collection of operation or bases data that is extracted from operation databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse. The ODS may also be used to audit the data warehouse to assure summarized and derived data is calculated properly. The ODS may further become the enterprise shared operational database, allowing operational systems that are being reengineered to use the ODS as there operation databases.
What is a lookup table?
A lookUp table is the one which is used when updating a warehouse. When the lookup is placed on the target table (fact table / warehouse) based upon the primary key of the target, it just updates the table by allowing only new records or updated records based on the lookup condition.
What are Aggregate tables?
Aggregate table contains the summary of existing warehouse data which is grouped to certain levels of dimensions.Retrieving the required data from the actual table, which have millions of records will take more time and also affects the server performance.To avoid this we can aggregate the table to certain required level and can use it.This tables reduces the load in the database server and increases the performance of the query and can retrieve the result very fastly.
What are conformed dimensions?
Answer1:
Conformed dimensions mean the exact same thing with every possible fact table to which they are joined Ex:Date Dimensions is connected all facts like Sales facts,Inventory facts..etc
Answer2:
Conformed dimentions are dimensions which are common to the cubes.(cubes are the schemas contains facts and dimension tables)
Consider Cube-1 contains F1,D1,D2,D3 and Cube-2 contains F2,D1,D2,D4 are the Facts and Dimensions here D1,D2 are the Conformed Dimensions
What is a level of Granularity of a fact table?
Level of granularity means level of detail that you put into the fact table in a data warehouse. For example: Based on design you can decide to put the sales data in each transaction. Now, level of granularity would mean what detail are you willing to put for each transactional fact. Product sales with respect to each minute or you want to aggregate it upto minute and put that data.
How are the Dimension tables designed?
Most dimension tables are designed using Normalization principles upto 2NF. In some instances they are further normalized to 3NF.
Find where data for this dimension are located.
Figure out how to extract this data.
Determine how to maintain changes to this dimension (see more on this in the next section).
What are non-additive facts?
Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.
What are slowly changing dimensions?
SCD stands for Slowly changing dimensions. Slowly changing dimensions are of three types
SCD1: only maintained updated values.
Ex: a customer address modified we update existing record with new address.
SCD2: maintaining historical information and current information by using
A) Effective Date
B) Versions
C) Flags
or combination of these
scd3: by adding new columns to target table we maintain historical information and current information
What are Semi-additive and factless facts and in which scenario will you use such kinds of fact tables?
Snapshot facts are semi-additive, while we maintain aggregated facts we go for semi-additive.
EX: Average daily balance
A fact table without numeric fact columns is called factless fact table.
Ex: Promotion Facts
While maintain the promotion values of the transaction (ex: product samples) because this table doesn’t contain any measures.
Why fact table is in normal form?
Basically the fact table consists of the Index keys of the dimension/ook up tables and the measures.
so when ever we have the keys in a table .that itself implies that the table is in the normal form.
What is degenerate dimension table?
Degenerate Dimensions : If a table contains the values, which are neither dimesion nor measures is called degenerate dimensions.Ex : invoice id,empno
what is junk dimension? what is the difference between junk dimension and degenerated dimension?
Junk dimension: Grouping of Random flags and text Attributes in a dimension and moving them to a separate sub dimension.
Degenerate Dimension: Keeping the control information on Fact table ex: Consider a Dimension table with fields like order number and order line number and have 1:1 relationship with Fact table, In this case this dimension is removed and the order information will be directly stored in a Fact table inorder eliminate unneccessary joins while retrieving order information..
4.what is aggregate fact table?
Aggregate table contains the [measure] values ,aggregated /grouped/summed up to some level of hirarchy.
What is fact less fact table? where you have used it in your project?
Factless table means only the key available in the Fact there is no mesures availalabl

Q.What is a dimension table?

Dimension tables contain attributes that describe fact records in the fact table. Dimension tables contain hierarchies of attributes that aid in summarization. Example: Time Dimension, Product Dimension

Q.What is Fact table?

A. Fact Table contains the measurements or metrics or facts of business process.

Q. What are conformed dimensions?

A. Dimension tables contain attributes that describe fact records in the fact table. Dimension tables contain hierarchies of attributes that aid in summarization. Example: Time Dimension.

Q. What is ODS?

A. ODS means Operational Data Store.  A collection of operation or bases data that is extracted from operation databases and standardized, cleansed, consolidated, transformed, and loaded into an enterprise data architecture. An ODS is used to support data mining of operational data, or as the store for base data that is summarized for a data warehouse.

Q. What is a lookup table?

A. A lookup table is used to display information from one table based on the value of a foreign-key field in another table.

Q. What are Aggregate tables?

A. Aggregate table contains the summary of existing warehouse data which is grouped to certain levels of dimensions. Aggregate table gives better performance for many reporting needs. Aggregate table contains the [measure] values, aggregated/grouped/summed up to some level of hierarchy.

Q.What is a level of Granularity of a fact table?

A. Level of granularity means level of detail that you put into the fact table in a data warehouse. For example: Based on design you can decide to put the sales data in each transaction. Now, level of granularity would mean what detail are you willing to put for each transactional fact. Product sales with respect to each minute or you want to aggregate it upto minute and put that data.

Q. How are the Dimension tables designed?

A. Most dimension tables are designed using Normalization principles upto 2NF. In some instances they are further normalized to 3NF.

Q. What are non-additive facts?

A. Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.

Q. What are slowly changing dimensions?

SCD stands for Slowly changing dimensions. Slowly changing dimensions are of three types

SCD1: Only maintain updated/Current values.

Ex: a customer address modified we update existing record with new address.

SCD2: Maintain historical and current information by using  A) Effective Date  B) Versions C) Flags or combination of all three of those.

SCD3: By adding new columns to target table to maintain both historical and current information

Q. What are Semi-additive and factless facts and in which scenario will you use such kinds of fact tables?

A. Generally most of the Snapshot fact tables are semi-additive.

A fact table without numeric fact columns is called factless fact table.

Q. What is degenerate dimension table?

A. If a table contains the values, which are neither dimesion nor measures is called degenerate dimensions.Ex : invoice id,EmpId.

Q.What is a junk dimension? what is the difference between junk dimension and degenerated dimension?

A. Junk dimension: Grouping of Random flags and text Attributes in a dimension and moving them to a separate sub dimension.

Degenerate Dimension: Keeping the control information on Fact table ex: Consider a Dimension table with fields like order number and order line number and have 1:1 relationship with Fact table, In this case this dimension is removed and the order information will be directly stored in a Fact table inorder eliminate unneccessary joins while retrieving order information.

Q.What is a fact less fact table?

A. Factless table means only the keys are available in the Fact while there are no measures available.


Return to top

Leave a Reply

 

Are you finished?

Return to top