Top 50 ETL Testing Interview Questions and Answers

In computing, Extract, Transform, Load (ETL) is the general procedure of copying data from one or more sources into a destination system that represents the data differently from the source.

ETL Testing Interview Questions

What is ETL Testing?

We perform ETL testing to ensure that the data loaded from a source to the destination after a business transformation is accurate. It involves verifying the data at the various intermediate stages between the source and the destination.

If you have an ETL Testing interview scheduled, go through our top ETL Testing interview questions and answers. We have collected the most commonly asked questions, so please read the entire article.

1. Explain the ETL testing operations?

The ETL testing involves the below-mentioned operations:

  1. It validates the data movement from the source to the target system.
  2. It verifies the data counts in the source and target systems.
  3. It verifies that extraction and transformation meet the requirements and expectations.
  4. It verifies that table relations, joins, and keys are preserved during the transformation.
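A minimal Python sketch of the count and key-preservation checks above, using an in-memory SQLite database with made-up source and target tables:

```python
import sqlite3

# Hypothetical source and target tables, both in one in-memory DB for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE tgt_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO src_orders VALUES (1, 10.0), (2, 20.5), (3, 7.25);
    INSERT INTO tgt_orders VALUES (1, 10.0), (2, 20.5), (3, 7.25);
""")

def row_count(table):
    # Data count verification: compare COUNT(*) between source and target.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

def key_set(table):
    # Key preservation check: the set of primary keys should survive the load.
    return {r[0] for r in conn.execute(f"SELECT order_id FROM {table}")}

count_match = row_count("src_orders") == row_count("tgt_orders")
keys_match = key_set("src_orders") == key_set("tgt_orders")
print(count_match, keys_match)
```

In a real ETL test the two queries would run against separate source and target databases; the comparison logic stays the same.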

2. Differentiate between ETL Testing with Manual Testing?

| ETL Testing | Manual Testing |
| --- | --- |
| It verifies whether the data is moving as expected. | It is a method of observing and testing by hand. |
| It is fast and systematic and aims to provide top results. | It needs time and effort and is prone to more errors. |
| It doesn't require any additional technical knowledge other than an understanding of the software. | It requires technical knowledge of SQL and shell scripting. |

3. Define ETL?

In computing, Extract, Transform, Load (ETL) is the general procedure of copying data from one or more sources into a destination system that represents the data differently from the source.

4. List the types of Data Warehouse applications?

  1. Info Processing
  2. Analytical Processing
  3. Data Mining

5. What is the need for ETL Testing?

  1. To achieve fast and accurate results.
  2. To keep an eye on the data transferred from one system to another.
  3. To keep track of the speed and efficiency of the process.


6. What is the need for a staging area in the ETL process?

A staging area, also known as a landing zone, is an intermediate storage area used for data processing during the extract, transform, load (ETL) process. The staging area sits between the data sources and the data targets, which are often data warehouses, data marts, or other data repositories.
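A tiny Python sketch of the staging idea: raw rows land in the staging area unchanged, get cleaned there, and only then move to the target. The field names and cleaning rules are made up for illustration:

```python
# Minimal sketch of a staging area in the ETL flow.
source_rows = [{"id": "1", "name": " Alice "}, {"id": "2", "name": "bob"}]

# Extract: copy source data into the staging area as-is.
staging = [dict(row) for row in source_rows]

# Transform *inside* staging: cast types, trim whitespace, normalize case.
for row in staging:
    row["id"] = int(row["id"])
    row["name"] = row["name"].strip().title()

# Load: move only the cleaned rows to the target.
target = list(staging)
print(target)
```

Keeping the raw copy in staging means a failed transformation can be rerun without re-extracting from the source system.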

7. List the various tools used in ETL?

  1. Cognos Decision Stream
  2. SAS Enterprise ETL Server
  3. Oracle Warehouse Builder
  4. Business Objects XI
  5. SAS Business Warehouse

8. List the responsibilities of an ETL Tester?

The responsibilities of an ETL Tester are listed below:

  1. They should validate mainframe PMS files migrating to DB2.
  2. They have to test the database schema with the help of the data architects, using ERWIN.
  3. They should have experience working with DB2 and Teradata.
  4. They have to manage the departmental inventory and troubleshoot testing, inventory, and reporting issues.

9. Explain the ETL Testing Process?

  1. Analyzing the requirement: The tester has to understand the business structure and its specific requirements.
  2. Validation and Test Estimation: In this phase, the time and expertise required to carry out the procedure are estimated.
  3. Test Planning and Designing the Testing Environment: In this phase, based on the inputs from the estimation, the ETL test environment is planned and set up.
  4. Test Data Preparation and Execution: Here, the data for the test is prepared and executed as per the requirements.
  5. Summary Report: Upon completion of the test run, the tester prepares a brief summary report with conclusions and suggested improvements.

10. Can you name a few ETL bugs?


A few of the ETL bugs are listed below:

  1. Input/Output bugs: Valid values are not accepted, while invalid values are accepted.
  2. Load Condition bugs: The system does not allow multiple users or the customer-expected load.
  3. Race Condition bugs: The system crashes or hangs under concurrent use.
  4. Version Control bugs: These usually surface during regression testing.
  5. H/W bugs: The device does not respond to the application.
  6. Help Source bugs
  7. Equivalence Class Partitioning (ECP) related bugs


11. What are Dimension Attributes in ETL?

The Dimension Attributes in ETL are defined as the various columns in a dimension table. 

12. Define Fact and list the types of Facts?

A Fact is defined as a central component of the multi-dimensional model that consists of the measures that are to be analyzed. Facts are usually related to dimensions.

Types of facts are:

  1. Semi-additive facts
  2. Non-additive facts
  3. Additive facts

13. Define ETL Mapping Sheets?

ETL mapping sheets contain all the information about the source and destination tables, including every column and their look-ups in reference tables.
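A mapping sheet can be modeled as a simple list of column mappings, which a tester can then validate against the target schema. The column names and rules below are hypothetical:

```python
# Hypothetical mapping sheet: one entry per target column, recording the
# source column and the transformation rule.
mapping_sheet = [
    {"source_col": "cust_nm", "target_col": "customer_name", "rule": "TRIM"},
    {"source_col": "ord_amt", "target_col": "order_amount",  "rule": "CAST TO DECIMAL"},
]

# The target table's columns, as taken from a (made-up) DB schema document.
target_schema = {"customer_name", "order_amount"}

# Basic mapping-doc validation: every target column must be covered by the sheet.
mapped_targets = {m["target_col"] for m in mapping_sheet}
unmapped = target_schema - mapped_targets
print(unmapped)
```

An empty `unmapped` set means the mapping doc covers the whole target table; anything left over is a column the ETL has no documented source for.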

14. Differentiate between OLAP tools and ETL tools?

| OLAP Tools | ETL Tools |
| --- | --- |
| The data produced by the ETL process is used by an OLAP tool to visualize data in different forms. | ETL is a technique for extracting, transforming, and loading data into a meaningful form. |
| Examples: Business Objects, Cognos, etc. | Examples: DataStage, Informatica, etc. |

15. Define Cubes and OLAP Cubes?

An OLAP Cube is defined as a multi-dimensional database that is mainly optimized for reporting purposes.

Cubes can be defined as the data processing units that are mainly composed of fact tables and dimensions from the data warehouse.


16. How to extract SAP data using Informatica?

  1. We can extract SAP data in Informatica by using the PowerConnect option.
  2. First, install and configure the PowerConnect tool.
  3. Then import the source into the Source Analyzer. In the next step, generate the ABAP code for the mapping; after that, Informatica can pull data from SAP.
  4. PowerConnect is used to connect to and import sources from external systems.

17. Can you list a few test cases and explain them?

  1. Data Check: Every aspect of the data, such as the data check, null check, and number check, is tested in this case.
  2. Mapping Doc Validation: It verifies that the ETL information is provided in the mapping doc.
  3. Correctness Issues: Misspelled data, null data, and inaccurate data are tested here.

18. How to write test cases for ETL Testing?

  1. Make sure that all the expected data is loaded into the target table.
  2. Compare record counts between the source and the target.
  3. Check for any rejected records.
  4. Check that data is not truncated in the columns of the target tables.
  5. Apply boundary value analysis.
  6. Compare the unique values of key fields between the data loaded into the warehouse and the source data.
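Two of these checks, truncation and unique-key comparison, can be sketched in a few lines of Python. The column width and the sample rows are assumptions for the example:

```python
# Hypothetical target column width, per a made-up schema.
TARGET_NAME_WIDTH = 10

source = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bartholomew"}]
loaded = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bartholome"}]

# Truncation check: flag any loaded value that is shorter than its source value.
truncated_ids = [
    t["id"] for s, t in zip(source, loaded)
    if len(t["name"]) < len(s["name"])
]

# Key comparison: the unique key values must match between source and target.
keys_match = {r["id"] for r in source} == {r["id"] for r in loaded}
print(truncated_ids, keys_match)
```

Here record 2 is flagged: "Bartholomew" (11 characters) was silently cut to fit the 10-character target column, exactly the kind of defect test case 4 is meant to catch.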

19. Define Tracing level and list the types of tracing level?

The tracing level is the amount of data written to the log files. Tracing levels are classified into two types:

  1. Normal: It logs the tracing information in a summarized, comprehensive manner.
  2. Verbose: It logs the tracing information at each and every row.

20. Differentiate between Power Mart and Power Center?

| Power Mart | Power Center |
| --- | --- |
| It does not support ERP sources. | It supports ERP sources such as SAP, PeopleSoft, etc. |
| It cannot convert a local repository into a global repository. | It can convert a local repository into a global repository. |
| It processes low volumes of data. | It processes huge volumes of data. |


21. Define the term “Grain of Fact”?

In data warehousing, a fact table mainly contains the metrics, measurements, or facts of a business process. The grain of a fact table defines the most atomic level at which the facts are recorded. The grain of a sales fact table might be stated as "sales volume by day by product by store."
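The grain is what a fact table's composite key enforces: at most one row per combination of the grain dimensions. A sketch with an in-memory SQLite table (the table and column names are illustrative):

```python
import sqlite3

# Sales fact table at the grain "day by product by store": the composite
# primary key allows at most one row per (day, product, store) combination.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales_fact (
        sale_date    TEXT,
        product_id   INTEGER,
        store_id     INTEGER,
        sales_volume REAL,
        PRIMARY KEY (sale_date, product_id, store_id)
    )
""")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?, ?)",
    [("2024-01-01", 1, 10, 5.0), ("2024-01-01", 2, 10, 3.0)],
)

# A second row for the same (day, product, store) would break the declared
# grain, so the key rejects it.
try:
    conn.execute("INSERT INTO sales_fact VALUES ('2024-01-01', 1, 10, 9.0)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False
print(duplicate_allowed)
```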

22. List the steps included in the Staging area?

The steps included in the Staging area are:

  1. Source data extraction and data transformation, i.e., restructuring
  2. Data transformation (i.e., data cleansing, value transformation).
  3. Surrogate key assignments.

23. Define Data Purging?

Purging is the technique of freeing up space in the database by deleting obsolete data that is no longer needed by the system. The purge process is based on the age of the data or the type of the data.
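An age-based purge can be sketched in a few lines. The 90-day retention window and the fixed "today" are assumptions made so the example is deterministic:

```python
from datetime import date, timedelta

# Hypothetical retention policy: purge rows older than 90 days.
RETENTION_DAYS = 90
today = date(2024, 6, 1)  # fixed "today" for a deterministic example

rows = [
    {"id": 1, "created": date(2024, 1, 1)},   # older than 90 days -> purged
    {"id": 2, "created": date(2024, 5, 20)},  # recent -> kept
]

cutoff = today - timedelta(days=RETENTION_DAYS)
kept = [r for r in rows if r["created"] >= cutoff]
print([r["id"] for r in kept])
```

In a real database this would be a `DELETE ... WHERE created < cutoff` statement; the age comparison is the same.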

24. Define Measures?

Measures are the numeric data based on the columns of a fact table.


25. Define Schema Objects?

Schema objects are the logical structures that directly refer to the database's data. Schema objects include tables, views, sequences, synonyms, clusters, indexes, functions, packages, and database links.

26. Define Transformation?

A transformation is a repository object that generates, modifies, or passes data.

27. Can you define the terms Session, Worklet, Mapplet, and Workflow?

Worklet: It represents a specific set of tasks grouped together for reuse.

Workflow: It is a set of instructions that tells the server how to execute tasks.

Session: It is a set of parameters that tells the server how to move data from sources to targets.

Mapplet: It arranges or creates sets of transformations for reuse.

28. What is a Factless Fact Schema?

A factless fact schema is defined as the fact table without any measures. It is mainly used to view the number of occurrences of events.
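A factless fact table can be modeled as rows that carry only dimension keys, no measures; analysis is just counting rows. The attendance example below is illustrative:

```python
# Factless fact table: each row records that an event occurred (a student
# attended a class) with no numeric measure attached.
attendance_fact = [
    {"student_id": 1, "class_id": 100, "date": "2024-01-08"},
    {"student_id": 2, "class_id": 100, "date": "2024-01-08"},
    {"student_id": 1, "class_id": 100, "date": "2024-01-15"},
]

# "How many attendance events did class 100 get?" -> count the rows.
occurrences = sum(1 for r in attendance_fact if r["class_id"] == 100)
print(occurrences)
```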

29. Explain the ETL Testing process?

The ETL testing is performed in five stages, namely,

  1. Identifying data sources and requirements
  2. Data acquisition
  3. Implementation of business logic and dimensional Modelling
  4. Build and populate the data
  5. Build Reports

30. List the types of ETL testing?

Production Validation Testing: This type of ETL testing is usually done on data as it is moved into production systems.

Source to Target Testing (Validation Testing): This type of testing validates whether the transformed data values are the expected data values.

Application Upgrades: This type of testing verifies that the data extracted from an older application or repository matches the data in the new application; such tests can often be generated automatically, saving test development time.

Metadata Testing: This testing will include testing of data type check, length check, and index/constraint check.

Data Completeness Testing: This testing verifies that all the expected data is loaded into the target from the source.

Data Accuracy Testing: This type of testing is performed to ensure the data is loaded accurately and transformed as expected.

Data Transformation Testing: Testing transformations often requires running multiple SQL queries for each row, because in many cases a single source SQL query cannot reproduce the transformation for comparison with the target output.

Data Quality Testing: This testing performs checks such as number checks and date checks to avoid errors due to bad dates or order numbers during the business process.

Incremental ETL Testing: This testing checks that data integrity is preserved between the old and the new data as new data is added.

GUI/Navigation Testing: This testing is performed to check the navigation or GUI aspects of front-end reports.


31. Name the two documents that are used by the ETL tester?

While performing ETL testing, an ETL tester uses two documents:

  1. ETL mapping sheets: These contain all the information about the source and destination tables, including every column and their look-ups in reference tables.
  2. DB schema of the source and target: This should be kept handy to verify any detail in the mapping sheets.

32. List the best practices for ETL testing?

  1. Ensure that the data is transformed correctly.
  2. Projected data has to be loaded into the data warehouse without any loss or truncation.
  3. Make sure that the ETL application rejects invalid data, replaces it with default values where appropriate, and reports it.
  4. Ensure that data loads into the data warehouse within the prescribed and expected time frames, to confirm scalability and performance.
  5. All methods must have appropriate unit tests, regardless of their visibility.
  6. To measure their effectiveness, all unit tests must use appropriate coverage techniques.
  7. Create unit tests that target exceptions.

33. Define Performance Testing in ETL?

Performance Testing in ETL is defined as a testing technique used to ensure that an ETL system will handle the load of various users and transactions. The main aim of ETL Performance Testing is to optimize and improve the session performance by identifying and eliminating the performance bottlenecks.

The target and source databases, mappings, sessions, and systems may all contain performance bottlenecks.

34. Name the best tool used for Performance testing?

The best tool used for Performance Testing or Tuning is Informatica.


35. What is Automated ETL testing?

An automated ETL Testing solution is designed to find the data issues in the ETL processes. The unique engines with support for SQL & Groovy will allow end-to-end testing and monitoring. 

36. List the types of transformation?

There are two types of transformation, namely,

Active transformation: It can modify the rows of data, so the number of output rows may differ from the number of input rows. An example is the Filter transformation.

Passive transformation: It passes data through without changing the row count; the number of output rows equals the number of input rows. An example is the Lookup transformation.
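The contrast between the two kinds can be sketched in plain Python. The row data and the lookup table are made up; the point is only that the filter (active) can shrink the row count while the lookup (passive) returns exactly one output row per input row:

```python
rows = [{"id": 1, "amount": 50}, {"id": 2, "amount": 150}, {"id": 3, "amount": 200}]

def filter_transform(rows, predicate):
    # Active: the number of output rows may differ from the input.
    return [r for r in rows if predicate(r)]

def lookup_transform(rows, ref):
    # Passive: one output row per input row, just enriched from a lookup table.
    return [{**r, "label": ref.get(r["id"], "unknown")} for r in rows]

filtered = filter_transform(rows, lambda r: r["amount"] > 100)
looked_up = lookup_transform(rows, {1: "retail", 2: "wholesale"})
print(len(filtered), len(looked_up))
```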

37. What is Partitioning, and list the types of partitioning?

Partitioning can be defined as the division of data storage in order to improve performance. There are two types of partitioning, namely,

Round-robin partitioning: Data is distributed evenly across all the partitions; it is usually applied when the number of rows to process in each partition should be roughly the same.

Hash partitioning: Data is grouped by a hash of the partition keys, which ensures that rows in the same group land in the same partition. Hash partitioning is applied by the Informatica server.
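A minimal sketch of the two schemes, with made-up customer rows: round-robin deals rows out evenly, while hash partitioning routes every row with the same key to the same partition:

```python
def round_robin_partition(rows, n):
    # Deal rows out in turn, so partition sizes stay uniform.
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_partition(rows, n, key):
    # Route each row by a hash of its key, so equal keys share a partition.
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts

rows = [{"cust": c, "amt": a} for c, a in
        [("A", 1), ("B", 2), ("A", 3), ("C", 4), ("B", 5), ("A", 6)]]

rr = round_robin_partition(rows, 3)
hp = hash_partition(rows, 3, "cust")
# All "A" rows must end up in a single hash partition.
a_partitions = {i for i, p in enumerate(hp) for r in p if r["cust"] == "A"}
print([len(p) for p in rr], len(a_partitions))
```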

38. What is Informatica?

Informatica is a software development company that offers data integration products. Informatica's products are mainly used for ETL, data quality, data masking, master data management, etc.

39. Is there any advantage of using the DataReader Destination Adapter?

The main advantage of the DataReader Destination Adapter is that it populates an ADO recordset in memory and exposes the data from a Data Flow task by implementing the DataReader interface, so that another application can consume the data.

40. What is SSIS?

SQL Server Integration Services is defined as a component of the Microsoft SQL Server database software that is used to perform a wide range of data migration tasks. It is a platform for workflow applications and data integration. It also features a data warehousing tool that is used for data extraction, loading, and transformation.


41. Define Filter Transformation?

Filter transformation is defined as an active transformation that is used for filtering the records on the basis of filter conditions.

42. What is the need for a data check as a test case?

With a data check test case, we can easily get information related to the data check, number check, and null check.
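The null check and number check mentioned above can be sketched over a small, made-up batch of loaded rows:

```python
# Rows as they arrived in the target; the defects are planted for the example.
loaded_rows = [
    {"id": 1, "qty": "10"},
    {"id": 2, "qty": None},   # fails the null check
    {"id": 3, "qty": "ten"},  # fails the number check
]

# Null check: flag rows where a required field is missing.
null_failures = [r["id"] for r in loaded_rows if r["qty"] is None]

# Number check: flag rows where a numeric field is not a valid number.
number_failures = [
    r["id"] for r in loaded_rows
    if r["qty"] is not None and not str(r["qty"]).isdigit()
]
print(null_failures, number_failures)
```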

43. Differentiate between unconnected and connected lookup?

| Unconnected Lookup | Connected Lookup |
| --- | --- |
| Only one output port can be used. | Multiple output ports can be used. |
| It uses a static cache only. | It can use either a static or a dynamic cache. |

44. What is the importance of the correctness issue test case?

The correctness issues test case helps us identify misspelled data, null data, and inaccurate data.

45. What is the purpose of dynamic cache and static cache in connected and unconnected transformation?

The static cache is mainly used for flat files, whereas the dynamic cache is mainly used to update the master table in slowly changing dimensions.


46. Define Data source view?

A data source view consists of the metadata that defines the selected objects from one or more underlying data sources, or the metadata used to generate the underlying relational data store.

Apart from the technical questions, the interviewer will ask some general questions and scenario-based questions, which you have to answer based on your experience. Please prepare for the below-mentioned questions as well.


47. Why ETL Testing?

48. Tell me something about our company?

49. Why should we hire you?

50. What are your strengths and weaknesses?

Good luck with your ETL Testing Interview, and I hope our ETL Testing Interview Questions and Answers are of some help to you. Make sure to check our Informatica Interview Questions and ETL Interview Questions which might be of some help to you.
