Extract-Transform-Load (ETL) is a process of loading data from the source system to the target (data warehouse) based on the transformation rules. Data is extracted from an OLTP database (sometimes text files, legacy systems, and spreadsheets), transformed, and loaded to the data warehouse database.
Types of ETL Tests
Data is vital for all businesses to make crucial decisions. ETL testing plays a great role in verifying, validating, and ensuring that the business data is exact, consistent, and reliable. ETL Testing is data-oriented testing, which involves scrutinizing large volumes of data across heterogeneous data sources. This data-centric testing helps in achieving good quality data by correcting the inaccuracies fixed quickly and effectively.
- Data-Centric Testing goes around testing the quality of the data. The intention of this test is to make sure that valid and accurate data is in the target. It ensures that proper ETL processes are applied on source systems and transformed and loaded data in the target systems. It further assures that proper system migration and upgrades are performed.
- Data Accuracy Testing ensures that the data is accurately transformed and loaded to the target as expected. This test helps in identifying errors raised due to truncation of characters, incorrect mapping of columns, implementation errors in logic, etc.
- Data Integrity Testing helps in identifying the counts of ‘unspecified’ or ‘unmatched’ rows with reference to foreign keys and to check the portion of foreign key matches by querying the data.
- Business Testing ensures that the data fulfills the vital business necessities. Data is evaluated against the business rules defined. This test additionally checks whether or not data has been moved, copied, or loaded fully and accurately.
- Validation Testing (Source to Target Testing) is carried out to validate whether the data values transformed are the expected data values.
- Application Upgrades can be auto-generated, saving substantial test development time. This sort of testing verifies whether or not the information extracted from an older application or repository is precisely the same as the data in a repository or the new application.
- Metadata Testing includes testing of the data type check, data length check, and the index/constraint check.
- Data Transformation Testing is one of the very crucial tests done. In several cases, it can’t be achieved by writing a single SQL query and comparing the output with the target. Multiple SQL queries may have to be run every row to verify the transformation rules.
- Data Quality Testing includes syntax and reference tests. To avoid any error due to date or order number throughout the business process Data Quality testing is done.
- Syntax Tests will report any dirty data, based on invalid characters, character pattern, incorrect upper- or lower-case order, etc.
- Reference Tests can verify that the data is in line with the data model. For example, Client ID Data quality testing includes number check, date check, precision check, data check, null check, etc.
- Incremental ETL testing is done to check the data integrity of previous and new data throughout the process of adding the latest data. Incremental testing verifies that the inserts and updates are being processed as expected in the incremental ETL process.
Below are a few defects we commonly encounter while doing ETL testing.
- Missing or data not loaded to the target database
- Incorrect data transformations while data is being loaded to the target
- Data being lost partially or data truncation
- Data type mismatches
- Handling the NULL values while transforming from Source to Target systems
- Data duplicates in the target
- Incorrect formatting of the precision values while handing the numeric data
- Rejected data due to the data issues
With the constantly changing needs of businesses and similar source systems changes, ETL testing effectively drives continuous change in the data warehouse schema and the data is loaded. Hence, it is very important that processes are clearly defined for both development and testing teams.