One of the hardest parts of testing data pipelines in DataOps is that data models shift constantly as schemas change and user requirements evolve. Keeping tests aligned with the models they cover becomes time-consuming when everything changes continually and rapidly.
We solved this challenge with the following practices:
- Store your tests in the same Git repo as your configuration and code files so that, as you make and deploy changes, the functional changes and their tests are deployed together.
- Define your tests in the same place as (or alongside) your functional logic. If your data modeling is defined in one place and your tests in another, it is virtually impossible to keep them in sync.
  Note that the same applies to grants and permissions. If you define them together with the functional code, they are much easier to manage and it is harder to make mistakes.
- Deploy your functional changes using an automated declarative approach like the Snowflake Object Lifecycle Engine (SOLE) found in our DataOps platform. You declare the state you want, and the engine works out the changes, which removes the need to write endless ALTER TABLE statements by hand.
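To make the practices above concrete, here is a minimal Python sketch of the general idea. It is not how SOLE or MATE are implemented, and none of these names come from the DataOps platform: `Column`, `TableSpec`, `plan_changes`, and `run_tests` are hypothetical and exist only for illustration. The sketch assumes a table is described once, with its columns, tests, and grants side by side, and that the same description drives both the schema changes and the tests.

```python
from dataclasses import dataclass, field


@dataclass
class Column:
    name: str
    data_type: str
    # Tests live on the column they check, so they change together.
    tests: list[str] = field(default_factory=list)  # "not_null", "unique"


@dataclass
class TableSpec:
    name: str
    columns: list[Column]
    # Grants are declared with the table; this sketch stores but does not apply them.
    grants: dict[str, list[str]] = field(default_factory=dict)


def plan_changes(current: dict[str, str], desired: TableSpec) -> list[str]:
    """Compare live columns (name -> type) with the spec and return the DDL needed."""
    wanted = {c.name: c.data_type for c in desired.columns}
    statements = []
    for name, data_type in wanted.items():
        if name not in current:
            statements.append(f"ALTER TABLE {desired.name} ADD COLUMN {name} {data_type}")
    for name in current:
        if name not in wanted:
            statements.append(f"ALTER TABLE {desired.name} DROP COLUMN {name}")
    return statements


def run_tests(rows: list[dict], spec: TableSpec) -> list[str]:
    """Run the tests declared on each column and return failure messages."""
    failures = []
    for col in spec.columns:
        values = [row.get(col.name) for row in rows]
        if "not_null" in col.tests and any(v is None for v in values):
            failures.append(f"{col.name}: not_null failed")
        if "unique" in col.tests and len(set(values)) != len(values):
            failures.append(f"{col.name}: unique failed")
    return failures


# Hypothetical table definition: schema, tests, and grants in one place.
customers = TableSpec(
    name="CUSTOMERS",
    columns=[
        Column("ID", "NUMBER", tests=["not_null", "unique"]),
        Column("EMAIL", "VARCHAR", tests=["not_null"]),
    ],
    grants={"SELECT": ["REPORTING_ROLE"]},
)

# Hypothetical state of the live table before deployment.
live_columns = {"ID": "NUMBER", "LEGACY_FLAG": "BOOLEAN"}

print(plan_changes(live_columns, customers))
print(run_tests([{"ID": 1, "EMAIL": "a@example.com"}, {"ID": 1, "EMAIL": None}], customers))
```

Because the tests hang off the same `TableSpec` that produces the DDL, adding or removing a column in one commit also adds or removes its tests. A real engine would additionally handle type changes, grants, and safe ordering of destructive operations, all of which this sketch leaves out.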
For more information about data testing, check out MATE Automated Data Testing.
