In DataOps, managing configuration efficiently across multiple projects is vital for consistency and scalability. Consider a customer who must keep configuration consistent across many projects while still allowing each project some flexibility. This article outlines a solution that streamlines configuration management using a reference project and Jinja templating.
Use Case
The customer wants to apply a standardised set of roles across multiple projects using a reference project. However, they have difficulty configuring multiple CONFIGURATION_DIR locations, which hinders their configuration management process.
Proposed Solution
To tackle this challenge, we propose defining CONFIGURATION_DIR as a variable in the custom reference project and overriding its value in each individual project. Additionally, we suggest storing configuration items within the child projects while keeping the configuration directory in the reference project. This requires a pre-job that copies specific files into the reference project's folder at runtime.
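The define-then-override idea can be sketched as two small variable files. This is only an illustration: the split into two files, the default value dataops/snowflake, and the assumption that a value set in the child project takes precedence over the one inherited from the reference project are not taken from the customer's setup.

```yaml
# Reference project: variables file (hypothetical location).
# Provides the default that every project including the reference project inherits.
variables:
  CONFIGURATION_DIR: dataops/snowflake   # assumed default value

---
# Child project: variables file (hypothetical location).
# Assumed to take precedence over the default inherited from the reference project.
variables:
  CONFIGURATION_DIR: reference-projects/example_reference_project/example_configuration_dir
```

Keeping the default in the reference project means a child project only needs to declare the variable when its configuration location differs.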
Implementation
The proposed solution is implemented in the following steps:
- Define CONFIGURATION_DIR as a variable in the reference project.
- Override the value of CONFIGURATION_DIR in each individual project.
- Execute a pre-job that copies specific files into the reference project's folder at runtime.
- Configure the pre-job to create an artifact for the reference project so that the copied files are available for rendering in later jobs.
- Set DATAOPS_TEMPLATES_DIR to the same location as CONFIGURATION_DIR so that the copied files are rendered properly.
Example Job Configuration
The following example job configuration demonstrates the copy step and the artifact definition:
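COPY:
  extends: .agent_tag
  image: $DATAOPS_UTILS_RUNNER_IMAGE
  stage: test
  script:
    # Copy the child project's roles template into the reference project's configuration directory
    - cp -v $CI_PROJECT_DIR/dataops/snowflake/roles.template.yml $CI_PROJECT_DIR/reference-projects/example_reference_project/example_configuration_dir
  artifacts:
    paths:
      # Pass the updated reference project folder on to later jobs
      - $DATAOPS_REFERENCE_PROJECT_DIR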
To ensure that the SOLE job can properly render the copied files, set DATAOPS_TEMPLATES_DIR to the same location as CONFIGURATION_DIR:
CONFIGURATION_DIR: reference-projects/example_reference_project/example_configuration_dir
DATAOPS_TEMPLATES_DIR: reference-projects/example_reference_project/example_configuration_dir
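For context, the two variables could sit in the variables section of the SOLE job itself. The fragment below is a hedged sketch rather than the customer's actual job: the job name and stage name are hypothetical, and it assumes the stage runs after the stage of the COPY job so that the copied files arrive through the artifact.

```yaml
# Hypothetical SOLE job fragment; the job name and stage name are assumptions.
Set Up Snowflake:
  extends: .agent_tag
  stage: Snowflake Setup   # assumed to run after the COPY job's stage
  variables:
    # Both variables point at the directory the COPY job populated
    CONFIGURATION_DIR: reference-projects/example_reference_project/example_configuration_dir
    DATAOPS_TEMPLATES_DIR: reference-projects/example_reference_project/example_configuration_dir
  # image, script, and any other keys stay as in the existing SOLE job definition
```

Setting the variables at job level keeps the override local to the one job that needs it; defining them project-wide would be an alternative if several jobs read the same directory.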
Conclusion
By combining variable overrides with a pre-job script, the customer can address their configuration management challenges in their DataOps environment. The solution centralises configuration while still allowing customisation in individual projects, and it illustrates the flexibility and scalability of DataOps practices for managing complex data workflows.