Continuous Integration and Continuous Deployment (CI/CD) pipelines have become indispensable tools for software development, helping teams automate the process of building, testing, and deploying their data configuration. DataOps, as a platform for data definition code management, provides a powerful feature for merging configuration into your own projects from reference projects. This allows you to define a job in your project with the same name as a job in the reference project, effectively customising and expanding upon its functionality.
In this article, we'll explore how you can leverage this feature to merge job configurations from reference projects into your own DataOps CI/CD pipeline, using real-world examples to illustrate its capabilities.
The Concept of Job Merging
DataOps’s job merging feature enables you to reuse and customise job definitions from a reference project within your own project. By defining a job in your project with the same name as a job in the reference project, you can extend or modify its behavior. This feature is particularly useful when you want to maintain consistency across multiple projects or repositories while still tailoring specific aspects of your pipeline.
Example Merged Job: Reference Project
Let's start by looking at an example job defined in a reference project:
Example Merged Job:
  script:
    - echo "1"
  after_script:
    - echo "Execute this command after the `script` section completes."
In this reference project, the "Example Merged Job" executes a simple script that echoes "1" and runs an "after_script" to execute additional commands after the "script" section completes.
Example Merged Job: Your Project
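Now, you want to include this job in your own project. You can do this in your project's pipeline configuration by including the reference project and defining a job with the same name: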
include:
  - project: dataops-internal/reference-projects/example-reference-project
Example Merged Job:
  script:
    - echo "2"
In this case, you have defined a job with the same name as the one in the reference project, "Example Merged Job." However, you have modified the script to echo "2" instead of "1."
Understanding the Merging Process
When you include a job from a reference project in your own project, DataOps merges the two definitions: the job configuration from the reference project is combined with the configuration of the same-named job in your project. Where both define the same key, such as script, your definition takes precedence and replaces the original. Keys you don't define, such as after_script, are kept from the reference project.
As a result of merging the job configuration into your project, the job executes with your modified script, echoing "2," and then runs the after_script inherited from the reference project. The original script that echoed "1" is no longer executed. Here's the execution flow:
- The reference project's job configuration is merged into your project.
- Your modified script is executed, which echoes "2."
- The after_script is executed, as it was defined in the reference project.
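To make the flow above concrete, the job that effectively runs can be pictured as the fragment below. This is an illustrative sketch written by hand from the two definitions in this article, not output produced by DataOps, and it assumes the merge keeps every key your project does not redefine.

```yaml
# Effective "Example Merged Job" after merging (illustrative only)
Example Merged Job:
  # Taken from your project: replaces the reference project's echo "1"
  script:
    - echo "2"
  # Inherited unchanged from the reference project
  after_script:
    - echo "Execute this command after the `script` section completes."
```

Reading the two definitions side by side like this is a quick way to check which parts of a job you own and which parts still come from the reference project.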
This process allows you to leverage the work done in the reference project while making project-specific customisations. It's a powerful way to maintain consistency across multiple projects, reducing duplication of effort and ensuring that best practices are followed while still accommodating unique requirements.
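The same rule should work in the other direction. As a hedged sketch, assuming the merge is applied key by key as the example in this article suggests, a project that only wants to change the clean-up step could redefine after_script and leave script to the reference project. The echoed message here is hypothetical and chosen purely for illustration.

```yaml
# Your project: override only after_script (hypothetical variation)
include:
  - project: dataops-internal/reference-projects/example-reference-project

Example Merged Job:
  # No script key here, so the reference project's echo "1" is assumed to be kept
  after_script:
    - echo "Project-specific clean-up"
```

Under that assumption the job would echo "1" from the inherited script and then run the project-specific after_script. It is worth confirming the result in a test pipeline before relying on it.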
DataOps's job merging feature is a valuable tool in the CI/CD pipeline arsenal. It empowers developers and teams to build on the work of reference projects while making necessary adjustments for their specific needs. By defining jobs with the same name as those in reference projects, you can easily integrate and customise configurations, ensuring that your pipelines are both efficient and tailored to your project's requirements. This flexibility makes DataOps a versatile platform for managing CI/CD workflows across multiple projects.