SAP Data Intelligence: Git Workflow and CI/CD Process
About
When SAP Data Intelligence solutions are developed by several developers, it is highly desirable to track the code in a Git repository. This way, changes can be traced and reviewed precisely, and developers can organise their changes in commits and branches.
In this post we give an overview of how SAP Data Intelligence solutions can be developed using a Git repository (e.g., Enterprise GitHub available in the company). Moreover, we show how the solution can be packaged and tested by a CI/CD server.
This post is an overview of the detailed Git Integration and CI/CD Process guide for SAP Data Intelligence published on GitHub/SAP-samples.
Overview
The Git Workflow and CI/CD Process for SAP Data Intelligence has the following characteristics:
- Each Git repository tracks the files of a single solution.
- The user workspace is mapped to a Git repository. Hence, each developer can work on one solution at a time.
- Multiple developers can check out the same solution in their user workspace and work on it in parallel.
- The build server knows the credentials of a “test tenant” to install the solution and to execute a single test graph.
- The test graph is usually an SAP Data Intelligence Workflow that triggers a number of pipelines.
- A test pipeline needs to terminate at some point. The build is successful if the pipeline terminates successfully.
The following picture shows the basic setup and the actions of the Git Workflow and the CI/CD Process:
- Development: A developer will create/update/delete graphs, operators, or dockerfiles using the SAP Data Intelligence Pipeline Modeler.
- Git Usage: The developer is using the Visual Studio Code application to interact with the Git repository.
- Git Push: Once the developer has reached a working state, they push the changes to the repository.
- Build Solution: The build server listens to changes on a specified branch and triggers a new build job. The job will fetch the latest changes to the build server and build a solution package.
- Install Solution: The build server installs the solution on a configured SAP Data Intelligence test tenant.
- Test Solution: The build server executes a configured test pipeline on the test tenant.
Git Integration
In order to connect the files in the user workspace to a Git repository, we need to have a Git client available on SAP Data Intelligence:
- For this, we use the Visual Studio Code launchpad application that has been published under the SAP Sample Code License.
- The application needs to be installed on development tenants such that developers can use it.
- Modifications of graphs, operators, dockerfiles, and other artefacts are visible in the user workspace and hence, in the /vhome folder of the application.
By working with the built-in terminal we can now easily use the Git client to connect the /vhome folder to a Git repository, and track changes of the files. The flow of Git commands looks as follows (for details check the Git Workflow section in the guide):
- Initialise the /vhome folder (only needed once):
vhome/$ git init
- Connect to the remote Git repository (only needed once per solution):
vhome/$ git remote add origin <repository-url>
- Pull the latest changes of branch:
vhome/$ git pull origin <branchname>
- In case you are creating a new solution, you need to add a “manifest.json” file to the /vhome folder to define its name and version:
{ "name":"my-solution", "version":"1.0.0", "format":"2", "dependencies":[] }
- Add modified or new files and commit them:
vhome/$ git add <file>
...
vhome/$ git push origin <branchname>
- To work with another repository, you can either add a new remote repository and pull the latest changes, or remove the files from the user workspace and initialise it from scratch (step 1).
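For a new solution, the manifest from the step above could also be generated by a small helper instead of being typed by hand. The following is only a minimal sketch: the function name `write_manifest` and its arguments are hypothetical, and the fields are simply copied from the example manifest shown above.

```shell
# write_manifest NAME VERSION [TARGET]
# Hypothetical helper: writes a manifest.json with the same fields as the
# example manifest in this post. NAME and VERSION are inserted as given;
# no JSON escaping is attempted, so keep them free of quotes.
write_manifest() {
  manifest_name=${1:-}
  manifest_version=${2:-}
  manifest_target=${3:-manifest.json}

  if [ -z "$manifest_name" ] || [ -z "$manifest_version" ]; then
    echo "usage: write_manifest NAME VERSION [TARGET]" >&2
    return 1
  fi

  printf '{\n  "name": "%s",\n  "version": "%s",\n  "format": "2",\n  "dependencies": []\n}\n' \
    "$manifest_name" "$manifest_version" > "$manifest_target"
}
```

Called as `write_manifest my-solution 1.0.0` in the /vhome folder, this creates the file, which can then be added, committed with `git commit -m "<message>"`, and pushed like any other change.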
CI/CD
Once you have the solution tracked in Git you can easily use your existing build infrastructure to test the latest changes.
For this, we make use of the System Management Command-line Client (vctl) and the SAP Data Intelligence Pipelines Operations API.
The following build steps can easily be implemented on a build server (e.g., Jenkins). Helper scripts are available in the CI/CD section of the guide:
- Package the solution from the files tracked in the Git repository: An example script can be found in the guide under ./scripts/bundle-solution.sh. Note: Since the folder structure of the Git repository (which resembles the /vhome folder) is not exactly the same as the folder structure of an SAP Data Intelligence solution, files need to be moved to a “content” folder first.
- Install the solution on a test tenant using vctl: An example script can be found in the guide under ./scripts/install-solution.sh.
- Execute a test graph using the SAP Data Intelligence Pipelines Operations API: An example script can be found in the guide under ./scripts/test-pipeline.sh.
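To illustrate the note on the “content” folder in the packaging step, here is a minimal sketch of how a build job might stage the checked-out files before archiving them. It is not the guide's bundle-solution.sh: the function name `stage_solution` and the assumed target layout (manifest.json at the top level, everything else under content/) are assumptions derived only from the description above.

```shell
# stage_solution SRC DEST
# Hypothetical helper: copies a checked-out repository (SRC, laid out like
# the /vhome folder) into DEST so that manifest.json stays at the top level
# and all other files end up below DEST/content.
# DEST must be located outside of SRC.
stage_solution() {
  stage_src=${1:-}
  stage_dest=${2:-}

  if [ ! -f "$stage_src/manifest.json" ]; then
    echo "manifest.json missing in '$stage_src'" >&2
    return 1
  fi

  mkdir -p "$stage_dest/content" || return 1
  cp "$stage_src/manifest.json" "$stage_dest/manifest.json" || return 1

  # The glob skips dot entries, so .git and .gitignore are not copied.
  for stage_entry in "$stage_src"/*; do
    if [ "$(basename "$stage_entry")" = "manifest.json" ]; then
      continue
    fi
    cp -R "$stage_entry" "$stage_dest/content/" || return 1
  done
}
```

The staged folder would then be archived and installed with vctl, as done by the scripts in the guide.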
The guide explains how to set up the CI/CD process using a Jenkins server.
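Because the build result depends on the test pipeline terminating, the test step typically has to poll the pipeline status until it reaches a final state. The sketch below shows only that polling loop. The function name `wait_for_terminal_status`, the status values `completed` and `dead`, and the idea of passing in a status command are assumptions made for illustration; the actual request against the Pipelines Operations API is left to that command (see ./scripts/test-pipeline.sh in the guide).

```shell
# wait_for_terminal_status STATUS_CMD [MAX_TRIES] [DELAY_SECONDS]
# STATUS_CMD is a caller-supplied command or function that prints the
# current pipeline status. The status strings below are assumptions and
# should be adapted to what the API actually returns.
# Returns 0 on success, 1 on failure, 2 if STATUS_CMD fails, 3 on timeout.
wait_for_terminal_status() {
  poll_cmd=${1:-}
  poll_max=${2:-60}
  poll_delay=${3:-5}
  poll_try=0

  while [ "$poll_try" -lt "$poll_max" ]; do
    poll_state=$($poll_cmd) || return 2
    case $poll_state in
      completed) return 0 ;;  # pipeline terminated successfully
      dead)      return 1 ;;  # pipeline terminated with an error
    esac
    poll_try=$((poll_try + 1))
    sleep "$poll_delay"
  done

  return 3  # no final state reached within MAX_TRIES polls
}
```

A build job could then fail the build whenever the return code is non-zero, which also covers a test pipeline that never terminates.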
Summary
This post gave a quick overview of how to realise a Git Workflow and a CI/CD process for SAP Data Intelligence solution development. Please check the details in the Git Integration and CI/CD Process guide and feel free to post questions, comments, and suggestions to this blog or open issues in the GitHub project.
Hi Christian,
Do you know if it is possible to connect the VSCode plug-in on DI Cloud with an on-prem GitLab through SAP Cloud Connector?
Regards,
Philippe
I got an error when adding the solution to the strategy:
vctl strategy add sdi-default-extension-strategy vscode-app-1.0.1 --verbose
Error: the update was aborted because of the following errors
solution apps found in vscode-app: ["vsc-app"].
Hi Ethan,
regarding your error: On-prem, you have to change some settings; for DI Cloud, the solution is to open a ticket in the launchpad.
"Our OPS team has changed some settings in your cluster to allow installation of third party application such as the VSCode application."
Please try CA-DI-OPS or CA-DI.
Kind regards
Tim
Hi Christian, thanks for the details. This is very helpful. We are just starting this journey. One area where we need direction is how to configure the graphs/code created in our Dev tenant to run against our targets (inbound/outbound), multiple QA tenants, Performance tenants, and of course the Production tenant. This concerns configuration like source and target systems, Kafka topic names unique to the ecosystem, etc. We are thinking of using a config file and Jenkins scripting, but is there a better way to approach this?
Thanks, Mike
Hi Christian,
Thank you for the very useful blog.
A customer I’m working with is running GitHub Enterprise (on-premise) for all development and CI/CD. I need to connect Data Intelligence Cloud with GitHub on-premise.
I created a GitHub connection in SAP Cloud Connector, which is assigned to the BTP subaccount where the DI instance is. Obviously, that is not enough. I think I need to tell vctl to go through Cloud Connector, but I have no idea how.
Could you please help? I think other customers could face this challenge.
Thanks & Regards,
Elena, SAP Alumni
SAP DI Cloud now supports a Git integration through a Git terminal.
See: https://help.sap.com/docs/SAP_DATA_INTELLIGENCE/1c1341f6911f4da5a35b191b40b426c8/5d7d9e25afb642ed9f4b21f0c62f9871.html?locale=en-US&version=Cloud
So, no need for installing the vscode-app anymore.