Technology Blogs by SAP
ThorstenHa
Advisor

Introduction


Recently I received a couple of requests to present how the "transportation" of development objects can be realised and how a collaborative development environment can be set up. I took this as a signal to put my thoughts into a blog. For the latter there is a plethora of ways to create such an environment. This is good news; nonetheless, as a starting point a kind of best practice can be helpful before your own ideas kick in for finding the solution that fits your team best.

There is already a much-read blog by Christian Sengstock, "SAP Data Intelligence: Git Workflow and CI/CD Process", that I strongly recommend reading or, better, working through. It describes a direct connection between the user workspace and GitHub. This blog gives a more general overview of the options SAP Data Intelligence provides and finally shows an alternative way of transferring your development artefacts to a git repository.

Before I lay out my ideas, I would like to outline the general options that SAP Data Intelligence provides for transporting development objects to other users, including technical users (a technical user is a role for a specific task that is not assigned to a named person). As the environment I use an SAP Data Intelligence Cloud tenant. Transport between Cloud tenants is done via export/import of solutions; a detailed discussion of this is out of scope for this blog.

Development Objects


There are basically five kinds of development objects used with SAP Data Intelligence:

  • Dockerfiles

  • Custom Operators

  • Pipelines/Graphs

  • Packages

  • Jupyter Notebooks


Only the first three, however, are stored in the user workspace.

Packages are developed outside of SAP Data Intelligence and added to a Dockerfile like any other external package. As a rule of thumb, if the code exceeds 200-300 lines, and in particular if it is used by several operators, putting it into a separate package might be a good idea. Unfortunately you rarely know in advance how code will evolve over time.

The Jupyter Notebooks are currently (DI2107) kept outside of the user workspace. In the user workspace you find only a reference to your notebooks. It is being considered whether the notebooks should be stored in the user workspace as well.

 


 

SAP Data Intelligence Internal Transportation


The central hub for exchanging the objects is the solution repository. You find it in the System Management application.


 

In the "files" section you select the objects that you want to upload as a solution, with a name and a version number, to the solution repository. A solution is basically a zipped file of objects including the "files" path.



 

In "files" you also have the option to export the selected file to your local file system, either as a file (.tar.zip-format) or as a solution (zip-format). With the current UI of the System Management/tenant application you can export a solution only via your user workspace.

With the System Management command line (vctl) you have many more options to import/export solutions. I will come back to this later in more detail.

The overall transportation paths are visualised in the image below:


 

The orange shapes indicate objects that reside within SAP Data Intelligence. Outside of it are the local file systems of a user or server and a remote Git repository.

Version Control Integration


For the integration of a git repository you have two ways:

  1. User workspace/Launchpad App "VSCode"  - vrep git

  2. User workspace/export to local file-system - local git


The first integration option is described in Christian's blog. The second one I will outline in more detail here. If you already develop custom operators locally/offline, e.g. the way I do it (blog), then you are probably already on this path.

System Management Commandline (vctl)


A prerequisite for using a local git is to download the SAP Data Intelligence objects. For this we have a powerful tool, the System Management Command Line (vctl). You can download vctl for your OS from SAP Software Downloads.

There are three important command groups that we are going to use:

  1. Uploading a solution directly to and downloading it directly from the solution repository, something that you cannot do with the System Management app

    • vctl solution upload <source-path>

    • vctl solution download <solution name> <solution version>

    • vctl solution bundle <source>



  2. Importing/exporting a solution between the solution repository and the user workspace

    • vctl vrep user import-solution <name> <version>

    • vctl vrep user export-solution <name> <version> <source>



  3. Importing/exporting between the user workspace and the local file system

    • vctl vrep user import <source> <destination>

    • vctl vrep user export <destination> <source>
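To make the three command groups easier to script, they could be wrapped in a few Python helpers. The following is only a minimal sketch: the function names (solution_download, import_solution_to_workspace, export_workspace, run) are hypothetical and not part of vctl or diadmin. Only the vctl subcommands and their argument order are taken from the list above. It assumes that vctl is on your PATH and that you have already logged in to your tenant.

```python
import subprocess
from typing import List, Optional


def solution_download(name: str, version: str) -> List[str]:
    """Command for: solution repository -> local file system."""
    return ["vctl", "solution", "download", name, version]


def import_solution_to_workspace(name: str, version: str) -> List[str]:
    """Command for: solution repository -> user workspace."""
    return ["vctl", "vrep", "user", "import-solution", name, version]


def export_workspace(destination: str, source: str) -> List[str]:
    """Command for: user workspace -> local file system."""
    return ["vctl", "vrep", "user", "export", destination, source]


def run(command: List[str], dry_run: bool = True) -> Optional[int]:
    """Print the command and execute it only when dry_run is False."""
    print(" ".join(command))
    if dry_run:
        return None
    # check=True raises CalledProcessError if vctl exits with an error
    return subprocess.run(command, check=True).returncode


if __name__ == "__main__":
    # Hypothetical solution name and version, printed only (dry run)
    run(solution_download("DEV_AB_myproject", "0.0.1"))
```

Building the commands as argument lists instead of one shell string avoids quoting problems with names that contain spaces, and the dry-run default lets you inspect a command before it touches your tenant.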






 

 

 

 

 

With the solution repository, the vctl commands and a local git repository, we can set up a collaborative development environment.

Example: Collaborative Development without Automation


Let's assume we have a development team of three:

  • Adana

  • Bao

  • Chakshu


Adana also has the role of project manager.

The development workflow is designed as follows:

Development

Each developer works in her own user workspace. During development she regularly downloads the objects to her local file system, commits the changes to the git repository and from time to time pushes them to the remote repository. Once she has reached a state worth sharing, she exports the development objects to the solution repository, using the naming convention DEV_<INITIALS>_<PROJECT NAME>. If she needs artefacts from another developer, she imports the corresponding solution.

Preparation for Testing

Once the project is ready for testing, the project manager packages the solution as "pending" for testing, using the naming convention P_TEST_<PROJECT NAME>. The project manager also does a final commit and push of all project artefacts to the git repositories.

Testing

The project manager exports the solution to the technical user "TESTUSER" and creates a new solution (copy P_TEST_<PROJECT NAME> -> TEST_<PROJECT NAME>) to indicate that the project is now in testing.

Production

If the project passes testing, the solution is copied to PROD_<PROJECT NAME> and deployed to the technical user PRODUCTION.
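The naming convention of this workflow can be captured in a few lines of Python, which helps to keep the solution names consistent across the team. This is only a sketch: the function names are hypothetical, and it assumes that a promotion always follows the order P_TEST -> TEST -> PROD described above.

```python
# Stage prefixes and their successor, as described in the workflow above
PROMOTION = {"P_TEST": "TEST", "TEST": "PROD"}


def dev_solution_name(initials: str, project: str) -> str:
    """Solution name of a single developer: DEV_<INITIALS>_<PROJECT NAME>."""
    return f"DEV_{initials}_{project}"


def pending_test_name(project: str) -> str:
    """Solution name packaged by the project manager for testing."""
    return f"P_TEST_{project}"


def promote(solution_name: str) -> str:
    """Return the solution name of the next stage.

    Raises ValueError for names that cannot be promoted (DEV_ or PROD_).
    """
    for stage, target in PROMOTION.items():
        prefix = stage + "_"
        if solution_name.startswith(prefix):
            project = solution_name[len(prefix):]
            return f"{target}_{project}"
    raise ValueError(f"No promotion defined for '{solution_name}'")


if __name__ == "__main__":
    name = pending_test_name("churn")
    print(name, "->", promote(name), "->", promote(promote(name)))
```

Keeping the rules in one place means that a later automation only has to call promote() instead of assembling names by hand.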


 

The whole process looks easy, and with vctl you have the means to accomplish all tasks, but the list of commands can be a bit awkward to use.

My recommendation is to wrap the commands in a script that fits your needs. I have done this for myself and you can use it as well. It is a Python script that you can install with pip:
pip install diadmin

A prerequisite is, of course, that vctl is already installed.

The usage is quite simple. Go to the folder of your git repository and, for the initialisation, run the command
didownload -i * *

which creates the required folders and a config.yaml file where you can add your SAP Data Intelligence credentials:
PWD: pwd123
TENANT: default
URL: https://vsystem.ingress.xxx.shoot.live.k8s-hana.ondemand.com
USER: user

Now you can download your development objects and do a commit with one call.
didownload -g operators <myoperators> 

or for all artefacts that you have created within your user workspace
didownload -g * * # for all artifacts
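Conceptually, such a "download and commit" call chains a workspace export with a few git commands. The sketch below is not the diadmin implementation. It only lists, under assumptions, the commands a wrapper of this kind might run. The function name, the workspace path and the local folder are hypothetical, and whether the exported file has to be unpacked before the commit is left open here.

```python
import subprocess
from typing import List


def download_and_commit_plan(workspace_path: str, local_target: str,
                             message: str) -> List[List[str]]:
    """Return the commands for one download-and-commit cycle, in order."""
    return [
        # user workspace -> local file system (argument order as listed above)
        ["vctl", "vrep", "user", "export", local_target, workspace_path],
        # stage and commit whatever changed in the local git repository
        ["git", "add", "--all"],
        ["git", "commit", "-m", message],
    ]


def execute(plan: List[List[str]], dry_run: bool = True) -> None:
    """Print each command and run it unless dry_run is set."""
    for command in plan:
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical paths; printed only because dry_run defaults to True
    execute(download_and_commit_plan("operators/myoperators",
                                     "operators", "Update operators"))
```

Separating the plan from its execution makes the sequence easy to review and to test without a tenant connection.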

There is also a reciprocal command, 'diupload'. The --help option gives you all the information you need. If you would like to adjust the scripts, you can download the code from my personal GitHub repository diadmin.

Example: Collaborative Development with Automation


With the scripts and a strategy in place, you can set up an automation or CI/CD server like Jenkins or Bamboo.


Then you can schedule the sharing of the development objects, for example each evening, and be sure that all object versions are saved in a common git repository. Developers no longer need to check the solution repository for new solutions first. You can even set up a triggered sharing process: by polling the solution repository periodically, a new version created by one developer is automatically deployed to all other members of the development team.
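The polling idea boils down to comparing two snapshots of the solution repository. The following sketch assumes that each snapshot is available as a dictionary mapping solution names to version strings; how you obtain this listing from your tenant is not shown. The function name and the sample data are hypothetical.

```python
from typing import Dict, List, Tuple


def changed_solutions(previous: Dict[str, str], current: Dict[str, str],
                      prefix: str = "DEV_") -> List[Tuple[str, str]]:
    """Return (name, version) of solutions that are new or have a new version.

    Only solutions whose name starts with `prefix` are considered, so that
    test and production solutions are not redistributed to the developers.
    """
    changes = []
    for name, version in sorted(current.items()):
        if name.startswith(prefix) and previous.get(name) != version:
            changes.append((name, version))
    return changes


if __name__ == "__main__":
    last_poll = {"DEV_AB_churn": "0.1.0", "DEV_BC_churn": "0.2.0"}
    this_poll = {"DEV_AB_churn": "0.1.1", "DEV_BC_churn": "0.2.0",
                 "DEV_CD_churn": "0.0.1", "PROD_churn": "1.0.0"}
    for name, version in changed_solutions(last_poll, this_poll):
        print(f"deploy {name} {version} to the team workspaces")
```

A scheduled job on the automation server could store the last snapshot, call this comparison on every run and import only the changed solutions into the other workspaces.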

If you are using a naming convention, then you can also automate the whole end-to-end process from development to production status.

The scripts for the automation server can either be built directly with vctl or written as small wrappers, the way I have done it.

Conclusion


SAP Data Intelligence enables you to develop with several generations of programming languages, from 3G languages like C and Go, through 4G (Python, ...) to 5G (pipeline modeler, Preparation Rules). The ambition is to optimise the productivity of the user. Nonetheless, the basic development process is the same for all:

  1. (collaborative) development

  2. version control,

  3. testing and

  4. productive usage.


For all the disparate development artefacts you need a common transportation process and repository. I am sure that the openness of SAP Data Intelligence, both in its APIs and in the formats it uses, enables you to adapt the development process on this platform to the requirements of your company.