DI Basics: Creating a Custom Operator
Prerequisites:
- Part 1 of this guide is completed (Dockerfile exists and can be built)
- SAP Data Intelligence cluster
  - Cloud (all versions)
  - On-premise (3.x)
When a workflow needs heavy customization, building a dedicated custom operator is often a better choice than only attaching a custom image through a group specification. In this scenario we will build a simple custom operator based on Python3, using the Dockerfile we created in part 1 of this guide.
Creating the Operator:
- With the ‘Operator’ tab selected, click on the plus sign to create a new operator:
- Make sure the Base Operator is Python3, since our Dockerfile relies on Python3 and pip. The remaining fields can be filled in as you prefer.
- In the operator editor we can add ports, scripts, etc. However, we are primarily concerned with the Tags tab. Open this tab and add the exact tags that we defined in the Dockerfile in part 1:
- In newer versions of DI, the tags may be added automatically a second time. The second set is for the subengine operators and should not cause any issues. Just make sure both sets of tags match.
- In this example I am simply replacing the Python3 operator with this custom operator, so I am copying the ports and the script from the Python3 operator in the original graph into the ‘Ports’ and ‘Script’ tabs respectively:
- Finally, save the operator. If you are following along, pull up the graph from part 1 and replace the Python3 operator with our newly created custom operator:
- Run the graph and check that the Wiretap operator still shows the correct TensorFlow version:
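For reference, the kind of script that ends up in the Script tab can be sketched roughly as follows. This is a minimal sketch and not necessarily the script from part 1: the output port name `output` is an assumption, and the `_LocalApi` class is a hypothetical stand-in that only exists so the script can be tried outside the cluster. Inside DI, the Python3 operator runtime provides the `api` object itself.

```python
# Sketch of a Python3 operator script that reports the TensorFlow version.
# Inside Data Intelligence the runtime injects `api`; the stand-in below is
# hypothetical and only used when the script runs outside the cluster.
try:
    api  # provided by the Python3 operator at runtime
except NameError:
    class _LocalApi:
        """Local stand-in that records and prints what would be sent."""

        def __init__(self):
            self.sent = []

        def send(self, port, data):
            self.sent.append((port, data))
            print(f"{port}: {data}")

        def add_generator(self, func):
            func()  # run the generator once, immediately

    api = _LocalApi()


def tensorflow_version():
    """Return the installed TensorFlow version, or a note if it is missing."""
    try:
        import tensorflow as tf
        return tf.__version__
    except ImportError:
        return "tensorflow not installed"


def report_version():
    # "output" is an assumed port name; use the port defined in the Ports tab.
    api.send("output", tensorflow_version())


# Register the function so it runs once when the graph starts.
api.add_generator(report_version)
```

If the Wiretap shows "tensorflow not installed" for a script like this, the operator is most likely not running on the image from part 1, and the tags are the first thing to recheck.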
We now have a custom operator that runs on the image built from our custom Dockerfile at runtime. This allows for much more complex and customized data workflows.
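The link between the operator and the image is the tag set, which is why the tags have to match exactly. The check below is a simplified model for illustration only, not DI's actual image resolution logic. The tag names and versions are hypothetical; substitute the ones defined in the Dockerfile from part 1.

```python
def mismatched_tags(operator_tags, dockerfile_tags):
    """List every operator tag that the Dockerfile tags do not match exactly."""
    problems = []
    for name, version in sorted(operator_tags.items()):
        if name not in dockerfile_tags:
            problems.append(f"{name}: missing from Dockerfile tags")
        elif dockerfile_tags[name] != version:
            problems.append(
                f"{name}: operator has {version!r}, "
                f"Dockerfile has {dockerfile_tags[name]!r}"
            )
    return problems


# Hypothetical tag sets; an empty string means no version was specified.
dockerfile_tags = {"python36": "", "tensorflow": "2.4.0"}
operator_tags = {"python36": "", "tensorflow": "2.4.0"}

# An empty list means every operator tag matches the Dockerfile.
print(mismatched_tags(operator_tags, dockerfile_tags))

# A version typo on the operator side is reported as a mismatch.
print(mismatched_tags({"tensorflow": "2.3.0"}, dockerfile_tags))
```

Comparing the two tag sets side by side like this is a quick way to spot a missing tag or a version typo before running the graph.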