Thorsten Hapke

Blueprint for Policy Management

Introduction

Based on the feedback I received and the task of setting up a new policy management for SAP Data Intelligence, I thought a blueprint and a more detailed description of the dipolicy script might be helpful. For an introduction to policy management, please read my blog:

Policy Management with SAP Data Intelligence

General Setup Strategies

In a nutshell, these are the principles from my previous blog that I followed when creating the blueprint.

Separation of “application” and “data” policies

Keep two kinds of policies separate:

  • “application” policies, which only use the resource type “application”, and
  • “data” policies, which contain the resource types “connection” and “connectionContent”.

I prefer to combine these policies only when assigning them to users.

Avoid Resource Redundancies

Hierarchies are important for an intuitive understanding of how policies depend on each other. Nonetheless, I keep the very basic policies separate and inherit their resources only at the last level, which I use for the group/user role definitions.

Create Role Policies

To keep all policy definitions in one place (the policy management) rather than splitting them between policies and user assignments, I create a role policy that ultimately encompasses all policies a user role or group needs. I “expose” only these role policies so that they can be assigned to users. All other policies I flag as non-exposed.

Application Policy Blueprint

The blueprint I am using as a starting point looks like the following:

Only the policies in the orange boxes are exposed and correspond to a user group. You can download these policies as a zip-file from my personal GitHub.

The basic idea is that I have two types of developers,

  • ml-developer and
  • developer

where the ml-developer can use additional ML applications. Both roles can also use the Connection Management, and if a user is given the data policy “data.own”, she can create her own connections.

For the metadata explorer application I have split the role as well into 2 groups:

  • catalog for the catalog and glossary management
  • quality for the preparation and rulebook management

All users have the authorization to manage the metadata. If you would like an additional role that has only “reading” rights, you need to create a new role with the policy “basic.metadata”. Metadata users cannot start the Connection Manager and therefore cannot add new data sources. They depend on the data sources defined and assigned to them by the system administrator.

The 5th role is the omnipotent user that has unlimited rights to all applications.

Data Policy Blueprint

There is not much to prepare because each customer has a different data source landscape. The basic data policies are

  • *.data.own – This allows users to read, write and manage the data of connections they have created themselves.
  • *.data.shared – This encompasses all the data sources that every user has access to. Normally these are the central object stores like the DI_DATA_LAKE.
  • *.data.all – Finally we have the data system administrator, who has access rights to all data sources. I would not use this role in a productive environment.

Script dipolicy

Installation and Configuration

As already outlined, I use a script when setting up policy management. You can install the script with

pip install "diadmin>=0.0.24"
dipolicy --help
usage: dipolicy [-h] [-c CONFIG] [-g] [-d DOWNLOAD] [-u UPLOAD] [-m MYCOMPANY] [-z] [-f FILE] [-a]

Policy utility script for SAP Data Intelligence. Pre-requiste: vctl.

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        Specifies yaml-config file
  -g, --generate        Generates config.yaml file
  -d DOWNLOAD, --download DOWNLOAD
                        Download specified policy. If wildcard '*' is used then policies are filtered or all downloaded.
  -u UPLOAD, --upload UPLOAD
                        Upload new policy (path). If path is directory all json-files uploaded. If path is a pattern like 'policies/mycompany.' all matching json-files are uploaded.
  -m MYCOMPANY, --mycompany MYCOMPANY
                        Replaces mycompany in policy name.
  -z, --zip             Zip policies
  -f FILE, --file FILE  File to analyse policy structure. If not given all policies are newly downloaded.
  -a, --analyse         Analyses the policy structure. Resource list is saved as 'resources.csv'.

The script needs a configuration file config.yaml that you can generate with

dipolicy -g

or create it manually:

URL : 'https://vsystem.ingress.xxxx.shoot.live.k8s-hana.ondemand.com'
TENANT: default
USER : user
PWD : 'userpwd123'
POLICIES_PATH : policies
POLICY_FILTER: mycompany   # regex match of policyID - use '.' for all. Used for analysis-option only


RESOURCE_CLASSES :
  connectionConfiguration: admin
  connection: admin
  connectionContent: data
  app.datahub-app-data.qualityDashboard: metadata
  app.datahub-app-core.connectionCredentials: metadata
  app.datahub-app-data.profile: metadata
  app.datahub-app-data.qualityRulebook: metadata
  app.datahub-app-data.system: metadata
  app.datahub-app-data.catalog: metadata
  app.datahub-app-data.glossary: metadata
  app.datahub-app-data.tagHierarchy: metadata
  app.datahub-app-data.qualityRule: metadata
  app.datahub-app-data.preparation: metadata
  app.datahub-app-data.publication: metadata
  application: application
  systemManagement: admin
  certificate: admin
  connectionCredential: admin

COLOR_MAP:
  admin: black
  metadata: green
  application: orange
  data: blue
  multiple: grey

The main use case is to download/upload policies and prepare them for further analysis.

Download Policies

When downloading policies, you can use a regular expression to specify which policies you want to download, e.g.

dipolicy -d mycompany.basic -z -m pears

This downloads all policies that start with “mycompany.basic” (option: -d mycompany.basic) and saves them to the “policies” folder of the working directory. They are also zipped (option: -z). Note that existing files are overwritten without warning. The option -m replaces “mycompany” with the argument value “pears”.
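The filter-and-rename behaviour described above can be sketched in a few lines of Python. This is only an illustration under assumptions, not the script's actual code: the function name select_and_rename and the policy ids are hypothetical, and I assume the pattern is matched from the start of the policy id.

```python
import re


def select_and_rename(policy_ids, pattern, new_name=None, placeholder="mycompany"):
    """Keep the ids that match `pattern` from the start and optionally
    replace the placeholder (assumed behaviour of -d and -m)."""
    selected = [pid for pid in policy_ids if re.match(pattern, pid)]
    if new_name is None:
        return selected
    return [pid.replace(placeholder, new_name) for pid in selected]


# Hypothetical policy ids, for illustration only
ids = [
    "mycompany.basic.metadata",
    "mycompany.role.developer",
    "othercompany.basic.metadata",
]
renamed = select_and_rename(ids, r"mycompany\.basic", new_name="pears")
print(renamed)  # ['pears.basic.metadata']
```

Note that an unescaped "." in a regular expression matches any character, so “mycompany.basic” would also match an id like “mycompanyXbasic”. Escaping the dot, as in the sketch, avoids this.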

Uploading Policies

The uploading case is similar:

dipolicy -u ./policies/mycompany.basic -m pears

It uploads all policies starting with “mycompany.basic” to SAP Data Intelligence (option: -u). With the option -m the uploaded policies are renamed, meaning every “mycompany” is replaced by “pears”. Because of policy dependencies, not all policies may be uploaded in the first run, so please check the console warnings. The reason is that a policy that inherits from a policy that does not yet exist is not added. In this case, simply run the command again.
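To see why a second run resolves the problem, here is a small simulation. It is a sketch under assumptions and does not call SAP Data Intelligence: the function upload_in_passes and the policy ids are hypothetical, and I assume that files are processed in alphabetical order and that a policy is accepted as soon as all policies it inherits from exist.

```python
def upload_in_passes(policies, existing=frozenset()):
    """Simulate repeated upload runs.

    `policies` maps a policy id to the set of policy ids it inherits from.
    Returns one list of accepted policy ids per run.
    """
    uploaded = set(existing)
    pending = dict(policies)
    passes = []
    while pending:
        accepted = []
        for pid in sorted(pending):  # assumed: alphabetical file order
            if pending[pid] <= uploaded:  # all parents already exist
                uploaded.add(pid)
                accepted.append(pid)
        if not accepted:
            # A parent is missing entirely; more runs would not help
            raise ValueError(f"unresolvable parents for: {sorted(pending)}")
        for pid in accepted:
            del pending[pid]
        passes.append(accepted)
    return passes


# Hypothetical policies: the role policy sorts before the policies it needs
policies = {
    "pears.basic.metadata": set(),
    "pears.catalog": {"pears.basic.metadata"},
    "pears.admin.role": {"pears.catalog"},
}
passes = upload_in_passes(policies)
for number, accepted in enumerate(passes, start=1):
    print(f"run {number}: {accepted}")
```

In this example the first run skips “pears.admin.role” because its parent does not exist yet, and the second run adds it. Each run accepts at least one policy as long as all parents are part of the upload set, so the number of required runs is bounded by the depth of the hierarchy.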

Analyses

Graphical visualisations work far better for me than tabular ones. Therefore I added a simple network visualisation that you can call with

dipolicy -a -f policies.json

If the option -f (--file) is not given, all policies are downloaded from SAP Data Intelligence and saved to “policies.json” in the policies path defined in config.yaml before the actual analysis starts. Note that the config parameter “POLICY_FILTER” lets you select the policies you want to analyse, e.g. only your “mycompany.” policies.

The analysis-option produces 2 outcomes:

  • A chart displaying the filtered policies with dependencies
  • A “resources.csv” file for further analysis with e.g. Excel

For the blueprint policies, the chart of the mycompany policies looks like the following:

Legend:

Node shapes

  • “Diamond”: Exposed nodes
  • “Circle”: Non-exposed nodes

Color coding for the policy classes:

  • admin: black
  • metadata: green
  • application: orange
  • data: blue
  • multiple: grey

Number label: Policy number you find in the “resources.csv” file.

The classification of both resources and policies is configurable in “config.yaml”. A threshold also applies: if all resources of a policy, except for a number up to the threshold (with at least 90% remaining), are of one type, this nearly unique resource type is assigned to the policy. Otherwise the policy is labeled “multiple”.
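One possible reading of this rule, reduced to the 90% share, is sketched below. It is an assumption-laden illustration and not the script's implementation: the function classify_policy and the parameter min_share are hypothetical, and the mapping is only an excerpt of the RESOURCE_CLASSES section of config.yaml.

```python
from collections import Counter

# Excerpt of RESOURCE_CLASSES from config.yaml
RESOURCE_CLASSES = {
    "application": "application",
    "connection": "admin",
    "connectionContent": "data",
    "app.datahub-app-data.catalog": "metadata",
}


def classify_policy(resource_types, min_share=0.9):
    """Return the dominant class of a policy or "multiple".

    `resource_types` lists the resource type of every resource in the policy.
    """
    counts = Counter(RESOURCE_CLASSES[rt] for rt in resource_types)
    if not counts:
        return "multiple"
    top_class, top_count = counts.most_common(1)[0]
    if top_count / len(resource_types) >= min_share:
        return top_class
    return "multiple"


# 19 of 20 resources are applications, so the policy counts as "application"
mostly_apps = ["application"] * 19 + ["connection"]
print(classify_policy(mostly_apps))

# An even mix of two classes is labeled "multiple"
print(classify_policy(["application", "connectionContent"]))
```

The returned class is then the key for the color lookup in COLOR_MAP, which is why “multiple” has its own color there.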

This chart shows at a glance whether the policies look the way you planned. For example, during my preparation I saw that I had a couple of nested policies that I wanted to avoid.

Conclusion

I am aware that this blueprint is only a rough starting point for a company's user security, but hopefully it gives you at least some kind of kickstart.

 

 


      4 Comments
      Yuliya Reich

      Thank you, Thorsten, for sharing your experience!

      It's perfect timing for me 🙂 I'm preparing the security concept for a customer right now. Glad to see, that we think the same way.

      Regards,

      Yuliya

      Michael Surgeon

      Thanks Thorsten, these blogs are very good and helpful…but I think I have a use case that doesn’t seem to be covered by your security model.  It might but I’m not seeing how.  I come from the ABAP side so this is all new to me😊

      Our use case: we are leveraging our SaaS Cloud DI tenants to only egress data from our new S4 ecosystem (FMS, MDG, Cfin, and even CAR) to non-SAP data warehouses for integration with our existing legacy/heritage operational and planning analytics applications (non-SAP applications).  We are egressing using the CDS views and the CDC functionality in S4.  We have 2020 Fashion and 1909 MDG as our main sources, 95% of the data.  We are not planning to use the ML functionality in DI.  We are not storing any data in DI.

      Our security problem statement: in DI, only the User ID that starts a Graph can manage it (stop, restart, maintenance, etc. the graph).  Due to this approach, we’ve been forced to run all our graphs via a “dummy” technical user ID to manage these graphs in a production scenario.  The use of a tech ID creates multiple problems:

      • Auditability – who is using it, when did they use it, why are they using it, and what did they do?
      • Managing the tech ID Password – limiting visibility, when to reset, how to reset, reset impacts to automation such as notification processing or integration to external orchestration tooling
      • Ownership – tooling to manage the tech ID, vaulting tools for the PW, controlled access to the Tech ID, etc.

      I don’t think this was envisioned as a use case when developed but DI works very well for S4 data egress to non-SAP environments.  Hence, I don’t think the security is in place on DI to help manage this use case.

      Your thoughts?  Any help would be appreciated.  Again, thanks for the great information, Mike-

      Thorsten Hapke
      Blog Post Author

      Many thanks, Mike.

      What you outlined is correct. We are currently working on a concept for impersonation that would allow users to run processes within the workspace and with the permissions of such a "technical" user. This should enable tracing that logs what each user has done as this "technical" user. We have received this request from customers in particular for REST APIs.

      For the time being we have to resort to this simple "technical" user, which hides who has actually done what. The only workaround I can propose is to have more than one technical user, but this comes with its own complications. I will add your requirement to the existing backlog item.

      James Giffin

      Adding on to the technical user discussion - I recently found out that you can also create user certificates with vctl. (https://help.sap.com/viewer/41b069490705457e9426b112a3f052bd/Cloud/en-US/38f6d81551c44f5da0f10bd0249d67f1.html#loioa57a8a82fa3e4460bea050b712c0be02).

      You could use a certificate for your technical user to prevent the management and sharing of passwords.