Introduction

ThorstenHa · 09-23-2021

Although setting up security and implementing policies can feel like a bookkeeper's or caretaker's job, it is necessary, even vital, for any application. If you recall the role the caretaker played at your school, you have to admit that she was the most important person in the organisation, even more important than the head of school. I have learnt that analysing the policy setup can be a lot of fun, and that defining strategies takes some thinking and creativity. So let's first immerse ourselves in the fundamentals of SAP Data Intelligence Policy Management and User Security before moving on to analysing the setup. At the end you will know how to implement user security and have some ideas for in-depth analysis. Finally I will give you examples of how to set up a reasonable policy structure.

Basics

User Security

User security is based on the assigned policies. A user can have many policies, as you see in the screenshot. There are no user roles or groups for setting the security. Everything is handled via the policies, which give you all the flexibility you might need.

Policy Management

The most fundamental items in the Policy Management are the resources as they are named in SAP Data Intelligence.

A resource consists of a resource type and its attributes, which depend on the resource type, e.g.

application with activity and name of application

connection with activity and identifier of connection

Currently we have 19 resource types and most probably this number will expand with each release.

These resources are grouped under a policy ID; in other words, a policy is defined by a number of resources. Some policies are very basic and contain only one resource, e.g.

Policy ID: sap.dh.modelerUI.start
- Type:application
- Activity:start
- Name:modeler-ui

and some policies are more complex like Policy ID: sap.dh.member with 26 resources.

For building a more elaborate policy structure, policies can inherit resources from other policies. There are even policies that have only inherited resources, like 'sap.dh.admin'.

You have to tag a policy as "exposed" to assign it directly to a user. Non-exposed policies need to be nested in another policy.
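The relationships described so far can be sketched as a small data model. This is only an illustration: the class and field names (`Resource`, `Policy`, `exposed`, `inherits`) are chosen for this sketch and are not the format used by SAP Data Intelligence or vctl. The content of `sap.dh.modelerUI.start` is taken from the example above; the wrapping policy `mycompany.modeler` and the values of the `exposed` flags are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resource:
    """One resource: a resource type plus its type-dependent attributes."""
    resource_type: str
    activity: str
    name: str = ""


@dataclass
class Policy:
    """A policy groups resources and may inherit from other policies."""
    policy_id: str
    exposed: bool = False  # assumed default for this sketch
    resources: list = field(default_factory=list)
    inherits: list = field(default_factory=list)

    def assignable(self) -> bool:
        # Only exposed policies can be assigned directly to a user.
        return self.exposed


modeler_start = Policy(
    policy_id="sap.dh.modelerUI.start",
    resources=[Resource("application", "start", "modeler-ui")],
)

# Hypothetical policy that owns no resources and only nests another one.
my_modeler = Policy(
    policy_id="mycompany.modeler",
    exposed=True,
    inherits=[modeler_start.policy_id],
)
```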

It is that simple. Now you have to learn what kinds of resources you have at hand to impose your ideas of user security on SAP Data Intelligence, define your own policies in addition to the pre-defined ones, and finally put these into a policy hierarchy that helps you keep it under control.

Admittedly the resulting policy framework can become rather complex, and the list structure of the System Management UI does not help much when you are searching for loopholes or redundancies. There are considerations about which standard tools should be added, but this is still in its early stages because security has recently gained a lot of additional features.

Fortunately SAP Data Intelligence is quite open with regard to APIs and provides tools for exporting system information. For system management, vctl (the System Management Command-Line Client) is our silver bullet.

Command-line Access to Policy Data

After its humble beginnings vctl is now a terrific tool that helps system administrators automate their daily tasks by integrating them into shell scripts. It is definitely worthwhile to have a look at it after each new release. For a short introduction I recommend reading Community blogs like the one by Gianluca de Lorenzo: Zen and the Art of SAP Data Intelligence. Episode 3: vctl, the hidden pearl you must know.

To read the policy details we need, in addition to the vctl login call:

Get list of policies:

```
vctl policy list-policies --format json
```

Get details for each policy:

vctl policy get <policy Id> --format json
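As a rough idea of how these two calls can be driven from Python, here is a minimal sketch. It is not the code of the dipolicy script. It assumes that vctl is on the PATH and that the login has already been done. The function names, the injectable `runner` parameter and above all the `id_key` field are assumptions: the shape of the JSON returned by vctl is not shown in this blog, so check it against the real output.

```python
import json
import subprocess


def run_vctl(args, runner=subprocess.run):
    """Run a vctl command with JSON output and return the parsed result."""
    cmd = ["vctl", *args, "--format", "json"]
    result = runner(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)


def list_policies(runner=subprocess.run):
    # Corresponds to: vctl policy list-policies --format json
    return run_vctl(["policy", "list-policies"], runner)


def get_policy(policy_id, runner=subprocess.run):
    # Corresponds to: vctl policy get <policy Id> --format json
    return run_vctl(["policy", "get", policy_id], runner)


def download_all(runner=subprocess.run, id_key="id"):
    # ASSUMPTION: list-policies returns a JSON array of objects whose
    # identifier is stored under `id_key`; adjust this to the real output.
    return {
        entry[id_key]: get_policy(entry[id_key], runner)
        for entry in list_policies(runner)
    }
```

Passing the runner as a parameter keeps the functions testable without a live system: a stub that returns canned JSON can stand in for `subprocess.run`.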

I have wrapped these into a Python script that gives me all the policies with their resources and, for each policy, the list of policies from which additional resources are inherited. Unfortunately you do not get a 'flat' list of all resources of each policy including the inherited ones. These you have to compute in a second step.

The script "dipolicy" described at the end of the blog lets you do it.
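The second step, computing the flat resource list, boils down to a recursive walk over the inheritance links. The following is a minimal sketch of that idea, not the dipolicy code. The dictionary layout with the keys `resources` and `inherits` is an assumption, and `mycompany.modeler` with its connection resource is a made-up example policy.

```python
def flatten_resources(policies, policy_id, _seen=None):
    """Return the own plus all inherited resources of a policy as a set."""
    seen = set() if _seen is None else _seen
    if policy_id in seen:  # guard against accidental inheritance cycles
        return set()
    seen.add(policy_id)
    policy = policies[policy_id]
    flat = set(policy.get("resources", []))
    for parent in policy.get("inherits", []):
        flat |= flatten_resources(policies, parent, seen)
    return flat


# Resources are written as (resource type, activity, name) tuples.
policies = {
    "sap.dh.modelerUI.start": {
        "resources": [("application", "start", "modeler-ui")],
    },
    "mycompany.modeler": {  # hypothetical policy
        "resources": [("connection", "read", "HANA_DEMO")],
        "inherits": ["sap.dh.modelerUI.start"],
    },
}

flat = flatten_resources(policies, "mycompany.modeler")
```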

Analysing Policy Dependency

The first question is how the policies depend on each other. A lot of standard policies neither inherit from nor pass resources to other policies. In our demo system, which has only a couple of additional policies, 13 policies out of 181 have no dependency, e.g.

Policy ID	Resource Type	Activity
Rules_Dashboard	connectionContent	manage
sap.dh.connectionCredentialsUnmasked	app.datahub-app-core.connectionCredentials	showUnmasked
sap.dh.datahubAppAuditlog.start	application	start
sap.dh.datahubAppDex.start	application	start
sap.dh.datahubAppPolicy.start	application	start
sap.dh.datahubAppPreparation.start	application	start
sap.dh.licenseManager.start	application	start
sap.dh.metadataStart	application	start
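Finding such stand-alone policies is a simple set operation once the inheritance links are available. The sketch below assumes a dictionary that maps each policy ID to the list of policies it inherits from. The function name and the dictionary layout are chosen for this illustration, and the sample links are a reduced example, not the real system content.

```python
def independent_policies(inherits):
    """Policies that neither inherit from nor are inherited by others."""
    used_as_parent = {p for parents in inherits.values() for p in parents}
    return sorted(
        pid
        for pid, parents in inherits.items()
        if not parents and pid not in used_as_parent
    )


# Reduced sample: policy ID -> policies it inherits from.
inherits = {
    "mycompany.basic.member": ["sap.dh.shared.start", "sap.dh.systemAccess"],
    "sap.dh.shared.start": [],
    "sap.dh.systemAccess": [],
    "sap.dh.licenseManager.start": [],
    "Rules_Dashboard": [],
}

standalone = independent_policies(inherits)
```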

Most of the policies are building blocks for other policies. To visualise these I have adjusted the plotting functions of networkx, which uses matplotlib. The following picture shows only policies that depend on one another.

The numbers of the nodes refer to the policies in the following table, which the dipolicy script saves separately as 'chart_policy_ids.csv'.

Num ID	PolicyId	Num ID	PolicyId
0	root	34	sap.dh.datahubAppLogging.start
5	app.datahub-app-data.administrator	37	sap.dh.datahubAppScheduler.start
6	app.datahub-app-data.businessUser	38	sap.dh.datahubAppSystemManagement.start
7	app.datahub-app-data.dataSteward	39	sap.dh.datahubAppTask.start
8	app.datahub-app-data.metadataUser	40	sap.dh.developer
9	app.datahub-app-data.preparationUser	41	sap.dh.dspGitServer.start
10	app.datahub-app-data.publisher	42	sap.dh.jupyter.start
11	connection_S4_Test	44	sap.dh.member
12	monitor	45	sap.dh.metadata
13	sap.dh.admin	47	sap.dh.metricsExplorer.start
14	sap.dh.applicationAllStart	48	sap.dh.mlApi.start
15	sap.dh.automl.start	49	sap.dh.mlApiImporter.start
16	sap.dh.axinoService.start	50	sap.dh.mlDeploymentApi.start
17	sap.dh.certificate.manage	51	sap.dh.mlDmApi.start
18	sap.dh.connectionContentAllManage	52	sap.dh.mlDmApp.start
19	sap.dh.connectionContentOwnerManage	53	sap.dh.mlScenarioManager.start
21	sap.dh.connectionMgtStart	54	sap.dh.mlTracking.start
22	sap.dh.connectionsAllRead	55	sap.dh.modelerStart
23	sap.dh.connectionsAllWrite	56	sap.dh.modelerUI.start
24	sap.dh.connectionsOwnerRead	57	sap.dh.resourceplanService.start
25	sap.dh.connectionsOwnerWrite	58	sap.dh.shared.start
26	sap.dh.dataHubFlowAgent.start	59	sap.dh.stopAppsForOtherUsers
28	sap.dh.datahubAppCore.start	60	sap.dh.systemAccess
29	sap.dh.datahubAppDaemon.start	61	sap.dh.systemMgtWrite
30	sap.dh.datahubAppData.start	62	sap.dh.trainingService.start
31	sap.dh.datahubAppDatabase.start	63	sap.dh.voraTools.start
33	sap.dh.datahubAppLaunchpad.start	64	sap.dh.voraadapter.start

All policies that have no connection to the root node are the "non-exposed" policies that can only be used as part of other policies. There is no "root" policy defined in the SAP Data Intelligence Policy Management, but it helps when visualising the graph.

At first glance you see only 3 levels for the pre-defined policies. SAP Data Intelligence used productively will have at least 1 or 2 levels more when assigning these policies to user groups.
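The number of levels can also be computed instead of being read off the chart. The following sketch adds the artificial "root" node in front of the exposed policies and measures the longest inheritance chain. It assumes the links contain no cycles. The function names are hypothetical, and the sample uses the custom policies introduced later in this blog, with the link from "mycompany.manage.glossary" to "mycompany.basic.metadata" being an assumption.

```python
def add_root(inherits, exposed):
    """Return a copy of the graph with a virtual root above exposed policies."""
    graph = dict(inherits)
    graph["root"] = list(exposed)
    return graph


def depth(graph, node):
    """Length of the longest chain starting at node (assumes no cycles)."""
    children = graph.get(node, [])
    if not children:
        return 1
    return 1 + max(depth(graph, child) for child in children)


# Sample: policy ID -> policies it inherits from.
inherits = {
    "mycompany.admin.catalog": [
        "mycompany.manage.catalog",
        "mycompany.manage.glossary",
    ],
    "mycompany.manage.catalog": ["mycompany.basic.metadata"],
    "mycompany.manage.glossary": ["mycompany.basic.metadata"],  # assumed link
    "mycompany.basic.metadata": ["mycompany.basic.member"],
    "mycompany.basic.member": ["sap.dh.systemAccess"],
}

graph = add_root(inherits, exposed=["mycompany.admin.catalog"])
levels = depth(graph, "root") - 1  # do not count the virtual root itself
```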

Before we come to the best strategy for building a policy structure, I would like to introduce an additional classification that I found most helpful and that explains the colour coding of the graph.

Additional Resource and Policy Classification

With SAP Data Intelligence 2107 we have 18 resource types on the one hand, and on the other hand basic user roles and usage constraints such as administrator, business user, applications and data. Mapping the rather specific resource types to a few basic classes helps to get a better overview of the policies. My proposal is the following:

Resource Type	Class
connectionConfiguration	admin
connection	data
systemManagement	admin
certificate	admin
connectionCredential	admin
connectionContent	data
application	application
app.datahub-app-data.*	metadata

For a policy classification I use the resource classification. If all contained resources are of the same class then the policy adopts the class. If a policy has resources from more than one class it gets the classification: multiple.
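Expressed as code, the two rules look roughly like this. The mapping is the table above; the function names and the fallback labels "unclassified" (for resource types not in the table) and "none" (for policies without own resources) are assumptions added for this sketch.

```python
from fnmatch import fnmatchcase

# Resource type pattern -> class, as proposed in the table above.
RESOURCE_CLASSES = [
    ("connectionConfiguration", "admin"),
    ("connection", "data"),
    ("systemManagement", "admin"),
    ("certificate", "admin"),
    ("connectionCredential", "admin"),
    ("connectionContent", "data"),
    ("application", "application"),
    ("app.datahub-app-data.*", "metadata"),
]


def resource_class(resource_type):
    """Map a resource type to its class; patterns may contain wildcards."""
    for pattern, cls in RESOURCE_CLASSES:
        if fnmatchcase(resource_type, pattern):
            return cls
    return "unclassified"  # assumed label, not part of the proposal


def policy_class(resource_types):
    """A policy adopts the class of its resources, or 'multiple' if mixed."""
    classes = {resource_class(t) for t in resource_types}
    if not classes:
        return "none"  # assumed label for policies without own resources
    return classes.pop() if len(classes) == 1 else "multiple"
```

For example, `policy_class(["application", "connection"])` yields "multiple", while a policy that contains only `app.datahub-app-data.*` resources is classified as "metadata".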

This explains the colour coding of the graph, and you see that the policies are quite uniform under the chosen classification. Mostly it is the policies that define a role that have multiple resource types, which absolutely makes sense. This shows that the predefined policies have been carefully designed.

Best Practice of Policy Management

After analysing the pre-defined policies we can already derive a kind of best practice. Before doing that, let me emphasise that having no user group concept in place is not a shortfall but has its reasons: it gives more transparency. The SAP Data Intelligence Policy Management is the single source for defining the authorisations for its applications and managed data. There you can define roles and groups, check them for consistency and reasonability, and finally assign them to users. With a separate user-group hierarchy you would have two places where unwanted results could creep in.

With the policy management you can fulfil 3 targets by assigning

Application Security
- only eligible applications to user/groups
- authorisations to user/groups for reading and changing metadata of data

Data Security
- authorisation of reading and changing data

Admin Security
- authorisation for system administration

I will not cover the last one because there is a pre-defined admin policy that is mostly used unchanged. If adjustments are needed, they are quite straightforward and need no complex strategy.

Application Security

There are predefined policies that are intended to define a role, like sap.dh.member or sap.dh.developer. Admittedly this is a very rough classification; it is meant to let customers assign roles/policies quickly without too many exchanges with the system administrator about missing authorisations. Once SAP Data Intelligence is used productively with more diverse users, you should spend some time on working out which roles your company has and how to define the policies accordingly.

My proposed strategy is to use the nesting (inheritance) feature of policy management extensively. This means first starting with a policy that every user needs. It corresponds to the predefined role sap.dh.member but with far fewer policies. The minimum list of policies that lets a user start the launchpad of SAP Data Intelligence without any additional application is:

sap.dh.shared.start

sap.dh.systemAccess

sap.dh.datahubAppLaunchpad.start

Additionally you need some more applications to ensure that the basic processes work:

sap.dh.datahubAppScheduler.start

sap.dh.datahubAppDaemon.start

sap.dh.datahubAppDatabase.start

sap.dh.resourceplanService.start

sap.dh.dataHubFlowAgent.start

sap.dh.datahubAppCore.start

sap.dh.datahubAppTask.start

sap.dh.axinoService.start

sap.dh.datahubAppData.start

These I group into my most basic policy "mycompany.basic.member". I add no resources of its own to the nested ones. Anticipating what I will outline later in more detail: you could embed this basic policy into every subsequent, next-level policy, add it only to policies corresponding to a user group, or assign it to each user directly.

As a next step I would like to create a user who has only the right to start the metadata application. I am not adding the predefined "app.datahub-app-data.metadataUser" because it encompasses the policy "app.datahub-app-data.dependencies", which has resources for starting applications like the Connection Management that I want to avoid on this level. Therefore I create my next policy "mycompany.basic.metadata" with the previously created policy "mycompany.basic.member" nested in it. The resources I add have only the basic activity attribute "read".

resourceType	activity
app.datahub-app-data.catalog	read
app.datahub-app-data.glossary	read
app.datahub-app-data.profile	read
app.datahub-app-data.tagHierarchy	read

With this we enter the next level of security.

Metadata Security

It might seem a bit over-sophisticated to distinguish between policies for starting applications with basic read authorisation and policies for modifying metadata, but for me it helps to sort this out in a first approach.

As a next step I would like to define 3 additional roles with more responsibilities:

Catalog manager who can publish and annotate datasets and manage tag hierarchies

Glossary manager who can define new glossaries

Catalog admin who has the rights of the 2 previous roles

My recommendation is that for each role a user is created in order to test if the imposed security level actually fits the intended one.

The policy "mycompany.manage.catalog" inherits resources from the previously defined policy "mycompany.basic.metadata", and I add the following resources:

"mycompany.manage.catalog"

ResourceType	Activity
app.datahub-app-data.catalog	annotate
app.datahub-app-data.catalog	manage
app.datahub-app-data.publication	manage
app.datahub-app-data.publication	execute
app.datahub-app-data.tagHierarchy	manage
app.datahub-app-data.tagHierarchy	admin

For the glossary I define a similar policy

"mycompany.manage.glossary"

ResourceType	Activity
app.datahub-app-data.glossary	manage
app.datahub-app-data.glossary	admin

If you would like to have groups that correspond to a policy, you can either choose a policy that you have already created or combine several policies into one policy without any further resources. E.g. if you would like a group whose members are allowed to manage the metadata of both the catalog and the glossary, you can create an additional policy

"mycompany.admin.catalog"

that inherits its resources from mycompany.manage.catalog and mycompany.manage.glossary.

Strategies for Setting up Application and Metadata Policies

Now that we have seen how to set up policies, we have to think about the general strategy for structuring them.

I found 3 different strategies that might help you to define your own:

User Tailored

Flat Group

Hierarchy Group

In the "user-tailored" strategy you define basic policies and assign them to users.

This is a very flexible approach because, if necessary, you can define ad-hoc policies without any major impact on the whole policy framework. What you finally get is rather unstructured and therefore mostly useful for smaller user groups. Furthermore you need to analyse both policy management and user assignments to get the full picture of your security settings.
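Getting this full picture means joining the user assignments with the policy definitions. A minimal sketch of such a join is shown below, assuming that both have been exported into plain dictionaries. The layout, the function name and the user "test_user" are hypothetical; the resources are taken from the metadata policies described above and written as (resource type, activity) pairs.

```python
def user_resources(assignments, policies, user):
    """Collect all resources a user gets via directly assigned policies."""
    pending = list(assignments.get(user, []))
    visited, resources = set(), set()
    while pending:
        pid = pending.pop()
        if pid in visited:  # each policy is evaluated only once
            continue
        visited.add(pid)
        policy = policies[pid]
        resources.update(policy.get("resources", []))
        pending.extend(policy.get("inherits", []))
    return resources


policies = {
    "mycompany.basic.metadata": {
        "resources": [
            ("app.datahub-app-data.catalog", "read"),
            ("app.datahub-app-data.glossary", "read"),
        ],
    },
    "mycompany.manage.glossary": {
        "resources": [("app.datahub-app-data.glossary", "manage")],
        "inherits": ["mycompany.basic.metadata"],
    },
}
assignments = {"test_user": ["mycompany.manage.glossary"]}  # hypothetical user

effective = user_resources(assignments, policies, "test_user")
```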

To overcome the latter challenge you can collect the rather flat policies into a "role"-policy that you then assign to users.

This gives you a single source of truth for your policy and user assignments and helps you to design very specific user profiles or user groups while retaining the aforementioned flexibility. The downside is the lack of an intuitive structure. Therefore this strategy is most appropriate for a homogeneous and limited number of user profiles.

For a more structured approach you can impose a hierarchy on the policies by extensively using the inheritance feature.

This is a rather intuitive approach where you have all the definitions in one place. The downside is that there might be redundant resources in the final policies. These do no harm in themselves, but they could obscure unwanted assignments when inheriting across multiple levels.
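Such redundancies can be detected by recording, for every resource of a final policy, which policies in its inheritance chain define it. The following sketch does this under the assumption that the policies are available as dictionaries with the keys `resources` and `inherits`. The function names and the policy "mycompany.team.catalog" are made up for the illustration.

```python
from collections import defaultdict


def resource_origins(policies, policy_id):
    """Map each resource to the set of policies in the chain defining it."""
    origins = defaultdict(set)
    pending, visited = [policy_id], set()
    while pending:
        pid = pending.pop()
        if pid in visited:
            continue
        visited.add(pid)
        for resource in policies[pid].get("resources", []):
            origins[resource].add(pid)
        pending.extend(policies[pid].get("inherits", []))
    return origins


def redundant_resources(policies, policy_id):
    """Resources that are defined by more than one policy in the chain."""
    return {
        res: sorted(pids)
        for res, pids in resource_origins(policies, policy_id).items()
        if len(pids) > 1
    }


policies = {
    "mycompany.basic.metadata": {
        "resources": [("app.datahub-app-data.catalog", "read")],
    },
    "mycompany.manage.catalog": {
        "resources": [("app.datahub-app-data.catalog", "manage")],
        "inherits": ["mycompany.basic.metadata"],
    },
    "mycompany.team.catalog": {  # hypothetical policy repeating a resource
        "resources": [("app.datahub-app-data.catalog", "read")],
        "inherits": ["mycompany.manage.catalog"],
    },
}

duplicates = redundant_resources(policies, "mycompany.team.catalog")
```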

My preferred approach is to have a combination of the "flat-group" and the "hierarchy-group" approach, e.g. I would assign the "basic.member"-policy directly to the final policy and keep the "metadata"-hierarchy. This would result in limited redundancy of a specific part.

For testing purposes I always start by assigning the policies directly to a test user and check that the security works as intended before creating a new policy that combines all these policies. By the way, I keep this test user as a template so that I am always able to check again and adjust the security of a user/policy group.

Although I advocate reflecting user groups in the policies, I propose having separate policies for data and assigning the "application" policies and "data" policies separately to users. This is an additional step where you can check whether the right users really get the right authorisations. Leaks in data security could cause much more harm than leaks in application security.

Data Policy Management

Introduction

Before we go into the details of the data-related policies, some preliminary words about the authorisation principle regarding data sources. SAP Data Intelligence uses the authorisations of the data source that come with the credentials provided to establish a connection. E.g. the HANA database user whose credentials are used for the connection in the Connection Management defines the authorisations. Only the "read" and "write" activity attributes in the resource definition for the resource type "connection" add a restriction. The additional activity attributes "ownerRead" and "ownerWrite" refer to the connections that the user has created herself. These attributes make sense only when wildcarding the name of the connection.

There are plans to integrate SAP Data Intelligence more tightly with the Identity Management service. On the roadmap for release 2113, a mapping of SAML authorisations to policies is planned. As documented in a backlog item, we would like to propagate the security further to the data sources. This means much more effort because the security API of each supported data source needs to be addressed, and due to missing standardisation this means a lot of development work.

To grant a user access to data sources you need to provide a "connection" policy where you either specify the connection ID of the data source or, as a carte blanche, give access to all connections with the wildcard '*'. Of course the latter should be avoided.

There is another resource type, "connectionContent". The name might be a bit misleading. It does not refer to the data as content but to metadata management, e.g. creating folders or uploading data in the Metadata Explorer. This policy is used only by the Metadata Explorer and not by the Modeler. In particular this policy is important when using data preparation and rulebooks.
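A quick audit for such carte-blanche grants could look like the sketch below. It assumes the policies are available as a dictionary of (resource type, activity, connection name) tuples. It deliberately skips the owner-related activities, because for those the wildcard is the normal case. The function name and the policy "mycompany.all.connections" are hypothetical.

```python
def wildcard_connection_grants(policies):
    """List non-owner grants that apply to all connections ('*')."""
    findings = []
    for pid, resources in policies.items():
        for rtype, activity, name in resources:
            if rtype not in ("connection", "connectionContent"):
                continue
            if name == "*" and not activity.startswith("owner"):
                findings.append((pid, rtype, activity))
    return sorted(findings)


policies = {
    "sap.dh.connectionsOwnerRead": [("connection", "ownerRead", "*")],
    "mycompany.info.connections": [("connection", "read", "HANA_QM")],
    "mycompany.all.connections": [("connection", "read", "*")],  # hypothetical
}

findings = wildcard_connection_grants(policies)
```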

Data Policy Examples

The structure of data policies is less complex than that of the application and metadata policies, and for good reason, because the impact on security could be much more severe. You will most probably have different business areas whose data should be accessible to different groups of employees with different responsibilities. For the following let's assume you have these data security levels with attached data sources:

Users Own Data Sources
Users who can add "connections" to the Connection Management can access the data and the metadata without any restrictions

Basic Shared Data Sources
Read and write access and metadata management for data sources of non-confidential data

Info Data Sources
Read access only

Manage Info Data Sources
Write and manage authorisation of Info data sources

These examples should provide the patterns that enable you to extend the data policies to the needs of your company.

Security Level	Policy ID	Resources
User's Own	*.basic.connections	sap.dh.connectionContentOwnerManage - connectionContent - ownerManage - *
		sap.dh.connectionsOwnerRead - connection - ownerRead - *
		sap.dh.connectionsOwnerWrite - connection - ownerWrite - *
Basic Shared	*.shared.connections	connection - read - HANA_DEMO
		connection - write - DI_DATA_LAKE
		connection - write - HANA_DEMO
		connection - read - DI_DATA_LAKE
		connectionContent - manage - DI_DATA_LAKE
		connectionContent - manage - HANA_DEMO
Info	*.info.connections	connection - read - HANA_QM
Manage Info	*.info_manage.connections	connection - write - HANA_QM
		connectionContent - manage - HANA_QM
		inherited: mycompany.info.connections - connection - read - HANA_QM
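For a review of the data security it helps to turn this view around and ask which policies open which connection. The following sketch builds such a report from the two "Info" policies of the table. The dictionary layout and the function name are assumptions, and the simple recursion presumes that the inheritance links contain no cycles.

```python
from collections import defaultdict


def access_by_connection(policies):
    """Report per connection which policy grants which activities."""

    def flat(pid):
        # Own resources followed by everything inherited from parents.
        found = list(policies[pid].get("resources", []))
        for parent in policies[pid].get("inherits", []):
            found.extend(flat(parent))
        return found

    report = defaultdict(dict)
    for pid in policies:
        for rtype, activity, connection in flat(pid):
            report[connection].setdefault(pid, set()).add(f"{rtype}:{activity}")
    return report


policies = {
    "mycompany.info.connections": {
        "resources": [("connection", "read", "HANA_QM")],
    },
    "mycompany.info_manage.connections": {
        "resources": [
            ("connection", "write", "HANA_QM"),
            ("connectionContent", "manage", "HANA_QM"),
        ],
        "inherits": ["mycompany.info.connections"],
    },
}

report = access_by_connection(policies)
```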

For each data policy resource you need to specify a connection. The exception is the wildcard '*', which refers to all connections and should of course be used very sparingly.

The inheritance feature is of limited use for data policies. For the same group of data you only have the three levels of read, write and manage that could be used as a vertical order.

Conclusion

I hope this helps you when designing user security on your SAP Data Intelligence system. Please check with each release what is new in policy management. You can expect new features that support system administrators in their daily business, as well as a tighter integration with identity provider systems.

For this blog I developed a script that enables you to download and to upload policies, and does the analysis shown in this blog. You can install the package by

pip install diadmin

and run it with "dipolicy":

dipolicy --help

usage: dipolicy [-h] [-c CONFIG] [-g] [-d DOWNLOAD] [-u UPLOAD] [-f FILE] [-a]

Policy utility script for SAP Data Intelligence. Pre-requiste: vctl.

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        Specifies yaml-config file
  -g, --generate        Generates config.yaml file
  -d DOWNLOAD, --download DOWNLOAD
                        Download specified policy. If 'all' then all policies are download
  -u UPLOAD, --upload UPLOAD
                        Upload new policy.
  -f FILE, --file FILE  File to analyse policy structure. If not given all policies are newly downloaded.
  -a, --analyse         Analyses the policy structure. Resource list is saved as 'resources.csv'.

You can download the example policies from my personal GitHub policies folder.

So have fun and don't be too strict with your users.

There is now a follow-up blog that outlines a blueprint policy structure.

