
Getting Started with SAP HANA Cloud V | Security, Data Masking, and Anonymization

This blog series provides the latest information on getting started with SAP HANA Cloud on the SAP Cloud Platform.

  1. About SAP HANA Cloud
  2. SAP HANA Cloud Getting Started
  3. Connecting SAP Analytics Cloud using the HANA Analytics Adapter (HAA)
  4. Cloud Foundry Advanced (space travel, multiple instances, defining schema names)
  5. Data masking and data anonymization
  6. Predictive Analysis Library (PAL) and Automated Predictive Library (APL)
  7. Remote data sources and virtual tables
  8. SAP Web IDE for HANA Development, SAP Cloud Platform Cloud Foundry environment
  9. SAP HANA Cloud and Smart Data Integration
  10. OData with SAP HANA Cloud
  11. HDI with SAP HANA Cloud

For the latest features blog (Oct 2020), see

For more information about the free trial, see

Questions? Post as comment.

Useful? Give a like and share on social media. Thanks!


Hands-On Video Tutorials

Chief partner engineer Philip MUGGLESTONE just updated his SAP HANA Cloud playlist on the SAP HANA Academy YouTube channel with four new videos covering two security topics: data masking and anonymization.

In this blog, you will find the videos embedded with some additional information and resources.

You can watch the four video tutorials in about half an hour. In return, you will learn:

  • How to work with data masking using SQL and calculation views
  • How to work with anonymization: k-anonymity, l-diversity, and differential privacy

We created the SAP HANA Cloud instance and explained how to work with HDI containers and the SAP Business Application Studio in the previous blogs, so make sure to read these first.

To bookmark the playlist on YouTube, go to > SAP HANA Cloud

 


About Data Masking

SQL provides object-level access control: with the SELECT privilege on a table, you can view all of its data. Using views and analytic privileges, we can obtain more fine-grained access control and provide access to only certain columns or certain rows.

Data masking provides an additional layer of access control. It does not change the data physically (e.g. through encryption) but changes how the data appears: unprivileged users see it only partially or in obfuscated form, and no "not authorized" error message is returned.
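To make the idea concrete, here is a minimal Python sketch of what masking does conceptually. It is not how SAP HANA implements masking: the function names `mask_card` and `read_column`, the sample card numbers, and the `unmasked` flag (standing in for a user who has been granted UNMASKED) are all assumptions made for illustration.

```python
def mask_card(value: str, visible: int = 4) -> str:
    """Replace every digit except the last `visible` ones with 'X'; keep separators."""
    digits_total = sum(ch.isdigit() for ch in value)
    digits_seen = 0
    out = []
    for ch in value:
        if ch.isdigit():
            digits_seen += 1
            # Only the trailing `visible` digits stay readable.
            out.append(ch if digits_seen > digits_total - visible else "X")
        else:
            out.append(ch)
    return "".join(out)


def read_column(rows, unmasked: bool):
    """Return stored values for privileged readers, masked values otherwise.

    The stored rows are never modified and no error is raised.
    """
    return list(rows) if unmasked else [mask_card(r) for r in rows]


# Hypothetical sample data
stored = ["1234-5678-9012-3456", "1111-2222-3333-4444"]
print(read_column(stored, unmasked=False))  # what an unprivileged user sees
print(read_column(stored, unmasked=True))   # what a privileged user sees
```

Note that both readers get a result set of the same shape. Only the appearance of the values differs, which is the point of masking as opposed to denying access.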

For an introduction to the topic, read

For the documentation with code samples, go to


Data Masking

In this tutorial, we learn how to configure data masking.

For the code snippets, go to the SAP HANA Academy repository on GitHub

[0:00] – Introduction, documentation, and code repository

[1:55] – Copy/paste code and execute as DEVUSER

[3:10] – view result as ENDUSER

[3:30] – ALTER TABLE to add formula

[4:00] – Using functions

[5:00] – Querying EFFECTIVE_MASK_EXPRESSIONS

[6:00] – GRANT UNMASKED

[6:45] – Data masking using calculation views

[7:00] – SAP Web IDE – new project > Cloud Foundry: SAP HANA Database application

[7:45] – New database artifact using code from snippet and build

[8:00] – Open HDI container in Database Explorer and insert new rows

[8:35] – Create new calculation view and add data masking expression

[10:00] – Query data shows masked values


About Data Anonymization

Data anonymization enables data analysis while protecting the privacy of individuals. This concerns data displayed by any client tool, whether through calculation views or using SQL directly.

Data anonymization was introduced with SAP HANA 2.0 SPS 03 in 2018, updated for SPS 04, and is now also available for SAP HANA Cloud.

For an introduction to the topic, read

There is also an information area with a solution brief, info sheet, and use cases

For the documentation, go to

Currently, three anonymization methods are supported

  • k-anonymity
  • l-diversity
  • Differential Privacy

All three methods are demonstrated and explained below.


17. Anonymization – k-anonymity

In the first video tutorial about anonymization, the concepts are briefly covered and a sample environment is created to set the stage for a view with anonymization parameter k-anonymity.

For the code snippets, go to

[0:00] – Introduction, documentation, and code repository

[2:00] – Create sample schema with demo data

[3:40] – Create hierarchy view and function

[4:25] – Code walkthrough: create view with anonymization parameter k-anonymity

[6:35] – Refresh view
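As a rough illustration of the concept behind the video, the following Python sketch measures k for a tiny data set before and after generalizing one quasi-identifier. It is a conceptual model only, not the algorithm SAP HANA runs: the helper names `generalize_age` and `k_of`, the column names, and the sample rows are assumptions, and a single fixed age range stands in for the hierarchy.

```python
from collections import Counter


def generalize_age(age: int, width: int = 10) -> str:
    """Map an exact age to a range such as '30-39' (one hierarchy level)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def k_of(rows, quasi_identifiers):
    """Return the size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())


# Hypothetical sample data
employees = [
    {"site": "Paris", "gender": "f", "age": 31},
    {"site": "Paris", "gender": "f", "age": 38},
    {"site": "Paris", "gender": "m", "age": 42},
    {"site": "Paris", "gender": "m", "age": 47},
]

quasi = ["site", "gender", "age"]
raw_k = k_of(employees, quasi)

# Replace each exact age with its range, then measure k again.
generalized = [{**e, "age": generalize_age(e["age"])} for e in employees]
anon_k = k_of(generalized, quasi)

print(raw_k, anon_k)  # prints "1 2"
```

With exact ages every row is unique (k = 1). After generalization each row shares its quasi-identifier values with at least one other row (k = 2), at the cost of less precise data.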


18. Anonymization – l-diversity

In the second video tutorial, we create a view with anonymization parameter l-diversity.

As documented,

l-diversity can be applied in addition to k-anonymity if there is a risk that too much homogeneity in a sensitive attribute’s values, in combination with other quasi-identifying attributes, might lead to loss of privacy.
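The homogeneity risk described in this quote can be shown with a small Python sketch. It counts distinct sensitive values per group, which is the simplest reading of l-diversity; whether this matches the exact definition SAP HANA applies is an assumption, and the helper name `l_of`, the columns, and the rows are hypothetical.

```python
from collections import defaultdict


def l_of(rows, quasi_identifiers, sensitive):
    """Return the smallest number of distinct sensitive values found in any group."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].add(row[sensitive])
    return min(len(values) for values in groups.values())


# Hypothetical data that is already 2-anonymous on (site, age)
rows = [
    {"site": "Paris", "age": "30-39", "diagnosis": "flu"},
    {"site": "Paris", "age": "30-39", "diagnosis": "flu"},
    {"site": "Paris", "age": "40-49", "diagnosis": "flu"},
    {"site": "Paris", "age": "40-49", "diagnosis": "asthma"},
]

print(l_of(rows, ["site", "age"], "diagnosis"))  # prints "1"
```

The first group contains two people, so k-anonymity with k = 2 holds, yet both share the same diagnosis. Anyone known to be in that group is exposed, which is the gap l-diversity is meant to close.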

For the parameters, see

[0:00] – Introduction

[0:30] – Tuning data anonymization parameters: k, loss, weight

[5:20] – Explaining l-diversity

[8:00] – Anonymization Report in SAP HANA cockpit

[9:00] – Using GET_ANONYMIZATION_VIEW_STATISTICS procedure

CALL GET_ANONYMIZATION_VIEW_STATISTICS
 ('get_names', NULL, 'ANON', 'EMPLOYEES_ANON');
CALL GET_ANONYMIZATION_VIEW_STATISTICS
 ('get_values', NULL, 'ANON', 'EMPLOYEES_ANON');


19. Anonymization – Differential Privacy

In the third video, we look at how we can work with differential privacy in the context of data anonymization with SAP HANA Cloud.

As documented,

Differential privacy anonymizes data by randomizing sensitive information but in a way that regardless of whether an individual record is included in the data set or not, the outcome of statistical queries remains approximately the same. Differential privacy provides formal statistical privacy guarantees.
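To get a feel for how randomized values can still give approximately the same statistics, here is a minimal Python sketch of the textbook Laplace mechanism, where the noise scale is sensitivity divided by epsilon. That SAP HANA uses exactly this mechanism internally is an assumption, and the function names `laplace_noise` and `privatize` and the sample salaries are made up for illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one Laplace(0, scale) sample as the difference of two exponential samples."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def privatize(values, epsilon: float, sensitivity: float, seed=None):
    """Add independent Laplace noise with scale sensitivity / epsilon to each value."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


# Hypothetical salaries; epsilon and sensitivity chosen only for illustration
salaries = [50_000 + 10 * i for i in range(10_000)]
noisy = privatize(salaries, epsilon=0.5, sensitivity=10_000, seed=42)

raw_avg = sum(salaries) / len(salaries)
noisy_avg = sum(noisy) / len(noisy)
print(raw_avg, noisy_avg)
```

Individual noisy values can be far from the originals, but the noise averages out over many rows, so aggregates stay close. A smaller epsilon means more noise and stronger privacy; a larger one means less noise and more accurate results.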

For the parameters, see

[0:30] – Create view using code snippet

[2:15] – Sample queries

[4:30] – Best practices from documentation

-- Create a view that applies differential privacy to the SALARY column
CREATE VIEW EMPLOYEES_ANON AS
  SELECT ID, SITE, GENDER, AGE, SALARY
  FROM EMPLOYEES
  WITH ANONYMIZATION (
    ALGORITHM 'DIFFERENTIAL_PRIVACY'
    PARAMETERS ''
    COLUMN ID PARAMETERS '{"is_sequence": true}'
    COLUMN SALARY PARAMETERS
    '{"is_sensitive": true, "epsilon": 0.5, "sensitivity": 10000}'
  );

-- Compare raw and anonymized salaries row by row
SELECT E.ID, E.SALARY, A.SALARY AS SALARY_ANON
 FROM EMPLOYEES E
 INNER JOIN EMPLOYEES_ANON A ON (A.ID=E.ID);

-- Compare the overall average salary
SELECT AVG(SALARY)
 FROM EMPLOYEES UNION
SELECT AVG(SALARY)
 FROM EMPLOYEES_ANON;

-- Compare the average salary per gender
SELECT 'RAW' AS TYPE, GENDER, AVG(SALARY) AS SALARY
 FROM EMPLOYEES
 GROUP BY GENDER UNION
SELECT 'ANON' AS TYPE, GENDER, AVG(SALARY) AS SALARY
 FROM EMPLOYEES_ANON
 GROUP BY GENDER;


Share and Connect

Enjoyed the blog? Post a comment, share on social media, and/or give a like. Thanks!

If you would like to receive updates, connect with me on

Best,

Denys van Kempen

