Getting Started with SAP HANA Cloud | Security, Data Masking, and Anonymization
With this blog series, we provide the latest information on getting started with SAP HANA Cloud on the SAP Cloud Platform.
Questions? Post as comment. Useful? Give us a like and share on social media. Thanks!
Hands-On Video Tutorials
Philip MUGGLESTONE, chief partner engineer, just updated his SAP HANA Cloud playlist on the SAP HANA Academy YouTube channel with four new videos covering two security topics: data masking and anonymization.
In this blog, you will find the videos embedded with some additional information and resources.
You can watch the four video tutorials in about half an hour. You will learn
- How to work with data masking using SQL and calculation views
- How to work with anonymization: k-anonymity, l-diversity, and differential privacy
We created the SAP HANA Cloud instance and explained how to work with HDI containers and the SAP Business Application Studio in the previous blogs, so make sure to read these first.
To bookmark the playlist on YouTube, go to the SAP HANA Academy channel > SAP HANA Cloud
About Data Anonymization
Privacy and Innovation
Data anonymization enables data analysis while protecting the privacy of individuals. This concerns data displayed by any client tool, whether through calculation views or using SQL directly.
Data anonymization was introduced with SAP HANA 2.0 SPS 03 in 2018, updated for SPS 04, and is now also available for SAP HANA Cloud.
For an introduction to the topic, read
- Going beyond masking: how to anonymize large data sets by Andrea Kristen
- Anonymize like a Rock Star! (or: What’s New on Data Anonymization this Spring in SAP HANA) by Stephan Kessler (2019)
There is also an information area with a solution brief, info sheet, and use cases.
For the documentation, go to
- Data Anonymization in SAP HANA Cloud, SAP HANA Cloud Security Guide
- Data Anonymization in SAP HANA Cloud, SAP HANA Cloud Administration Guide
Hasso Plattner Founder Award 2020
SAP News Center
Anonymization Methods
Currently, three anonymization methods are supported
- k-anonymity
- l-diversity
- Differential Privacy
All three methods are demonstrated and explained below.
Anonymization – k-anonymity
In the first video tutorial about anonymization, the concepts are briefly covered and a sample environment is created to set the stage for a view with anonymization parameter k-anonymity.
For the code snippets, go to the SAP HANA Academy repository on GitHub
[0:00] – Introduction, documentation, and code repository
[2:00] – Create sample schema with demo data
[3:40] – Create hierarchy view and function
[4:25] – Code walkthrough: create view with anonymization parameter k-anonymity
[6:35] – Refresh view
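To make the idea behind k-anonymity concrete, here is a minimal Python sketch. It is not the SAP HANA implementation and is not taken from the video. The function names and the sample rows are hypothetical. It only illustrates the property itself: every combination of quasi-identifying values must occur in at least k rows, and generalizing a value (here, an exact age into an age range) is one way to get there.

```python
from collections import Counter


def is_k_anonymous(rows, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs in at least k rows."""
    groups = Counter(tuple(row[col] for col in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())


def generalize_age(age, width=10):
    """Replace an exact age with a range such as '30-39' (one generalization step)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


# Hypothetical sample data, loosely modeled on an EMPLOYEES table.
employees = [
    {"SITE": "Berlin", "GENDER": "f", "AGE": 31},
    {"SITE": "Berlin", "GENDER": "f", "AGE": 37},
    {"SITE": "Paris", "GENDER": "m", "AGE": 42},
    {"SITE": "Paris", "GENDER": "m", "AGE": 45},
]
quasi_identifiers = ["SITE", "GENDER", "AGE"]

# Same rows, but with the exact age replaced by an age range.
generalized = [{**row, "AGE": generalize_age(row["AGE"])} for row in employees]

print(is_k_anonymous(employees, quasi_identifiers, k=2))    # False: every row is unique
print(is_k_anonymous(generalized, quasi_identifiers, k=2))  # True: two rows per group
```

In SAP HANA Cloud, the generalization steps come from the hierarchy view or function created in the tutorial. The fixed-width age bucket above is only a stand-in for that hierarchy.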
Anonymization – l-diversity
In the second video tutorial, we create a view with anonymization parameter l-diversity.
As documented,
l-diversity can be applied in addition to k-anonymity if there is a risk that too much homogeneity in a sensitive attribute’s values, in combination with other quasi-identifying attributes, might lead to loss of privacy.
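The homogeneity risk described in the documentation can be sketched in a few lines of Python. This is an illustration under stated assumptions, not SAP HANA code. The function name, the SALARY_BAND column, and the sample rows are hypothetical. A group of rows that share the same quasi-identifier values may satisfy k-anonymity and still reveal the sensitive value if all rows in the group carry the same one.

```python
from collections import defaultdict


def is_l_diverse(rows, quasi_identifiers, sensitive, min_distinct):
    """Return True if every quasi-identifier group holds at least
    `min_distinct` different values of the sensitive column."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[col] for col in quasi_identifiers)
        groups[key].add(row[sensitive])
    return all(len(values) >= min_distinct for values in groups.values())


# Hypothetical, already generalized rows: each group has two members (k = 2).
rows = [
    {"SITE": "Berlin", "AGE": "30-39", "SALARY_BAND": "high"},
    {"SITE": "Berlin", "AGE": "30-39", "SALARY_BAND": "high"},
    {"SITE": "Paris", "AGE": "40-49", "SALARY_BAND": "low"},
    {"SITE": "Paris", "AGE": "40-49", "SALARY_BAND": "medium"},
]

# The Berlin group is 2-anonymous, yet both members share the same salary band,
# so knowing someone is in that group reveals the sensitive value.
print(is_l_diverse(rows, ["SITE", "AGE"], "SALARY_BAND", min_distinct=2))  # False
```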
For the parameters, see
[0:00] – Introduction
[0:30] – Tuning data anonymization parameters: k, loss, weight.
[5:20] – Explaining l-diversity
[8:00] – Anonymization Report in SAP HANA cockpit
[9:00] – Using GET_ANONYMIZATION_VIEW_STATISTICS procedure
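-- List the names of the statistics available for the anonymized view
CALL GET_ANONYMIZATION_VIEW_STATISTICS
('get_names', NULL, 'ANON', 'EMPLOYEES_ANON');
-- Return the values of those statistics
CALL GET_ANONYMIZATION_VIEW_STATISTICS
('get_values', NULL, 'ANON', 'EMPLOYEES_ANON');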
Anonymization – Differential Privacy
In the third video, we look at how we can work with differential privacy in the context of data anonymization with SAP HANA Cloud.
As documented,
Differential privacy anonymizes data by randomizing sensitive information but in a way that regardless of whether an individual record is included in the data set or not, the outcome of statistical queries remains approximately the same. Differential privacy provides formal statistical privacy guarantees.
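The following Python sketch illustrates the principle with the Laplace mechanism, a common textbook way to achieve differential privacy. It does not claim to reproduce SAP HANA's internal implementation. The function names and the sample salaries are hypothetical, and the epsilon and sensitivity values simply mirror the view definition used in this tutorial. Noise is drawn with scale sensitivity / epsilon, so individual values become unreliable while averages over many rows stay close to the original.

```python
import random
from statistics import mean


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential samples with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def randomize(values, epsilon, sensitivity, seed=None):
    """Add Laplace noise with scale sensitivity / epsilon to every value."""
    rng = random.Random(seed)
    scale = sensitivity / epsilon
    return [value + laplace_noise(scale, rng) for value in values]


# Hypothetical salaries between 40,000 and 89,000.
salaries = [40000 + (i % 50) * 1000 for i in range(10000)]
noisy = randomize(salaries, epsilon=0.5, sensitivity=10000, seed=42)

# A single row is heavily distorted ...
print(salaries[0], round(noisy[0]))
# ... but the average over all rows stays approximately the same.
print(round(mean(salaries)), round(mean(noisy)))
```

A smaller epsilon means more noise and therefore stronger privacy, at the cost of less accurate aggregates on small groups.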
For the parameters, see
[0:30] – Create view using code snippet
[2:15] – Sample queries
[4:30] – Best practices from documentation
-- Create an anonymized view that adds noise to the sensitive SALARY column
CREATE VIEW EMPLOYEES_ANON AS
  SELECT ID, SITE, GENDER, AGE, SALARY
  FROM EMPLOYEES
  WITH ANONYMIZATION (
    ALGORITHM 'DIFFERENTIAL_PRIVACY'
    PARAMETERS ''
    COLUMN ID PARAMETERS '{"is_sequence": true}'
    COLUMN SALARY PARAMETERS
      '{"is_sensitive": true, "epsilon": 0.5, "sensitivity": 10000}'
  );

-- Compare raw and anonymized salaries row by row
SELECT E.ID, E.SALARY AS SALARY_RAW, A.SALARY AS SALARY_ANON
FROM EMPLOYEES E
INNER JOIN EMPLOYEES_ANON A ON (A.ID = E.ID);

-- Compare the overall average salary
SELECT AVG(SALARY)
FROM EMPLOYEES
UNION
SELECT AVG(SALARY)
FROM EMPLOYEES_ANON;

-- Compare the average salary per gender
SELECT 'RAW' AS TYPE, GENDER, AVG(SALARY) AS SALARY
FROM EMPLOYEES
GROUP BY GENDER
UNION
SELECT 'ANON' AS TYPE, GENDER, AVG(SALARY) AS SALARY
FROM EMPLOYEES_ANON
GROUP BY GENDER;
About Data Masking
SQL provides object-level access control. With the SELECT privilege on a table, you can view all of its data. Using views and analytic privileges, we can obtain more fine-grained access control and provide access to certain columns or certain rows only.
Data masking provides an additional layer of access control. It does not change the data as stored (e.g. through encryption) but changes its appearance: for unprivileged users the data is only partially visible or obfuscated, and no "not authorized" error message is returned.
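As a conceptual illustration only, the behavior can be sketched in Python. The function names, the masking pattern, and the privilege flag are hypothetical and are not SAP HANA APIs. The stored value never changes. The reader's privilege decides whether the original or the masked form is returned, and an unprivileged read returns a masked value instead of raising an error.

```python
def mask_value(value, visible=4, mask_char="X"):
    """Hide all but the last `visible` characters of a string.

    Values no longer than `visible` are returned unchanged in this sketch.
    """
    hidden = max(len(value) - visible, 0)
    return mask_char * hidden + value[hidden:]


def read_column(stored_value, has_unmasked_privilege):
    """Return the stored value as a given user would see it.

    `has_unmasked_privilege` stands in for a privilege check; unprivileged
    users get the masked form rather than an authorization error.
    """
    return stored_value if has_unmasked_privilege else mask_value(stored_value)


card = "1234567812345678"  # the stored value is never modified
print(read_column(card, has_unmasked_privilege=True))   # 1234567812345678
print(read_column(card, has_unmasked_privilege=False))  # XXXXXXXXXXXX5678
```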
For an introduction to the topic, read
For the documentation with code samples, go to
- Data Masking in SAP HANA Cloud, SAP HANA Cloud Security Guide
Data Masking
In this tutorial, we learn how to configure data masking.
For the code snippets, go to the SAP HANA Academy repository on GitHub
[0:00] – Introduction, documentation, and code repository
[1:55] – Copy/paste code and execute as DEVUSER
[3:10] – View result as ENDUSER
[3:30] – ALTER TABLE to add formula
[4:00] – Using functions
[5:00] – Querying EFFECTIVE_MASK_EXPRESSIONS
[6:00] – GRANT UNMASKED
[6:45] – Data masking using calculation views
[7:00] – SAP Web IDE – new project > Cloud Foundry: SAP HANA Database application
[7:45] – New database artifact using code from snippet and build
[8:00] – Open HDI container in Database Explorer and insert new rows
[8:35] – Create new calculation view and add data masking expression
[10:00] – Query data shows masked values
Share and Connect
Questions? Post as comment.
Useful? Give us a like and share on social media. Thanks!
If you would like to receive updates, connect with me on
- LinkedIn > linkedin.com/in/dvankempen
- Twitter > @dvankempen
For the author page of SAP Press, visit
For the SAP HANA Cloud e-bite, see