Technology Blogs by SAP
LauraNevin
Product and Topic Expert
You probably already knew that the anonymization of data protects the privacy of individuals while allowing applications such as analytics or machine learning to gain insights from the data.

But did you know that SAP HANA Cloud already packs compelling data anonymization features that ensure that the people described in your data sets remain anonymous?

When and why would I use this feature?


In a data-driven world, a growing share of your business data contains personal or sensitive information that can inadvertently disclose the identity of the individuals it represents. Anonymization helps preserve privacy while you continue to use that data. However, data anonymization is not simple: removing direct identifiers such as a name or a social security number is not enough to protect privacy. The combination of the remaining attributes such as age, gender, and geographic location (often called quasi-identifiers) can be unique enough to single someone out, especially when linked with other data sources. You need an approach that systematically guards against this kind of re-identification.
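
To make the re-identification risk concrete, here is a minimal Python sketch that counts how often each combination of the remaining attributes occurs once names and similar identifiers are gone. The records, column names, and the `unique_combinations` helper are made up for illustration and have nothing to do with SAP HANA Cloud internals.

```python
from collections import Counter

# Hypothetical records with direct identifiers (name, SSN) already removed.
records = [
    {"age": 34, "gender": "F", "city": "Paris"},
    {"age": 34, "gender": "F", "city": "Paris"},
    {"age": 51, "gender": "M", "city": "Munich"},
    {"age": 29, "gender": "F", "city": "Nice"},
]

QUASI_IDENTIFIERS = ("age", "gender", "city")


def unique_combinations(rows, columns):
    """Return the attribute combinations that occur exactly once."""
    counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return [combo for combo, n in counts.items() if n == 1]


# Every combination listed here points at exactly one person in the data set.
at_risk = unique_combinations(records, QUASI_IDENTIFIERS)
print(at_risk)  # [(51, 'M', 'Munich'), (29, 'F', 'Nice')]
```

Anyone who knows that a 51-year-old man from Munich is in the data set can pick out his row, even though no name is stored.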

Use the SAP HANA Cloud data anonymization features when you must protect the anonymity of the individuals represented by your data while still allowing you to design innovative applications to analyze it.

Data anonymization feature overview


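SAP HANA Cloud supports three anonymization methods: k-Anonymity, l-Diversity (that's an L, by the way), and Differential Privacy. k-Anonymity makes each record indistinguishable from at least k - 1 others with respect to its quasi-identifiers, l-Diversity additionally requires at least l distinct sensitive values within each such group, and Differential Privacy adds random noise to sensitive values. Which method you choose depends on your data and the potential attack scenarios you face.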
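
The following Python sketch illustrates the ideas behind l-Diversity and Differential Privacy in their textbook form. It is not a description of how SAP HANA Cloud implements them: the `illness` column, the sample rows, and the helpers `is_l_diverse`, `laplace_noise`, and `add_noise` are all hypothetical, and the Laplace mechanism shown is simply one standard way to achieve differential privacy.

```python
import random
from collections import defaultdict


def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every quasi-identifier group holds at least l distinct sensitive values."""
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in quasi_identifiers)
        groups[key].add(row[sensitive])
    return all(len(values) >= l for values in groups.values())


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def add_noise(value, sensitivity, epsilon, rng):
    """Laplace mechanism: a smaller epsilon means more noise and stronger privacy."""
    return value + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical, already generalized rows.
rows = [
    {"gender": "F", "country": "France", "illness": "Flu"},
    {"gender": "F", "country": "France", "illness": "Asthma"},
    {"gender": "M", "country": "Germany", "illness": "Flu"},
    {"gender": "M", "country": "Germany", "illness": "Flu"},
]

# False: everyone in the M/Germany group has the same illness, so membership
# in that group alone reveals the sensitive value.
print(is_l_diverse(rows, ("gender", "country"), "illness", 2))

rng = random.Random(42)  # seeded only so the sketch is repeatable
print(add_noise(50000.0, sensitivity=1000.0, epsilon=0.5, rng=rng))
```

The l-Diversity check shows why k-Anonymity alone can fall short: a group can be large enough and still leak the sensitive value if all of its members share it.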

The following diagram shows the basic workflow involved in protecting data privacy:



What kind of DDL and DML support can I use?


You define anonymization views using the WITH ANONYMIZATION clause of the CREATE VIEW statement. The clause specifies the anonymization method (ALGORITHM), the method-level settings (PARAMETERS), and the per-column settings (COLUMN ... PARAMETERS).

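Here's an example of a WITH ANONYMIZATION clause that applies k-Anonymity with k = 5 to three columns of the view: ID is marked as a sequence column, while GENDER and LOCATION are quasi-identifiers, each with an embedded generalization hierarchy: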
WITH ANONYMIZATION (ALGORITHM 'K-ANONYMITY'
    PARAMETERS '{"k": 5}'
    COLUMN ID PARAMETERS '{"is_sequence": true}'
    COLUMN GENDER PARAMETERS '{"is_quasi_identifier": true, "hierarchy": {"embedded": [["F"], ["M"]]}}'
    COLUMN LOCATION PARAMETERS '{"is_quasi_identifier": true, "hierarchy": {"embedded": [["Paris", "France"], ["Munich", "Germany"], ["Nice", "France"]]}}');
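
To show what a generalization hierarchy like the one above is for, here is a small Python sketch of k-Anonymity by generalization. It reuses the shape of the embedded hierarchies from the clause, but the search strategy, the use of "*" as the fully suppressed value, the sample rows, and the function names are assumptions made for this illustration and not SAP HANA Cloud's actual algorithm.

```python
from collections import Counter
from itertools import product

# Each entry lists a value followed by its increasingly general replacements,
# mirroring the shape of the "embedded" hierarchies in the clause above.
GENDER_HIERARCHY = [["F"], ["M"]]
LOCATION_HIERARCHY = [["Paris", "France"], ["Munich", "Germany"], ["Nice", "France"]]


def generalize(value, hierarchy, level):
    """Generalize value by `level` steps; '*' once the hierarchy is exhausted."""
    if level == 0:
        return value
    for path in hierarchy:
        if path[0] == value:
            return path[level] if level < len(path) else "*"
    return "*"


def is_k_anonymous(rows, k):
    """True if every distinct row occurs at least k times."""
    return all(n >= k for n in Counter(rows).values())


def anonymize(rows, hierarchies, k):
    """Try per-column generalization levels, least total generalization first."""
    depth = [max(len(path) for path in h) for h in hierarchies]
    choices = sorted(product(*(range(d + 1) for d in depth)), key=sum)
    for levels in choices:
        candidate = [
            tuple(generalize(v, h, lv) for v, h, lv in zip(row, hierarchies, levels))
            for row in rows
        ]
        if is_k_anonymous(candidate, k):
            return levels, candidate
    raise ValueError("k is larger than the number of rows")


# Hypothetical (GENDER, LOCATION) rows.
rows = [("F", "Paris"), ("F", "Nice"), ("M", "Munich"), ("M", "Munich")]
levels, anonymized = anonymize(rows, [GENDER_HIERARCHY, LOCATION_HIERARCHY], k=2)
print(levels)      # (0, 1): keep GENDER as is, lift LOCATION from city to country
print(anonymized)  # [('F', 'France'), ('F', 'France'), ('M', 'Germany'), ('M', 'Germany')]
```

With k = 2, lifting the cities to their countries is enough; a larger k forces coarser values, which is the trade-off between privacy and data quality that the k parameter controls.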

How do I access it?


There is no special access required for this feature. You configure data anonymization using the WITH ANONYMIZATION clause when you create a (SQL) view (CREATE VIEW statement). You then GRANT access to the anonymized view using standard SAP HANA authorization mechanisms.

Once you are using anonymization views, you can create KPIs and then retrieve the generated KPI data using the GET_ANONYMIZATION_VIEW_STATISTICS procedure. The statistics generated by the KPIs help you gain insight into the impacts of the anonymization method and settings you chose.

Of course, SAP HANA Cloud also provides a few catalog views that you can query to get details about the anonymization views in your system:

Where can I learn more?