Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
cancel
Showing results for 
Search instead for 
Did you mean: 
With SAP HANA 2.0 SPS 03 we have extended our native dynamic data masking capabilities beyond the view level, so you can now mask data at the table level in SAP HANA.

A quick refresher: Dynamic data masking was shipped for the first time with SAP HANA 2.0 SPS 01. Initially, this feature provided an additional layer of access control at the view level, you can read more about it here. What’s new with the SAP HANA 2.0 SPS 03 release is that we have extended this functionality, so you can dynamically mask data at the table level as well.

There are many different use cases for dynamic data masking in SAP HANA. One is that it can provides additional protection for tables or views to protect sensitive or confidential data from power users with broad access. Another use case is that you can leverage dynamic data masking to selectively mask sensitive information in columns such as social security numbers or credit card information from users who should not have access to the full information.



Before we get into the details of how data masking works in SAP HANA, I would like to highlight how dynamic data masking fits into the broader SAP HANA security picture. SAP HANA has a holistic framework for secure information access and masking is just one piece of the puzzle. In the graphic above you can see that we provide support for authentication and user management, authorization management, encryption, auditing, and data anonymization. Which tool you use when depends on which question you are asking and what threats you would like to protect your data against.

Let’s take two features which are often confused, data masking and data anonymization as an example. If you look at data anonymization, the goal is to allow you to enable analytics on whole data sets without exposing individual secrets. Data masking on the other hand would not be sufficient in such a situation as you would still be able to potentially trace back data points to specific individuals. The goal of data masking instead is to selectively or completely hide sensitive information in tables and views to reduce the information exposed from sensitive data. Using data anonymization in such scenarios would not be appropriate as individual row level data is rendered meaningless and the data only has value in aggregate. As you can see, very different use cases for different situations.

Fundamentally, the dynamic table masking feature works in the same way as the dynamic view masking feature. The dynamic data masking functionality is fully integrated into the authorization framework in SAP HANA. Whether or not someone will see masked data is dependent on whether they have the UNMASKED privilege in SAP HANA and permission to do a SELECT on that view or table. Dynamic data masking at the table level does not affect the data that is stored, but rather what is displayed when a user sends a request to SAP HANA. This means that any operations performed by users with sufficient privileges work just as before, whereas users with insufficient privilege are not able to perform calculations with the masked data.

The SAP HANA security guide has extensive documentation on how to configure dynamic data masking at both the table and view level. In the next sections, I will walk you through a few examples of how to add a mask definition to an existing table, generating a masking definition at the time of table creation, and explain how you can define a mask expression as a separate UDF so it can be reused and capture more complex logic. I’ll also share where you can get an overview of the data flow and potential manipulation enforced by a mask definition.

Adding a mask definition to an existing table

Let’s walk through the process of adding a mask definition to a table. For example, let’s say we have an existing table in SAP HANA named DATA_OWNER.BILLING which has credit card information. We may want a support user to only be able to see the last four digits of the credit card number. In order to achieve this, we would need to execute the following command to add a mask definition on the table:

 

ALTER TABLE DATA_OWNER.BILLING ADD MASK (card_number USING '****-****-****-' || RIGHT(card_number, 4));

 

Now when you try to do a select on the table, users with the UNMASKED privilege (and of course select privileges on the table) will be able to see the unmasked data, whereas users who do not have this privilege will not be able to see the full credit card number, only the last four digits as we defined on our table.

Adding a mask definition at the time of table creation

In addition to adding a mask definition to an existing table, you can also define a mask definition at the time of table creation. The process is quite similar. Let’s take the table from our previous example, DATA_OWNER.BILLING. We can use the below SQL statements to create the new table and add the mask definition in one swoop:

 

CREATE TABLE DATA_OWNER.BILLING (name nvarchar(256), card_number varchar(19))

WITH MASK (card_number USING '****-****-****-' || RIGHT(card_number, 4));

 

Similarly, to above, who will now see masked data on the table will be dependent on whether or not they have the UNMASKED privilege and SELECT privileges on that table.

Defining a mask expression as a separate user defined function (UDF)

In certain situations, you may want to use a single mask expression for multiple tables in your database. Perhaps you have a few different tables where credit card data is stored, and you would like that data masked in the same way across the board. This is the perfect situation where you could employ a mask definition defined in a separate UDF. For example:

CREATE FUNCTION DATA_OWNER.MASK_CARD_NUMBER(card_number varchar(19))

RETURNS output VARCHAR(19) LANGUAGE SQLSCRIPT AS

BEGIN

    output := '****-****-****-' || RIGHT(card_number, 4);

END;

Once this UDF is defined you can then create multiple tables which invoke it:

 

CREATE TABLE DATA_OWNER.BILLING (name nvarchar(256), card_number varchar(19))

 WITH MASK (card_number USING DATA_OWNER.MASK_CARD_NUMBER(card_number));

CREATE TABLE DATA_OWNER.CREDIT_CARD (name nvarchar(256), card_number varchar(19))

 WITH MASK (card_number USING DATA_OWNER.MASK_CARD_NUMBER(card_number));

 

Using a UDF to define masking logic also allows you to encapsulate more complex logic into the function. For example, you could create a random alphanumeric prefix instead of just the ‘****’ to mask the credit card number.

 

CREATE FUNCTION DATA_OWNER.MASK_CARD_NUMBER(card_number varchar(19))

RETURNS output VARCHAR(19) LANGUAGE SQLSCRIPT AS

    random_string VARCHAR(12);

    random_prefix VARCHAR(15);

BEGIN

    random_string := LEFT(TO_VARCHAR(HASH_MD5(NEWUID())), 12);

    random_prefix := REPLACE_REGEXPR('([[:alnum:]]{4})' IN :random_string WITH '\1-');

    output := :random_prefix || RIGHT(card_number, 4);

END;

 

CREATE TABLE DATA_OWNER.BILLING (name nvarchar(256), card_number varchar(19))

 WITH MASK (card_number USING DATA_OWNER.MASK_CARD_NUMBER(card_number));

 

This is just one example of using more complex logic to mask a table using a UDF. As you can see, by using UDFs you can have a lot more flexibility for how to configure masked expressions. Please keep in mind that although UDFs provide a high degree of flexibility regarding mask expressions and the logic of when masking should be applied, you should try to avoid excessive complexity in your UDF for performance reasons.

Following data flow and potential manipulation enforced by a mask definition

In situations where you have many objects in your dependency tree and are seeing masked data, it can get complicated to see where the masked data is coming from and how it is getting exposed to certain users. For this reason, I wanted to also highlight that you can use the system view EFFECTIVE_MASK_EXPRESSIONS to quickly find this information. It enables you to follow data flow and potential manipulation enforced by a mask definition. You can find more information about this view in the SAP HANA SQL and System Views Reference guide.

Hope you found the blog helpful, and thanks for reading.
3 Comments