Technical Articles
Data Masking – a walk in the park with SAP Data Warehouse Cloud
The masking or pseudonymization of data offers an additional level of access control that can be applied to tables and views. Masking a column protects sensitive or confidential data in a particular column of a table or view by transforming the data in such a way that it is only partially visible or completely meaningless to a non-privileged user and yet appears real and consistent.
SAP HANA Cloud offers such a masking function, but only a user who also has the UNMASKED
object privilege for a table or view can display the original data.
Let me show you how you can use Data Masking in the Data Warehouse Cloud with the following two options.
Prerequisites
You need a working Data Warehouse Cloud tenant and sufficient roles/privileges to create (or use) a database user within a space. Additional tools or other requirements are not needed for this.
Option 1 – Data stored in an open SQL schema
Create a database user and open SQL schema
Create a database user and therefore a Open SQL schema with rights for Data Ingestion and Data Consumption in the space. Alternatively, you can use an existing one you might already have available.
Please note that exactly this database user has the UNMASKED
object privilege for all tables and views that you create in this schema using a MASK
statement.
Create a table with a MASK statement
For this example, we are using a simple table with four columns and the following CREATE TABLE
statement. Just replace <YOUR_SCHEMA_NAME> with your created Database User Name.
CREATE COLUMN TABLE "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" (
id INTEGER,
customer VARCHAR(20),
cc VARCHAR(20),
ssn VARCHAR(20),
PRIMARY KEY (id)
)
WITH MASK (cc USING '####-####-####-####') ;
In the example this is a customer table, where the third column represents a credit card number (cc) and the fourth column the social security number (ssn). Copy the SQL statement to your SQL console in DB Explorer and run it.
Use some test data by executing the following INSERT
statements and display the data as shown in the screenshot.
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (1, 'Jamie', '1234-5678-9012-3456', '123-00-4567');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (2, 'Julie', '2222-2222-2222-2222', '222-00-2222');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (3, 'Bob', '3333-3333-3333-3333', '333-00-3333');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (4, 'Denys', '4444-4444-4444-4444', '444-00-4444');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (5, 'Philip', '5555-5555-5555-5555', '555-00-5555');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (6, 'Joe', '6666-6666-6666-6666', '666-00-6666');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (7, 'Juan', '7777-7777-7777-7777', '777-00-7777');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (8, 'Robert', '8888-8888-8888-8888', '888-00-8888');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (9, 'Donald', '9999-9999-9999-9999', '999-00-9999');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (10,'Pam', '1010-1010-1010-1010', '101-10-1010');
INSERT INTO "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema" VALUES (11,'Linda', '1111-1111-1111-1111', '111-00-1111');
SELECT * FROM "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema";
The cc field is provided with a MASK
directly in the CREATE TABLE
statement. The data preview shows the values unmasked because this user used has the UNMASKED
object privilege.
You can also apply a masking function subsequently to already existing tables. For example, if you created the table using an external ETL tool in the open SQL schema. To demonstrate this, we use an ALTER TABLE
statement to add the masking also to the social security number (ssn) in our example. To show you also another option, we keep the last two digits from the right remain visible.
The following ALTER TABLE
statement does exactly that:
ALTER TABLE "<YOUR_SCHEMA_NAME>"."Customer_OpenSQLSchema"
ADD MASK (ssn USING '###-##-##' || RIGHT(ssn,2)) ;
Use the masked table in Data Warehouse Cloud
If you now enable the data preview on the source table, you can now see the result of the masking for both columns. The data masking is preserved when you use the table in other views and is applied equally to stories in SAP Analytics Cloud.
Option 2 – Data stored in the space schema of Data Warehouse Cloud
Create a table in Data Warehouse Cloud
In the second example, the table with data is located within the space schema of the Data Warehouse Cloud and is exposed to the open SQL schema via a graphical view.
First, you create the table “Customer_SpaceSchema” in the table editor and deploy it:
Upload the same sample data set via a CSV file. For your convenience you can get that sample data file here: https://github.com/kp-sauer/Data-Masking
In the graphical view editor you can now create a new view named “Customer_SpaceSchema_View” based on the table you just created.
Make sure you enable the “Expose for Consumption” setting with this view and deploy it. If you use the data preview you see the unmasked data.
Create a view with a MASK statement
In this case you also use the database user in the open SQL schema to create the masking via the Database Explorer – this time via a database view.
You can simply do that using a CREATE VIEW
statement similar to the CREATE TABLE
statement we have used before, but you can also use a database function just to show you another variant. The advantage is that it can be reused in different use cases using the same logic.
For example, define a function like this:
CREATE FUNCTION "<YOUR_SCHEMA_NAME>"."MASK_CC"
(input VARCHAR(20)) RETURNS output VARCHAR(20)
LANGUAGE SQLSCRIPT AS
BEGIN
output = LEFT(:input,4) || '-####-####-##' || RIGHT(:input,2);
END;
The credit card number is masked, where the first four and last two digits are shown unmasked.
Now the view named “Customer_SpaceSchema_Masked” can be created using the database function and based on the space schema view.
CREATE VIEW "<YOUR_SCHEMA_NAME>"."Customer_SpaceSchema_Masked"
AS SELECT * FROM "BOOKSPACE"."Customer_SpaceSchema_View"
WITH MASK ("cc" USING "<YOUR_SCHEMA_NAME>"."MASK_CC"("cc"));
Use the masked view in Data Warehouse Cloud
Back in the view builder in Data Warehouse Cloud you can use the just created view from the open SQL schema to display the masking result similar to option 1 before.
In the view editor you can now carry out further modeling based on this new view and you will receive the credit card number masked.
More details on the data masking feature of SAP HANA Cloud can be obtained from the SAP help pages: https://help.sap.com/viewer/c82f8d6a84c147f8b78bf6416dae7290/2020_03_QRC/en-US/aaa8d28740ea4cfd907d5a70017b1633.html
Thank you Klaus-Peter. Quite helpful.