Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
markmumy
Advisor

With the 2022 Q1 release of SAP HANA Cloud, the SAP HANA database (HANA DB) gained the ability to read from and write to SAP HANA Cloud, data lake files (HDLFS).

 

The manuals contain a lot of valuable information, but it is spread across both product sets.  Use this manual as a primer: https://help.sap.com/viewer/f9c5015e72e04fffa14d7d4f7267d897/2022_1_QRC/en-US/462c861413b043bd93b9e8...

The purpose of this blog is to bring that information together in a single location, with a step-by-step process for working with HANA DB, HDLFS, and HDLRE.

Just to level set on terminology:

    • HANA DB -- SAP HANA Cloud, SAP HANA database
    • HDLRE -- SAP HANA Cloud, data lake relational engine
    • HDLFS -- SAP HANA Cloud, data lake files



 

Quite a lot of related topics are covered in other areas of the product documentation.

Prerequisites


Before beginning the steps below, make sure to Set Up Initial Access to HDLFS.  To set up the HANA security for HDLFS, HDLFS must be configured for trusted, external access.  To complete some of the later sections that use the HDLFS command line interface, you must first install the SAP HANA Cloud, data lake client on your platform.

 

Once that is done, the HANA DB setup process will include:




    • Create a PSE (SAP HANA Personal Security Environment)
    • Add the DigiCert root certificate to the PSE
    • Add the local credentials to the PSE
    • Add the HDLFS certificate to the PSE
    • Create a HANA import/export credential that maps to the PSE (and by extension the certificates)
    • Test things out!



 

Create PSE (SAP HANA Personal Security Environment)

 

-- drop any existing PSE of the same name (this errors if MYPSE does not exist and can then be skipped)
DROP PSE MYPSE CASCADE;

CREATE PSE MYPSE;
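
If you want to confirm that the PSE exists, you can query the PSES system view (assuming your user is allowed to read it):

SELECT * FROM PSES;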


 

Add Root DigiCert Certificate to PSE


Once you have created a HANA DB PSE, you must then upload the root DigiCert certificate.

    • You must use the DigiCert root certificate, provided as a PEM file.
    • The CREATE CERTIFICATE statement below should work for all HANA Cloud instances, since they all use the same DigiCert root PEM file.



CREATE CERTIFICATE FROM '<Digicert PEM Certificate>' COMMENT 'HDLFS';
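
To make the placeholder concrete, the statement with the PEM contents pasted inline looks roughly like this (the base64 body is elided here; paste the actual DigiCert root PEM between the BEGIN/END lines):

CREATE CERTIFICATE FROM
'-----BEGIN CERTIFICATE-----
<base64 contents of the DigiCert root certificate>
-----END CERTIFICATE-----'
COMMENT 'HDLFS';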


 

Once created, you need to get the certificate ID as that’s used to reference it:



SELECT CERTIFICATE_ID FROM CERTIFICATES WHERE COMMENT = 'HDLFS';


 

Add the Root Certificate to the PSE

 

ALTER PSE MYPSE ADD CERTIFICATE <certID_from_above>;
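
To double-check that the certificate is now attached to the PSE, you can query the PSE_CERTIFICATES system view (again, assuming your user can read it):

SELECT * FROM PSE_CERTIFICATES;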


 

Add the HDLFS Certificates to the PSE

 

For the purposes of this blog, we will use a self-signed set of certificates generated using the OpenSSL utilities available on most platforms.  Alternatively, these files can be given to you by a security administrator.


 

Follow this process to generate the files on any host with OpenSSL installed.  One thing to note in these commands is the "-days" parameter, which sets the lifetime of the certificates.  If you set it too low, you will have to regenerate the certificates frequently.  If you set it too high, it may violate your corporate standards.  For illustration and for use in demo systems, I have chosen 9999 days as the lifetime of these keys.



openssl genrsa -out ca.key 2048



openssl req -x509 -new -key ca.key -days 9999 -out ca.crt



openssl req -new -nodes -newkey rsa:2048 -out client.csr -keyout client.key



openssl x509 -days 9999 -req -in client.csr -CA ca.crt -CAkey ca.key -CAcreateserial -out client.crt



openssl verify -CAfile ca.crt client.crt



openssl x509 -in client.crt -nameopt RFC2253 -subject -noout
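
Since the "-days" parameter controls the certificate lifetime, it can also be useful to check the expiration date of what you just generated, for example:

openssl x509 -enddate -noout -in client.crt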


 

Once generated, you need to add these certificates to your HANA PSE.  There are three files, and thus three sections that must be included between the single quotes below:



ALTER PSE MYPSE SET OWN CERTIFICATE '<files here>';


 

For <files here>, use the full contents of these files (including the BEGIN/END marker lines) that were generated with OpenSSL:




    • client.key
    • client.crt
    • ca.crt
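
For illustration, the full statement looks roughly like the sketch below, with the three files pasted in one after another between the single quotes (the base64 bodies are elided, and the exact BEGIN/END header of the key file can vary with your OpenSSL version):

ALTER PSE MYPSE SET OWN CERTIFICATE
'-----BEGIN PRIVATE KEY-----
<contents of client.key>
-----END PRIVATE KEY-----
-----BEGIN CERTIFICATE-----
<contents of client.crt>
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
<contents of ca.crt>
-----END CERTIFICATE-----';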



 

Create a HANA Import/Export Credential that maps to the PSE (and by extension the certificates)

 

CREATE CREDENTIAL FOR USER DBADMIN COMPONENT 'SAPHANAIMPORTEXPORT' PURPOSE 'myCredential' TYPE 'X509' PSE 'MYPSE';
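
As a quick check that the credential was created, you can query the CREDENTIALS system view (assuming your user can read it):

SELECT * FROM CREDENTIALS WHERE PURPOSE = 'myCredential';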

 

 

 

Export Test

 

create table EXPORT_TEST as ( select * from dummy );



-- add more data, as little or as much as you wish

insert into EXPORT_TEST select * from EXPORT_TEST;

insert into EXPORT_TEST select * from EXPORT_TEST;
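
-- optional: note the row count now so it can be compared after the import later
select count(*) from EXPORT_TEST;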



export EXPORT_TEST as parquet

into 'hdlfs://<HDLFS endpoint>/hdb_export_test'

WITH CREDENTIAL 'myCredential'

;




 

See Exported Files in HDLFS Using hdlfscli

 

Using the SAP HANA Cloud, data lake files command line interface (hdlfscli), which is part of the SAP HANA Cloud, data lake client, you can see the exported files:



hdlfscli -k -cert client.crt -key client.key -cacert ca.crt -s hdlfs://<HDLFS endpoint> lsr /hdb_export_test

 

Outputs this:



FILE      mark mark           93 666  index/DBADMIN/EX/EXPORT_TEST/create.sql

FILE      mark mark          200 666  index/DBADMIN/EX/EXPORT_TEST/data.parquet

FILE      mark mark          605 666  index/DBADMIN/EX/EXPORT_TEST/export_system.info

FILE      mark mark         2176 666  index/DBADMIN/EX/EXPORT_TEST/table.xml


 

HANA DB Import Test


You can either import just the data, as tested here, or have the IMPORT statement create the object and load the data (not shown here).  For the data import, simply point the IMPORT statement at the parquet file generated during the export.

truncate table EXPORT_TEST;



import from parquet file 'hdlfs://<HDLFS endpoint>/hdb_export_test/index/DBADMIN/EX/EXPORT_TEST/data.parquet'

into EXPORT_TEST

WITH CREDENTIAL 'myCredential';
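
-- optional sanity check: this count should match the row count noted before the export
select count(*) from EXPORT_TEST;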



drop table EXPORT_TEST;


 

SAP HANA Cloud, data lake relational engine (HDLRE) Import Test


Alternatively, we can also load this same parquet data into the SAP HANA Cloud, data lake relational engine.  With HDLRE, you can only load files from the directly attached HDLFS container.  The key to the HANA DB statements above is that the HDLFS endpoint used to export from HANA must be the same endpoint that was created with the HDLRE engine and instance.

create table EXPORT_TEST ( COL1 varchar(10) );



load table EXPORT_TEST ( COL1 )

using file 'hdlfs:///hdb_export_test/index/DBADMIN/EX/EXPORT_TEST/data.parquet'

format parquet

quotes off escapes off

;



select * from EXPORT_TEST;



drop table if exists EXPORT_TEST;


 

Misc Notes


This blog does not cover the full use of HDLFS with HANA DB.  In fact, a few areas have not been covered here, but are worth mentioning.

First, SAP Data Warehouse Cloud (DWC) can leverage the SAP HANA Cloud, data lake files repository as a feed to integrate and load data into DWC.  We often don't think of data coming from files or from an object store, but this is certainly a pattern that can be explored.

Second, SAP Data Intelligence has adapters to both consume (read) and produce (write) files in HDLFS.  It requires certificates and keys to access HDLFS; fortunately, most of that work has already been completed in this blog.  It also requires a PFX/P12 version of the keys generated with OpenSSL.  To generate the necessary file, you can run this command:

openssl pkcs12 -export -inkey client.key -in client.crt -out keystore.p12


The resulting file, keystore.p12, is used within Data Intelligence as the client keystore for the connection to HDLFS.  One other important note: when running the OpenSSL command, it will ask whether you want to password protect the keystore.p12 file.  SAP Data Intelligence requires a password on this file, so please make sure to password protect it.
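
If you want to verify the keystore and its password before handing the file to SAP Data Intelligence, OpenSSL can print a summary of the PKCS#12 file (it will prompt for the import password):

openssl pkcs12 -info -in keystore.p12 -noout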