Technical Articles
Connect Google BigQuery to SAP HANA Cloud as a Remote Source
In this blog, we will walk through the steps to connect Google BigQuery to SAP HANA Cloud as a remote source.
Before that in case you need to know what is SAP HANA Cloud, get access to free trial and try out some tutorials you can refer to the following blogs by Denys van Kempen
https://blogs.sap.com/2020/03/28/getting-started-with-sap-hana-cloud/
https://blogs.sap.com/2020/03/29/getting-started-with-sap-hana-cloud-part-ii/
SAP HANA Cloud follows Multi Cloud Strategy and as part of that strategy, one of the features is to be able to connect data sources from hypervisors. SAP HANA Cloud provides this ability via Smart Data Access and Smart Data Integration.
To connect to Google BigQuery, we will use the bigqueryrest SDA adapter.
You can also refer to the complete list of remote sources supported by SAP HANA Cloud in the following note https://launchpad.support.sap.com/#/notes/2600176
Pre-requisites:
SAP HANA Cloud:
- SAP HANA Cloud Trial instance is created and running
- User on the SAP HANA Cloud with following authorizations:
- System Privileges: CREATE REMOTE SOURCE, CERTIFICATE ADMIN, TRUST ADMIN
- Object Privileges: EXECUTE on Procedures SYSTEM.SET_PSE_PURPOSE and SYSTEM.UNSET_PSE_PURPOSE
You can use the below SQL code to create an user with the required authorizations.
CREATE USER RSADMIN PASSWORD <enter_password_here> NO FORCE_FIRST_PASSWORD_CHANGE SET USERGROUP DEFAULT;
GRANT CREATE REMOTE SOURCE TO RSADMIN;
GRANT CERTIFICATE ADMIN TO RSADMIN;
GRANT TRUST ADMIN TO RSADMIN;
GRANT EXECUTE ON SYSTEM.SET_PSE_PURPOSE TO RSADMIN;
GRANT EXECUTE ON SYSTEM.UNSET_PSE_PURPOSE TO RSADMIN;
GCP:
- GCP Project
- Service Account in GCP with roles BigQuery Data Viewer and BigQuery User
- Key is generated for the Service Account
- BigQuery DataSet and BigQuery Table
1. Launch SAP HANA Database Explorer from the SAP HANA Cloud Instance
Launch SAP HANA Database Explorer
2. SAP HANA Database Explorer is launched with the user DBADMIN. We can continue with the next steps with the same user which has all the needed authorizations. Or if you have created the user RSADMIN as described in the pre-requisites we can add database with RSDAMIN as the user by using the option Add Database with Different User which appears when we right click on existing database connection.
Add Database with Different User
3. Right-click on Remote Sources under Catalog from the selected database with user and select Add Remote Source to create a remote source which connects to Google BigQuery.
Add Remote Source
4. In the Add Remote Source screen,
- provide Source Name, for e.g., RS_BIGQUERY
- select Adapter Name as BIGQUERY (REST API)
- select Credentials Mode as Technical User
- provide Account Name, this should be the service account created in GCP which ends with iam.gserviceaccount.com
- provide Private Key, this key is available in the key file downloaded as JSON from GCP for the service account. (see point below)
Open the JSON file and locate the field “private_key”. The complete value within double quotes starting with —–BEGIN PRIVATE and ending with END PRIVATE KEY—–\n shoud be used as value for Private Key.
JSON Key File for Service User
Add Remote Source Screen
and click on Create. Now we can see that a remote source is created.
Remote Source Created
5. Now click on the Remote Source name RS_BIGQUERY to test it. We would notice that this remote source is not fetching any metadata from Google BigQuery. This is because REST API-based adapters communicate with the endpoint through an HTTP client. To ensure secure communication, the adapters require client certificates from the endpoint services.
Remote Source cannot fetch Metadata
6. To download the certificate, open the link https://www.googleapis.com/oauth2/v1/certs in your browser for example google chrome. Click on the small lock icon left to the url and then select certificate.
Open URL in Chrome
7. In the certificate screen, go to tab Details and then select Copy to File, this will launch Certificate Export Wizard.
Certificate Download
8. In the Certificate Export Wizard, click Next and then select the download format as Base-64 encoded X.509 (.CER), then click Next and select the path where you want to download the certificate and provide a name for the certificate and then click Finish. Now your certificate is downloaded.
Certificate Format
9. In this step, we will upload this certificate to SAP HANA Cloud and create a collection for this certificate and apply the purpose. Launch SAP HANA Cockpit from the SAP HANA Instance.
10. In SAP HANA Cockpit, select the display option Security and User Management and then under this display select the option Certificate Store.
Certificate Store
11. In the Certificate Store, select Import and browse the certificate which you downloaded previously and then click OK.
Import Certificate
12. Once the certificate is imported, we need to create a certificate collection. Click on Go to Certificate Collections.
Certificate Collections
13. In the Certificate Collections, click on Add Collection and provide a name for Certificate Collection and click OK.
Add Collection
Name for Certificate Collection
13. Now we need to add the certificate which we imported before to this certificate collection by using the option Add Certificate then select our certificate and click OK.
Add Certificate to Certificate Collection
Select Certificate
14. Finally we add the purpose for the certificate collection by using the option Edit Purpose. Select Purpose as Remote Source and select the Remote Source RS_BIGQUERY.
Edit Purpose
Select Remote Source
15. We go back to the SAP HANA Database Explorer, select Remote sources and click on Remote source RS_BIGQUERY. Now we can see that the Database and Schema is populated.
- Database refers to GCP Project Name
- Schema refers to GCP BigQuery Dataset Name
Select your Database and Schema and click on Search and now we can see the BigQuery tables from the relevant selection.
Remote Source displays Metadata
16. To create a virtual table, select the check box for the table and click on Create Virtual Object. In the pop up, select the schema where you want to create the virtual table and provide a name for virtual table and click Create.
Create Virtual Table
Provide details for Virtual Table
17. Locate the virtual table in the schema you selected and right click on it and select Open Data to view the data from the Google BigQuery Table.
Select data from Virtual Table
Data displayed from Google BigQuery Table
18. Google BigQuery REST adapter supports both virtual table and linked database. Since it supports linked database we can query data from Google BigQuery without creating a virtual table. We can use the following query to query the data without virtual table.
Linked Database
Conclusion: We are able to connect to Google BigQuery from SAP HANA Cloud and query the data without physically loading the data. Also an important point to note is there could be some data types which are not supported which might cause issues to retrieve the data.
References:
https://help.sap.com/viewer/f9c5015e72e04fffa14d7d4f7267d897/cloud/en-US/ea45d2993c1f45b09e221a2a9c497676.html?q=bigquery
It is a Very Good Article.
thank you.
Thanks for your information.
It's a very useful article. We are on SAP Hana Cloud (CF) version 2.00.052. (SP5), but there is not "bigqueryrest" adapter, i am a system user owner of the tenant. How can be enabled this adapter?
The adapter is enabled by default for SAP HANA Cloud. However, there is no SPS 05 release of SAP HANA Cloud.
Could it be you are using the SAP HANA service or a public cloud hosted version of the platform edition?
SAP HANA Cloud version is 4.00;
Good one, many thanks.
Question, is it possible to connect SAP Hana Cloud to a Google Public dataset, like bigquery-public-data? I do have access to those sets from BigQuery console or even using Python, bu they are not display in the Remote Source displays Metadata
Neelesh Kumar Jain: Very nice blog, thank you very much. Would you know, if its possible to use a proxy for the bigquery connection?
IMPORTANT:
If you do not CAREFULLY follow the Steps 6-8 to get the correct SSL Certificate you will run into this issue:
Error occurred while loading remote source information:
SQL Error: Unable to connect remote source: Authentication failure: SSL certifcate validation failed: SSL error {536872221}: Unknown error, General error: 0x2000051d | SAPCRYPTOLIB | SSL_connect
SSL API error
Failed to verify peer certificate. Peer not trusted
Subject:CN=upload.video.google.com
Issuer: CN=GTS CA 1C3, O=Google Trust Services LLC, C=US
...
Subject: CN=GTS CA 1C3, O=Google Trust Services LLC, C=US
Issuer: CN=GTS Root R1, O=Google Trust Services LLC, C=US
HI Neelesh Kumar Jain Very nice blog, thank you very much.
Would you know, if its possible to do the inverse ?
Not read data from Google BigQuery by Virtual Table but read table of Sap hana Cloud in Google BigQuery.
Thanks