Skip to Content
Author's profile photo Robert Guglietti

HANA SDA – Google BigQuery

SAP HANA continues to build data bridges, the latest bridge in the the SDA family is Google BigQuery.  SAP HANA can now combine data from Google BigQuery, enabling data federation and/or data ingestion into the HANA platform.

 

 

The process to enable integration with Google BigQuery is simple.   During this blog I’ll walk through the process to build this new data bridge.

1 – Ensure your HANA platform is HANA 2.0 SPS03+

2 – Download the driver from google.  You can use the following link

3 – If not already done you need to download, install and configure the UnixODBC.  You can use the following link

4 – Configure a few driver settings as detailed here.  example of .odbc.ini file after configuration.

5 – Create a new remote source.  This can be done several ways.  The following SQL script will create a remote source to my GoogleBQ DSN that was defined in the .odbc.ini file.

CREATE REMOTE SOURCE BigQ ADAPTER “odbc”

    CONFIGURATION FILE ‘property_bq.ini’ CONFIGURATION ‘DSN=GoogleBQ’

6 – Once completed you should now see something similar to the following.(hana-200416 is my Google BigQuery project id)

7 – In order to query the data and combine it with other enterprise data in HANA we will import it as a catalog object.

7 – We will now combine the  Real-Time Call Detail HOT data from HANA with the COLD data stored in Google BigQuery.  Using a calculation view we can select from both HOT and COLD in parallel and UNIOIN the results together.

8 – First Lets have a look at the aggregated results from Google BigQuery.

9 – Now a look at the results from the HANA calculation view.  You can see a match with the Cold data output from the HANA calculation view and the native GBQ output.

 10 – An important consideration when using SDA is ensuring you do not return all the rows from a federated query to HANA.  Instead you want to push down data intensive operations to the source and return a limited number of rows to HANA.  In the example we are looking at there are 1000 rows in GBQ but we only need to return one row from GBQ as an aggregated sum of the fact data.  There are a couple was to validate your federated query is being pushed down.

10a – You can look at the smart data access log.   In the below example you can see in green our aggregation SQL is being pushed down to GBQ and as a result one row being returned.  In red you can see no aggregation and 200 rows being returned.  The volume here is small, however when working with millions or billions of rows it becomes extremely important to not move such large datasets to HANA VIA a real-time SQL query.  If you need to work with the large unaggregated dataset in HANA then you can consider moving it to HANA via SDI

10b – You can look at the HANA Visualize plan to see more details on the execution plan.  You can see in the below trace in green the SQL statement that has been pushed to GBQ for execution.  One row is returned to the Union from both the HOT and Cold sources.  This can then be consumed by the BI layer.

Conclusion: This new bridge into Google BigQuery via HANA SDA opens up even more possibilities and integration opportunities to simplify your data landscapes.

 

Assigned Tags

      6 Comments
      You must be Logged on to comment or reply to a post.
      Author's profile photo Mike Howles
      Mike Howles

      Interesting blog.  I was able to install the Google BQ drivers and configure my .odbc.ini file with a DSN sample, however I am not able to test it in either HANA Studio as a Remote source or in command line with isql.  Same error message in either case:

      hxeadm@hxehost:/usr/sap/HXE/home> isql -v GoogleBQ
      [08001][unixODBC][Simba][ODBC] (10380) Unable to establish connection with data source. Missing settings: {[RefreshToken]}
      [ISQL]ERROR: Could not SQLConnect
      hxeadm@hxehost:/usr/sap/HXE/home>

      Am I missing where this gets configured?

      Author's profile photo Roland Kramer
      Roland Kramer

      already heard about - SAP (Sybase) IQ – the hidden treasure …

      Best Regards Roland

      Author's profile photo Former Member
      Former Member

      I've got similar issue but after modifying the OAuthMechanism=0 (Service Authentication) I could connect via isql.

      example:

      [ODBC]
      Trace=no
      [ODBC Data Sources]
      BQ=Simba ODBC Driver for Google BigQuery 64-bit
      [BQ]
      Description=Simba ODBC Driver for Google BigQuery (64-bit) DSN
      Driver=<path_to_libgooglebigqueryodbc_sb64.so>
      Catalog=<GCP_project_name>
      SQLDialect=1
      OAuthMechanism=0
      Email=xxxxxx
      KeyFilePath=<path_to_p12_file>

      You can see the exact service account address in the IAM setting of GCP console and also get the p12 file.

      Author's profile photo Williams Ruter
      Williams Ruter

      Hello,

      While creating my remote source i have the following message :

      SAP DBTech JDBC: [403]: internal error: : Safe mode not enabled for BigQuery adapter. Please turn on safe mode before creating remote source BIGQ: line 1 col 22 (at pos 21)

      Have you faced this already ?

      Thanks

       

      Author's profile photo Rahul Navandar
      Rahul Navandar

      Hi Williams,

      I am facing a similar error. Were you able to resolve this and how?

      Thanks,

      Rahul

      Author's profile photo Williams Ruter
      Williams Ruter

      Hey Rahul,

      Yes i fixed it, you need to restart hana after you activate bigquery in safe mode, only after that you can create your remote connection.

      If you have any issue please let me know or email me on williams.ruter@kochasoft.com

      Best regard

      Williams