Integrating SAP HANA Data Lake to Google BigQuery – DL2BQ
Hi All,
Note: I have been exploring the different kinds of integration possibilities in the cloud and how we can provide maximum automation in this area. This is a small piece of work I am presenting now, and I will keep exploring. Thanks.
I am writing this blog post to share a newly developed Python library that helps us migrate data from SAP HANA Data Lake to BigQuery. I have been working on this problem for a long time – how to establish a really smooth integration between HANA Data Lake and BigQuery. The development is now done, the first release is ready for installation, and the source code is available in the Git repo.
Python Library hdltobq – https://pypi.org/project/hdltobq/
Source code – https://github.com/shivamshukla12/dl2bq
A Simple Architecture:
Pre-requisites: You must have your SAP BTP trial account up and running, your Data Lake instance running, and your credentials ready for open database connectivity.
You should also have your GCP trial account ready, and make sure you have downloaded the GCP credentials in JSON format locally on your system.
In short, both cloud accounts should be up and running.
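If you want to confirm the GCP side is ready before you start, a quick check like the one below can help. This is just a minimal sketch, assuming the google-cloud-bigquery package is installed; the project id and JSON path are the sample values used later in this post, so replace them with your own.

##Optional check (assumption): confirm the downloaded GCP service-account JSON actually works
from google.cloud import bigquery
from google.oauth2 import service_account

creds = service_account.Credentials.from_service_account_file(
    r'C:\Users\ABC\Downloads\igneous-study-316208-d66aebfd83ea.json')   ##your own JSON path
client = bigquery.Client(project='igneous-study-316208', credentials=creds)   ##your own project id
print(client.project)   ##prints the project id if the credentials are valid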
Data Lake Instance:
GCP Instance & BigQuery:
- Now go to your command prompt / terminal and install the library
pip install hdltobq
- If the installation is successful, you will be able to import it. After installation, try the imports below; if they work, you are good to go.
##Import below libraries...
import hdltobq
from hdltobq.hdltobq import BQConnect
- The library provides methods for connecting to GCP BigQuery, creating datasets, creating tables, and transporting contents.
Sample Inputs
###You should have your project & credentials ready for migrating data from Data Lake to BQ
bq_dataset = 'bigquery-public-data:hacker_news'    ## Your BQ Dataset if created, else create one
bq_project = 'igneous-study-316208'                ### This is Mandatory
bq_credentials = r'C:\Users\ABC\Downloads\igneous-study-316208-d66aebfd83ea.json'   ##Mandatory

##Initialize BQ
bq = BQConnect(bq_dataset, bq_project, bq_credentials)
bq_client, bq_ds = BQConnect.connect2bq(bq)
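The steps below also use a pandas DataFrame df holding the contents of the Data Lake table to be migrated. How you fetch it is up to you; below is only a minimal sketch, assuming you read the HOTEL table over the Data Lake's open database connectivity with pyodbc and pandas. The DSN, user and password are placeholders for your own connection details.

##A minimal sketch (assumption): read the source table from HANA Data Lake into a pandas DataFrame
##'HDL' is a placeholder ODBC DSN pointing to your Data Lake; UID/PWD are your own credentials
import pandas as pd
import pyodbc

hdl_conn = pyodbc.connect('DSN=HDL;UID=<user>;PWD=<password>')
df = pd.read_sql('SELECT * FROM HOTEL', hdl_conn)   ##this df is reused in the steps below
print(df.head())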
- Create Dataset
###Create new Dataset for your tables first.
lv_ab = BQConnect.create_dataset(bq_client, 'HANADL')

Output:
Creating DataSet.....
Created.. Thanks
- Create Table
### Create table ...
BQConnect.create_tab(bq_client, df, 'HOTEL')

Output:
Started Creating table..... igneous-study-316208.HANADL.HOTEL
Preparing Schema...
Ready.....
CRITICAL:root:Dataset igneous-study-316208.HANADL.HOTEL already exists, not creating.
- Finally, transport the data to BQ
####Command for BQ Insert
df.to_gbq('HANADL.HOTEL', project_id=bq_client.project, if_exists='append')
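Note: df.to_gbq() relies on the pandas-gbq package under the hood, so if the call complains about a missing dependency, install it with pip install pandas-gbq.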
Data Preview from Data Lake
GCP BQ Output
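If you prefer to double-check the load from code rather than from the BigQuery console, you can run a quick query against the new table. This is a minimal sketch reusing the bq_client created earlier, with the dataset and table names from this walkthrough.

##Optional: verify the row count of the freshly loaded table
query = 'SELECT COUNT(*) AS cnt FROM `{}.HANADL.HOTEL`'.format(bq_client.project)
for row in bq_client.query(query).result():
    print('Rows in HANADL.HOTEL:', row.cnt)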
- So here we come to an end: we have successfully transferred data from SAP HANA Data Lake to BigQuery. We will probably look at the transfer from BigQuery to SAP HANA Data Lake in the next post – till then, take care and keep learning.
PS: Finally, I am adding a small demo video of my work. Thanks.
PS: Please don't forget to share your valuable feedback, or any use case you have in mind to implement or try.