Connecting Google Cloud Storage with SAP DataIntelligence & Python
This blog assumes that you have access to a Google Cloud account, have created a project there, and have a DataIntelligence instance of version 3.1 or higher.
1. Log in to your Google Cloud Storage account.
2. Navigate to IAM – Service Accounts.
3. For your project, this shows the existing service accounts. If you do not have a service account, create one.
4. Choose a service account with existing keys, or create a new key.
5. Download the key in JSON format to your local machine. This can be done by clicking on 'Add key'. Make a note of the project name.
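The downloaded key file records the project id itself, so it can be read from the file rather than copied by hand. The following is a minimal sketch, assuming the key is a standard Google service account key; the helper name read_key_info and the list of fields it checks are illustrative choices, not part of any Google or SAP API.

```python
import json

# Fields that a Google service account key file is expected to contain.
# This subset is an assumption chosen for a quick sanity check.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def read_key_info(key_path):
    """Load a service account key file and return (project_id, client_email)."""
    with open(key_path) as key_file:
        key = json.load(key_file)

    # Fail early if the file is not a complete service account key.
    missing = [field for field in REQUIRED_FIELDS if field not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")

    return key["project_id"], key["client_email"]
```

Calling read_key_info with the path of the downloaded file returns the project id to use in the connection settings, along with the service account email the key belongs to.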
Now open Connection Management in SAP DI.
Create a new connection of type Google Cloud Storage.
The project ID must exactly match the project ID from GCP.
Provide the key file downloaded in JSON format.
For more details, refer to the help documentation.
Test the Connection!
You should also be able to browse this connection using the DataIntelligence Metadata Explorer and view the bucket contents.
Connecting GCS using Python
Using ML Scenario Manager and notebooks, you can connect to GCS from a Python notebook.
Upload the JSON key file to DataIntelligence. Here we uploaded it to /vrep/mlup.
Type in the following code:
import json

from gcloud import storage
from oauth2client.service_account import ServiceAccountCredentials

project_id = 'sap-digitalman*****'
bucket_name = 'ifn***'

# Load the service account key that was uploaded to DataIntelligence
with open('/vrep/mlup/sap-digitalmanu*****.json') as json_file:
    credentials_dict = json.load(json_file)

# Build credentials from the key and create the storage client.
# Note: project takes the project_id variable, not the string 'project_id'.
credentials = ServiceAccountCredentials.from_json_keyfile_dict(credentials_dict)
client = storage.Client(credentials=credentials, project=project_id)

# Print the name of every object in the bucket
bucket = client.get_bucket(bucket_name)
blobs = bucket.list_blobs()
for blob in blobs:
    print(blob.name)
In case of missing modules, you may have to install them:
pip install --upgrade google-cloud-storage
pip install --upgrade gcloud
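To find out which modules are actually missing before installing anything, the notebook can ask Python directly. This is a small sketch using only the standard library; the helper name missing_modules is hypothetical, and the two module names passed to it are the ones imported by the notebook code in this blog.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found in this environment."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except ImportError:
            # Raised when a parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Modules imported by the notebook code; an empty list means nothing is missing.
print(missing_modules(["gcloud", "oauth2client"]))
```

Any name printed here is a candidate for a pip install in the notebook environment.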
The output should list the bucket contents.
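For larger buckets it can be handy to collect the names in a list and to restrict the listing to a name prefix. The sketch below wraps the same get_bucket and list_blobs calls used above; the function name list_blob_names is hypothetical, and it assumes the client object created in the notebook code is passed in.

```python
def list_blob_names(client, bucket_name, prefix=None):
    """Return the names of the objects in a bucket, optionally limited to a name prefix."""
    bucket = client.get_bucket(bucket_name)
    # prefix=None lists everything; a prefix such as 'folder/' narrows the listing.
    return [blob.name for blob in bucket.list_blobs(prefix=prefix)]
```

With the client and bucket_name from the notebook, list_blob_names(client, bucket_name) returns the same names that the loop prints, and passing a prefix limits the result to objects whose names start with it.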