How to Load a CSV File from SAP BODS to Google Cloud Storage
Requirement: Load data from a source database into a comma-separated CSV file and upload that file to a Google Cloud Storage bucket.
Resolution: There are three components involved in this process: the source database, the ETL tool (SAP BODS), and the target Google Cloud Storage.
The configuration is done in both SAP BODS and Google Cloud Storage.
- Source Database:
A datastore is created for the source database (in this case an Oracle database) and the source table is imported into it. The image below shows the datastore details; once connected, import the source table and use it in the data flow.
- Setting up the Google Cloud Bucket in SAP BODS:
1. Go to File Locations in the Format tab of the SAP Data Services Local Object Library.
2. Right-click and select New.
3. Select Google Cloud Storage as the protocol.
4. Enter a File Location Name and fill in the configuration details for Google Cloud Platform.
The following information is required from Google Cloud Platform:
i) Project: the Google Cloud project name.
ii) Upload URL, Download URL: the default API URLs provided by Google.
iii) Authentication Server URL: keep the default; it uses the OAuth 2.0 web access service provider.
iv) Authentication Access Scope: Read Only, Read Write, or Full Control; set it according to the requirement.
v) Service Account Email Address: taken from Google Cloud Platform.
vi) Service Account Private Key: a PKCS#12-formatted private key whose password is 'notasecret'; each key is cryptographically unique. Download it from Google Cloud Platform.
vii) File System / Bucket: taken from Google Cloud Platform.
viii) Local Directory: the job server's local directory. The rest of the parameters can be kept at their defaults.
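The three Authentication Access Scope options correspond to Google's documented OAuth 2.0 scope URLs for Cloud Storage. The scope URLs below are Google's standard values; the lookup helper itself is just an illustrative sketch:

```python
# Google Cloud Storage OAuth 2.0 scopes matching the three
# Authentication Access Scope options in the BODS file location.
GCS_SCOPES = {
    "Read Only": "https://www.googleapis.com/auth/devstorage.read_only",
    "Read Write": "https://www.googleapis.com/auth/devstorage.read_write",
    "Full Control": "https://www.googleapis.com/auth/devstorage.full_control",
}

def scope_for(access_level: str) -> str:
    """Return the OAuth scope URL for a given access level."""
    return GCS_SCOPES[access_level]

print(scope_for("Full Control"))
```

A write-only job only needs Read Write; Full Control additionally allows managing object ACLs.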
5. Create a flat file format in the Flat Files tab and, when setting up the Location, select the file location created in step 4. In the example below it is set to Google Cloud Test, and in File Names provide the name of the file to be created in the bucket.
- Google Cloud Platform:
Go to the Google Cloud Platform Console and open the project. If a project has not been created yet, create a new one.
1. Open the Cloud Platform Console Credentials page.
2. If it is not already selected, select the project you are creating credentials for.
3. Since SAP BODS is an external tool, it uses an OAuth Client ID for credential creation.
4. Since SAP is not listed, select Other and provide a name for the credential.
5. Next, the OAuth consent screen asks for an email address.
6. After saving the consent, the service account keys are created.
7. For a new or existing project, go to the project settings from the dashboard.
8. In the project settings, go to Service Accounts under IAM & admin on the left side.
9. Go to Service Accounts and click Create Service Account.
10. Select Furnish a new private key and enter the service account name.
11. A private key is generated; choose whether to download it as a standard P12 file or as a JSON file that can be loaded by a Google API client library. Place it in a location where it can be accessed, and note down the email address shown on the screen.
12. Make sure a bucket is created for the project. Go to Storage and check the bucket name; if one is not present, create it using Create Bucket.
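If the key in step 11 is downloaded as a JSON file, the service account email address that BODS asks for is embedded in the file itself. A minimal sketch, assuming Google's standard JSON key layout (the file path used here is hypothetical):

```python
import json

def read_service_account_email(key_path: str) -> str:
    """Extract the service account email from a JSON key file.

    Standard Google JSON keys contain the fields 'client_email',
    'private_key', and 'project_id', among others.
    """
    with open(key_path) as f:
        key = json.load(f)
    return key["client_email"]
```

This saves hunting through the console for the address again; for P12 keys the email is only shown in the console at creation time.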
- Final Setup:
Based on the sections above, SAP BODS requires five parameters from Google Cloud to complete the setup. They are mapped as follows while creating the file location:
- Project: the Google project name (step 8 in the Google Cloud Platform section)
- Authentication Access Scope: select Full Control, or another scope based on the requirement
- Service Account Email Address: (step 11 in the Google Cloud Platform section)
- Service Account Private Key: browse to the location where the key was downloaded (step 11 in the Google Cloud Platform section)
- Bucket: (step 12 in the Google Cloud Platform section)
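Under the hood, uploads to a bucket go through Google's JSON API media-upload endpoint, which is the pattern behind the default Upload URL parameter. A sketch of how the final request URL is composed from the bucket and file name (the bucket `my-bucket` and file name are placeholder values):

```python
from urllib.parse import quote

def gcs_upload_url(bucket: str, object_name: str) -> str:
    """Compose the GCS JSON API simple-upload URL for a bucket/object pair.

    Endpoint pattern per Google's JSON API docs:
    https://storage.googleapis.com/upload/storage/v1/b/BUCKET/o
        ?uploadType=media&name=OBJECT
    """
    return (
        "https://storage.googleapis.com/upload/storage/v1"
        f"/b/{quote(bucket, safe='')}/o?uploadType=media"
        f"&name={quote(object_name, safe='')}"
    )

print(gcs_upload_url("my-bucket", "GoogleCFGDATA.csv"))
```

The percent-encoding matters when object names contain slashes or spaces; bucket and object names are encoded separately.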
Create a dataflow with the table as source and a flat file as target using the steps above. Below is a simple dataflow that copies data from the table DATA_SOURCE_CFG into a CSV file target named GoogleCFGDATA.
Validate and execute the dataflow. Once it completes successfully, check the Google Cloud Storage bucket to confirm the file was created there.
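The layout the flat-file target produces can be sketched outside of BODS in a few lines, which is handy for agreeing on the expected file format before wiring up the dataflow. The column names and rows below are invented stand-ins for the DATA_SOURCE_CFG table:

```python
import csv
import io

# Hypothetical rows standing in for the DATA_SOURCE_CFG table contents.
rows = [
    {"CFG_ID": 1, "CFG_NAME": "batch_size", "CFG_VALUE": "500"},
    {"CFG_ID": 2, "CFG_NAME": "retry_count", "CFG_VALUE": "3"},
]

# Write a comma-separated file matching the flat-file target layout:
# a header row followed by one line per source row.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["CFG_ID", "CFG_NAME", "CFG_VALUE"])
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```

Comparing a sample like this against the file that lands in the bucket is a quick sanity check on delimiters and header settings.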
Very well explained. 🙂
Thanks a lot Rajan.
We are doing exactly what you posted but ran into an issue.
It appears that while writing files into Google Storage, Google detects that we are issuing "Create Bucket" requests.
Can you check whether that's also happening to you?
This gives us the error:
"If your application tries to use more than its limit, additional requests will fail. Throttle your client's requests, and/or use truncated exponential backoff."
Thank you. I have a question, please:
Is it possible to put the bucket name as a parameter? I tried to connect to AWS without setting the bucket as a parameter with no errors, but when I tried to set it as a parameter I got errors in the connection.
Thank you again.
Nice blog. We write CSV files to a Google bucket as you explained. Just like in your example, files at our site are created with type application/octet-stream. If we transfer files with the GSUTIL utility, the type is text/csv. Is there a way to create the CSV files with type text/csv directly from BODS?