Data Extraction from Data Lake & Amazon Redshift Using SAP Data Services
Introduction:
In today's information management landscape, it is increasingly important to have a standardized method of data integration and ingestion, as well as data extraction from disparate data sources.
SAP Data Services, with its various built-in adapters and connectivity options, is an ideal tool to achieve these outcomes.
This article outlines how to connect to a data lake in an AWS environment and extract data from it to an on-premise system.
Main Part:
Pre-Requisites:
- SAP Data Services 4.2
- Amazon Redshift ODBC Driver
How To Implement The Solution:
Step 1: Install the Amazon Redshift ODBC driver locally, configure an ODBC DSN with the AWS Redshift database details, and test connectivity.
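Before involving Data Services, it is worth verifying the driver and connection details with a small standalone script. The following is a minimal sketch using pyodbc; the driver name, cluster endpoint, port, database, and credentials are placeholder assumptions that must be replaced with your own values.

```python
import pyodbc

# Placeholder connection details: replace the endpoint, database,
# and credentials with your own. The driver name must match the one
# registered by the Amazon Redshift ODBC driver installation
# (see pyodbc.drivers() or the ODBC Data Source Administrator).
conn_str = (
    "Driver={Amazon Redshift (x64)};"
    "Server=examplecluster.abc123xyz789.us-west-2.redshift.amazonaws.com;"
    "Port=5439;"
    "Database=dev;"
    "UID=awsuser;"
    "PWD=example_password;"
)

# A successful SELECT 1 confirms driver registration, network
# reachability, and credentials in a single step.
with pyodbc.connect(conn_str) as conn:
    print("Connectivity OK:", conn.cursor().execute("SELECT 1").fetchone()[0])
```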
Step 2: Install the ODBC driver on the SAP Data Services Job Server host and configure a DSN there with the same name and credentials as in Step 1; the procedure is the same as Step 1, and the DSN can be verified as sketched below.
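To confirm that the Job Server host exposes the same DSN, the data sources known to its driver manager can be listed and a connection made through the DSN by name. This is a minimal sketch to run on the Job Server host; the DSN name RedshiftDSN and the credentials are assumptions and must match what was configured in Step 1.

```python
import pyodbc

DSN_NAME = "RedshiftDSN"  # assumed DSN name; must match Step 1

# pyodbc.dataSources() returns the DSNs the local driver manager
# knows about; a missing entry means the DSN was not created on
# this host (or was created for a different bitness).
sources = pyodbc.dataSources()
if DSN_NAME not in sources:
    raise SystemExit(f"DSN '{DSN_NAME}' not found; available: {sorted(sources)}")

# Connecting by DSN exercises the exact path Data Services will use.
with pyodbc.connect(f"DSN={DSN_NAME};UID=awsuser;PWD=example_password;") as conn:
    print("Job Server DSN OK:", conn.cursor().execute("SELECT 1").fetchone()[0])
```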
Step 3: Create a datastore of type "Database" with Database Type "ODBC". Open the datastore and browse "External Metadata" to confirm that the Redshift tables are visible. Ensure that the necessary permissions are granted at the user level in AWS/Redshift for the database schema to be used; a quick check is sketched below.
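If External Metadata comes up empty, the usual cause is missing schema-level grants. The following sketch lists the tables the Data Services user can actually see, which mirrors what External Metadata will display; the schema name analytics, the user ds_user in the commented grants, and the DSN/credentials are all illustrative assumptions.

```python
import pyodbc

SCHEMA = "analytics"  # assumed schema used by the datastore

# Typical grants an administrator might run in Redshift if the
# listing below comes back empty (names are illustrative):
#   GRANT USAGE ON SCHEMA analytics TO ds_user;
#   GRANT SELECT ON ALL TABLES IN SCHEMA analytics TO ds_user;

with pyodbc.connect("DSN=RedshiftDSN;UID=awsuser;PWD=example_password;") as conn:
    cur = conn.cursor()
    # information_schema only returns objects the current user is
    # allowed to access, so this mirrors External Metadata.
    cur.execute(
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_schema = ?",
        SCHEMA,
    )
    for (table_name,) in cur.fetchall():
        print(table_name)
```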
Step 4: Create a batch job with the AWS/Redshift table as the source and a local table or file as the target, then execute it.
The run took around 6 minutes for 1 million+ records for a 1:1 extraction without any transformations, i.e. a throughput of roughly 2,800 records per second; a standalone equivalent of this extraction is sketched below for comparison.
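For a rough baseline, the 1:1 extraction the job performs is equivalent to a plain batched read of the source table into a local file. The sketch below does this outside Data Services; the table name analytics.sales, the output path, and the DSN/credentials are assumptions.

```python
import csv
import pyodbc

TABLE = "analytics.sales"     # assumed source table
OUTPUT = "sales_extract.csv"  # assumed local target file
BATCH = 10_000                # fetch size: trades memory for round trips

with pyodbc.connect("DSN=RedshiftDSN;UID=awsuser;PWD=example_password;") as conn, \
        open(OUTPUT, "w", newline="") as fh:
    cur = conn.cursor()
    cur.execute(f"SELECT * FROM {TABLE}")  # 1:1, no transformations
    writer = csv.writer(fh)
    writer.writerow(col[0] for col in cur.description)  # header row
    while True:
        rows = cur.fetchmany(BATCH)
        if not rows:
            break
        writer.writerows(rows)
```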
Conclusion:
Thus we can see that SAP Data Services can play a very important role in data integration with cloud solutions such as a data lake in AWS.
Comment from Amit k: Another way to do it is to use SDA (Smart Data Access) with AWS as the source, which exposes the data virtually; a flow graph can then be created on top of the virtual table to design the delta-load flow, and these flow graphs can later be scheduled using BODS.