Skip to Content
Technical Articles
Author's profile photo Sangeetha Krishnamoorthy

Querying SAP Data Warehouse Cloud from Amazon Athena using Amazon Athena Federated Query

This blog is part of technical resource for SAP TechEd session DT-200 : Amplify the Value of SAP Investments with Joint Reference Architectures

One of the major challenges businesses face is to figure out the best approach to combine the distributed data from internal and external sources. With the data federation-based architecture, businesses can bring together data from all the sources (without the need for duplication) to maximize value and draw insights from it

In our earlier blog , we saw how to federate queries from SAP Data Warehouse Cloud into Amazon Athena, thereby combining SAP & AWS data to build rich Analytical models for use with SAP Analytics Cloud dashboards.  

Now in this blog, we will see how this architecture applies in the other direction as well.  We will explain how Amazon Athena Federated queries can be used to seamlessly access SAP data by querying SAP Data Warehouse Cloud models (Illustrated by arrow 5 in the architecture diagram below.) 

 Bi-directional Data Federation  – arrows 4,5

This federated SAP data can now be combined with AWS data for dashboards or reporting applications in the AWS platform.  This architectural pattern is geared towards businesses that store massive amounts of data in AWS data lakes and can benefit from combining it with transactional data in SAP systems without the overhead of ETL processes. (For additional context, please read this AWS Big Data Blog) 

Solution Overview:

When a federated query is run, Amazon Athena identifies the parts of the query that should be routed to the data source connector and runs them with AWS Lambda. The data source connector makes the connection to the source, runs the query, and returns the results to Amazon Athena. If the data doesn’t fit into Lambda RAM runtime memory, it spills the data to Amazon S3 and is later accessed by Amazon Athena

Solution Overview Diagram

 

Steps for configuring query federation from Amazon Athena to SAP Data Warehouse Cloud:

Pre-requisites:

  • Access to an SAP Data warehouse Cloud tenant deployed on AWS infrastructure.  
  • Views in SAP Data Warehouse Cloud ready with data and exposed for consumption. 
  • Space admin access in SAP Data Warehouse Cloud to create database user.
  • Appropriate access for your AWS IAM user to be able to create secret, create Amazon S3 bucket, create and deploy lambda function, and full access to Amazon Athena.

Step 1 : Create a secret for the SAP DWC instance using AWS Secrets Manager.

  • On the Secrets Manager console, choose Secrets.
  • Choose Store a new secret.
  • For Secret type, select Other types of secret.
  • Enter the credentials for the database user of the SAP Data Warehouse Cloud as Key/Value pair. (See this document for help on how to create the database user in SAP Data Warehouse Cloud.)
    • Key = username, Value = <DB Username>
    • Key = password, Value = <DB User password>

Secret creation

  • For Secret name, enter a name for your secret. (Tip: Use a name with sapdwc in it, so it’s easy to find.)
  • Leave the remaining fields at their defaults and choose Next.
  • Complete your secret creation.

Step 2: Create an S3 bucket and subfolder for Lambda to use.

  • On the Amazon S3 console, create a new S3 bucket and subfolder for Lambda to use, such as below:

Configuring S3 spill bucket

Step 3: Configure Amazon Athena federation with the SAP Data Warehouse Cloud :

  • On the Amazon Athena console, choose Data sources in the navigation pane.
  • Choose Create data source.
  • For Choose a data source, search for the SAP HANA connector and choose Next.

Data source creation

  • For Data source name, provide a name such as “sapdwc”
  • Under Connection details, choose Create Lambda function.

Data source configuration 

  • For Application name, keep the default AthenaSaphanaConnector.
  • For SecretNamePrefix, enter the name which you created earlier.
  • For  SpillBucket, enter your S3 bucket name created earlier.
  • For JDBCConnectorConfig, use the format saphana://jdbc:sap://<dwc Hana host>:443/?${<secret-name>}.
  • For LambdaFunctionName, enter a name such as AthenaSapdwcConnector
  • For LambdaMemory, use the default 3008.
  • For LambdaTimeout, use the default 900.
  • For SecurityGroupID, enter the security group ID that is associated to the VPC ID corresponding to your subnet (a security group with default configuration should work fine).
  • For SpillPrefix, create a folder under the S3 bucket you created and specify the name (for example, “saphana”).
  • For Subnetids, enter the corresponding subnet that the Lambda function can use to access your data source. For example: subnet1, subnet2.
    Please refer the Appendix section to learn more about network setting.
  • Select theI acknowledge check box.
  • Choose Deploy.

The deployment takes 3–4 minutes to complete.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Deployed AWS Lambda function seen in connection details

  • After the function is created, choose the function name under Lambda function and keep the other options at their default.
  • Choose Next.
  • Choose Create Data Source.

The newly created data source can now be viewed in Amazon Athena’s Data sources view.

Amazon Athena Data Sources Tab view

Step4: Run federated queries with Amazon Athena.

  • In Athena Query editor, select the data source we created from the dropdown 
  • Select the database from the dropdown.  
  • If errors appear after selecting the data source in previous step, use the troubleshooting approaches as mentioned in the appendix of this AWS Big Data Blog. 
  • Tables from SAP Data Warehouse Cloud will be listed here in the Tables view.  
  • You can run your queries against this table or do a preview of the table data from the context menu of the table.  
  • The below picture shows data from SAP Data Warehouse Cloud federated live into Amazon Athena  

 

View in SAP Data Warehouse Cloud

SAP Data Warehouse Cloud data federated in Amazon Athena

 

Summary:

We saw how Amazon Athena federated queries can help query SAP data in Amazon Athena opening up possibilities for combining SAP data with data in AWS S3 data lakes for richer analytical applications.  

Together with our SAP Data Warehouse Cloud data federation architecture, we now thus have the capability of bi-directional query federation between SAP and AWS.  

 

Additional Resources:

https://docs.aws.amazon.com/athena/latest/ug/connectors-sap-hana.html

https://aws.amazon.com/blogs/big-data/improve-federated-queries-with-predicate-pushdown-in-amazon-athena/

https://github.com/awslabs/aws-athena-query-federation

 

Credits 

Many thanks to AWS team for their support and collaboration in validating this architecture – Yuva Athur, Ganesh Suryanarayanan, Krishnakumar Ramadoss, Sunny Patwari, Sabareesan Radhakrishnan and to Renga Sridharan and Scott Rigney for support and guidance. 

Thanks to SAP team members, for their contribution and collaboration for validating this architecture – Madankumar Pichamuthu, Karishma Kapur, Ran Bian, Sandesh Shinde, and to Sivakumar N, Anirban Majumdar for support and guidance. 

If you found this blog useful, please click the like button!  And if you have any feedback or questions, please post a comment below.  Or feel free to contact us directly atpaa@sap.com.    

Assigned Tags

      Be the first to leave a comment
      You must be Logged on to comment or reply to a post.