Extracting SAP data from your on-premise SAP system with Amazon AppFlow SAP OData Connector
SAP systems form a major part of most customers' landscapes. Most SAP systems hold critical data and cannot be exposed to the public internet; they need to be accessed over secure connections.
Customers are constantly looking to get more value from their SAP data by leveraging data and analytics capabilities on the cloud. Customers want to build data lakes on a cloud such as Amazon Web Services (AWS) and combine SAP data with non-SAP data to gain insights previously not possible. By doing this, customers can take advantage of the advanced data and analytics services of cloud providers, regardless of where the SAP system resides. Hence, a secure connection becomes very important.
In one such use case, we wanted to extract master data from an SAP system. The Amazon AppFlow SAP OData Connector, released in Q3 2021, is one such tool that allows customers to extract data from SAP ERP / SAP HANA / SAP BW systems that are on-premises or in any cloud other than AWS. The data is then stored in Amazon S3 to be combined with non-SAP data and/or analyzed by machine learning algorithms.
In this blog we will walk through the configuration steps required to set up an Amazon AppFlow SAP OData connection to an SAP system that does not run on AWS or is an on-premises system.
SAP systems are usually protected and not open to the internet. If the system is behind a VPN or on a different (non-AWS) cloud, additional configuration is needed to enable connectivity through AWS PrivateLink, the encrypted connectivity setup required by Amazon AppFlow.
In this blog, when we refer to the source system we mean Amazon AppFlow (since it initiates the request), and by the target system we mean the on-premises SAP system (which needs to be connected to for data extraction).
Below are the high-level steps that need to be performed:
- Setting up AWS VPC, Subnets and Route Table Configurations.
- Site-to-Site VPN Connectivity set up
- Target Groups and Network Load Balancer set up
- Public Hosted Zone in Route 53, with validation records for the domain name, VPC Endpoint, and AWS Certificate Manager (ACM) Certificate
- SSL certificates set up – generated through ACM
- VPC Endpoint service configuration
- Calling the SAP OData service through Amazon AppFlow
For this walkthrough, you should have the following prerequisites and access:
- An AWS account
- VPC Console Access – To create subnets, VPC end-point services, Site-to-Site VPN connection, Customer Gateway, Virtual Private Gateway
- EC2 Console Access – To create Network Load Balancer and Target Groups
- Route 53 Access – To create the Hosted Zone, route to the right Load Balancer, and validate the Endpoint Service and SSL Certificate
- AWS Certificate Manager – To provision a certificate for the SSL connection
- Amazon S3 – All bucket-related access
- On-premises SAP system user access with authorization to use OData services
- From the target end: the VPN tunnel IPs, ports, and the internal IP of the server the SAP system is hosted on
Step 1: Setting up AWS VPC, Subnets and Route Table configurations
Let’s begin by creating the Virtual Private Cloud (VPC). The VPC is the private network that will hold the load balancer, endpoint service, and other components needed for this setup. Tag the VPC with an appropriate name so that it is easy to identify.
You can use the “Create VPC” wizard from the VPC Dashboard on the AWS console to create the VPC, subnets, and route tables in a visualized way, following this guide.
- Ensure you create subnets in at least 50% of the Availability Zones (AZs) of the Region; the Network Load Balancer must span at least 50% of the AZs for Amazon AppFlow to work.
- Enable DNS hostnames for the VPC. A DNS hostname is a name that uniquely and absolutely names a computer; it is composed of a host name and a domain name, and DNS servers resolve DNS hostnames to their corresponding IP addresses. DNS hostnames must be enabled to use an internet gateway or other gateways.
Once the VPC and its components are created, create an Internet Gateway, attach it to the VPC, and update the route table created above so that routes through the gateway are propagated correctly.
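For readers who prefer the AWS CLI over the console wizard, the steps above can be sketched roughly as follows. All IDs, names, CIDR ranges, and AZs are illustrative placeholders, not values from this setup:

```shell
# Create the VPC and tag it for easy identification (CIDR is an example)
aws ec2 create-vpc --cidr-block 10.0.0.0/16 \
  --tag-specifications 'ResourceType=vpc,Tags=[{Key=Name,Value=sap-appflow-vpc}]'

# Enable DNS hostnames, required to use the gateways
aws ec2 modify-vpc-attribute --vpc-id vpc-0abc123 --enable-dns-hostnames '{"Value":true}'

# Subnets in at least 50% of the Region's Availability Zones
aws ec2 create-subnet --vpc-id vpc-0abc123 --cidr-block 10.0.1.0/24 --availability-zone eu-west-1a
aws ec2 create-subnet --vpc-id vpc-0abc123 --cidr-block 10.0.2.0/24 --availability-zone eu-west-1b

# Internet gateway, attached to the VPC and added to the route table
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-0abc123 --vpc-id vpc-0abc123
aws ec2 create-route --route-table-id rtb-0abc123 \
  --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0abc123
```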
With the VPC setup completed, the next step is to set up the AWS Site-to-Site VPN, which consists of a number of steps.
Step 2: Setting up the AWS Site-to-Site VPN
To set up the VPN, follow the AWS Site-to-Site VPN user guide. It walks you through the key steps:
- Create the customer gateway
- Create a target gateway (Virtual Private Gateway)
- Configure routing
- Create a Site-to-Site VPN connection
- Configure the customer gateway device (Download the configuration file)
In our case, due to a high-availability VPN configuration, we have 2 external IPs, so 2 customer gateways were configured. Be sure to tag the customer gateways for easy identification.
Once both sides are aligned, the tunnels should show as “Up”. If they show “Down”, revisit the tunnel configuration in the Site-to-Site VPN connection and align it with the target side before proceeding with any further steps.
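The equivalent AWS CLI sketch for the VPN setup, assuming static routing; the IDs, public IP, and on-premises CIDR are placeholders (with an HA setup, repeat `create-customer-gateway` for the second external IP):

```shell
# One customer gateway per on-premises external IP
aws ec2 create-customer-gateway --type ipsec.1 --public-ip 203.0.113.10 --bgp-asn 65000

# Virtual private gateway, attached to the VPC from Step 1
aws ec2 create-vpn-gateway --type ipsec.1
aws ec2 attach-vpn-gateway --vpn-gateway-id vgw-0abc123 --vpc-id vpc-0abc123

# Propagate VPN routes into the VPC route table
aws ec2 enable-vgw-route-propagation --route-table-id rtb-0abc123 --gateway-id vgw-0abc123

# The VPN connection itself, with a static route to the SAP server's network
aws ec2 create-vpn-connection --type ipsec.1 \
  --customer-gateway-id cgw-0abc123 --vpn-gateway-id vgw-0abc123 \
  --options StaticRoutesOnly=true
aws ec2 create-vpn-connection-route --vpn-connection-id vpn-0abc123 \
  --destination-cidr-block 192.168.10.0/24
```

The configuration file for the customer gateway device can then be downloaded from the console (or with `aws ec2 get-vpn-connection-device-sample-configuration`) and applied on the target side.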
Step 3: Target Groups and Network Load Balancer Set up
In this step, a target group and a Network Load Balancer (NLB) must be configured and attached to the VPC. This step is necessary to listen for requests and route them over the VPN tunnels appropriately.
To create the target group, the internal IP of the server hosting the on-premises SAP system is needed, along with the port on which the server accepts requests.
When creating the target group, choose the target type “IP addresses”, provide a target group name, and specify the transport protocol and port (the port open to receive requests).
Once the target group is configured, its status will be “Unassigned”. This will change once the Network Load Balancer (NLB) is attached to the target group.
In the EC2 console, create the Network Load Balancer:
- Give the load balancer a name.
- Choose the scheme “Internal”, as it will be used only within the VPC.
- Choose the VPC created in Step 1 and assign all the subnets.
- The NLB should be available in at least 50% of the AZs of the Region to work with Amazon AppFlow.
Under Listeners, select the protocol “TCP” and the port “443”. Choose the target group created above and create the load balancer. Further listeners can be added later.
The target group will now be in assigned status, and if the connection through the VPN tunnels is up, the health check of the internal IP will show “Healthy”.
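As a rough CLI equivalent of this step (the ARNs, subnet IDs, and SAP server IP are placeholders):

```shell
# Target group of type "ip", pointing at the on-premises SAP server
aws elbv2 create-target-group --name sap-odata-tg --protocol TCP --port 443 \
  --target-type ip --vpc-id vpc-0abc123
aws elbv2 register-targets \
  --target-group-arn arn:aws:elasticloadbalancing:eu-west-1:111122223333:targetgroup/sap-odata-tg/abc \
  --targets Id=192.168.10.5,Port=443

# Internal NLB spanning the subnets created earlier
aws elbv2 create-load-balancer --name sap-odata-nlb --type network --scheme internal \
  --subnets subnet-0aaa111 subnet-0bbb222

# TCP:443 listener forwarding to the target group
aws elbv2 create-listener \
  --load-balancer-arn arn:aws:elasticloadbalancing:eu-west-1:111122223333:loadbalancer/net/sap-odata-nlb/xyz \
  --protocol TCP --port 443 \
  --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:eu-west-1:111122223333:targetgroup/sap-odata-tg/abc
```

Note that registering an IP target outside the VPC CIDR (the on-premises server) is what makes the traffic flow over the VPN tunnels.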
Step 4: Public Hosted Zone and Route53 configurations
The next step, configuring an Amazon Route 53 public hosted zone, is crucial. It acts as a bridge between AppFlow and all the other components and is necessary to route requests correctly.
Route 53 is a separate console, and a valid domain name is needed to create the entry. If there is no valid domain name, or you don’t have access to the DNS records to validate/approve changes, a domain name can easily be registered in the Route 53 console.
To create a hosted zone for an existing domain name, enter the domain name in the field; an optional description can also be entered. Clicking “Create” will create the public hosted zone.
Once created, the zone will have 2 records, “NS” and “SOA”. Additional records need to be added to the zone.
The first record to add points to the Network Load Balancer. To do that, click on “Create record”.
Choose the record type “A” and keep the record name blank. Provide the DNS name of the Network Load Balancer as the value and create the record.
The other 2 records, for the VPC endpoint service and Certificate Manager, will be created in a similar way in subsequent steps.
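In CLI terms, the hosted zone and the alias “A” record pointing at the NLB look roughly like this. The domain, zone IDs, and NLB DNS name are placeholders; note that the `AliasTarget` hosted zone ID must be the NLB’s own Region-specific zone ID, not the ID of your hosted zone:

```shell
# Public hosted zone for the domain
aws route53 create-hosted-zone --name example.com --caller-reference sap-appflow-$(date +%s)

# Alias "A" record at the zone apex, pointing to the NLB's DNS name
aws route53 change-resource-record-sets --hosted-zone-id Z0HOSTEDZONE \
  --change-batch '{
    "Changes": [{
      "Action": "CREATE",
      "ResourceRecordSet": {
        "Name": "example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "ZNLBREGIONID",
          "DNSName": "sap-odata-nlb-0123.elb.eu-west-1.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }]
  }'
```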
Step 5: SSL certificate generation through AWS Certificate Manager (ACM) and Route 53 setup.
The connectivity through AWS PrivateLink needs the flow to be encrypted using SSL certificates. Without this, the connection will not be successful.
The certificates can be created in AWS Certificate Manager (ACM) and assigned to the flow, or externally created certificates can be imported into ACM and used for this purpose. These certificates must be trusted certificates issued by a third-party certificate authority; self-signed and wildcard certificates will not work.
In this step, we will provision a certificate for the domain name given in the public hosted zone (above) and validate the certificate by adding a record in Route 53.
Open ACM, click on “Request”, and choose “Request a public certificate”.
The next page will show the certificate details, with the status “Pending validation”.
Open the Route 53 public hosted zone and click on “Create record”. The Certificate Manager expects a record of type CNAME, so choose the record type “CNAME”.
In the “Name” field, copy the value under “CNAME name” without the domain name extension.
In the “Value” field, copy the value under “CNAME value” and save the record.
After the record is saved, refreshing the page in AWS Certificate Manager will show the certificate status as “Issued”, and it is ready to use in the flow.
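The same request and validation-record lookup via the CLI (the domain and certificate ARN are placeholders):

```shell
# Request a public certificate with DNS validation
aws acm request-certificate --domain-name example.com --validation-method DNS

# Read the CNAME name/value pair to add in the Route 53 hosted zone
aws acm describe-certificate \
  --certificate-arn arn:aws:acm:eu-west-1:111122223333:certificate/abcd-ef01 \
  --query 'Certificate.DomainValidationOptions[0].ResourceRecord'
```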
Step 6: VPC Endpoint service configuration and Route 53 set up.
The last component that needs to be created to complete the flow is the VPC endpoint service. This is the service called by Amazon AppFlow to encrypt and send the call through AWS PrivateLink.
Go to the VPC console and open “Endpoint services”. Click on “Create endpoint service”.
Give it a name and choose “Network Load Balancer”. From the list of available load balancers in the Region, choose the one that was created in Step 3. Keep the rest of the defaults and create the service.
Once it is provisioned, go to the “Allow principals” tab, click on “Allow principals”, and add “appflow.amazonaws.com” as the principal.
Next, click on the “Endpoint connections” tab, choose the connection, and click on “Accept”. This will set the connection to “Available”.
Go back to the Network Load Balancer and edit the listener to attach the certificate created in ACM (Step 5).
Add the endpoint service to the Route 53 record set. As described in the steps above, add a new record: the type will be “TXT”, the name is the “Domain verification name”, and the value is the “Domain verification value”.
Once this record is added, the Route 53 hosted zone will contain record sets for the Network Load Balancer, the ACM certificate validation, and the endpoint service domain verification.
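A CLI sketch of the endpoint service configuration (the service, NLB, and endpoint identifiers are placeholders):

```shell
# Endpoint service fronted by the NLB, with manual acceptance of connections
aws ec2 create-vpc-endpoint-service-configuration \
  --network-load-balancer-arns arn:aws:elasticloadbalancing:eu-west-1:111122223333:loadbalancer/net/sap-odata-nlb/xyz \
  --acceptance-required

# Allow Amazon AppFlow as a principal on the service
aws ec2 modify-vpc-endpoint-service-permissions \
  --service-id vpce-svc-0abc123 --add-allowed-principals appflow.amazonaws.com

# Accept the pending endpoint connection so it becomes "Available"
aws ec2 accept-vpc-endpoint-connections \
  --service-id vpce-svc-0abc123 --vpc-endpoint-ids vpce-0def456
```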
Step 7: Creating the connection and calling the OData service through AppFlow.
This step brings all the above configuration together.
Go to Amazon AppFlow and open “Connections”. Select the connector type “SAP OData” and click “Create connection”.
- Enter the “Application Host URL” of the SAP system from which the data must be pulled.
- Enter the application service path as “/sap/opu/odata/iwfnd/catalogservice;v=2”.
- Enter the port number as “443”.
- Enter a valid SAP system client number, logon language, user name and password.
- For the PrivateLink radio button, choose “Enabled”, and copy the service name of the VPC endpoint service created in Step 6 into this field.
Provide a name to the connection and click on continue.
The connection should get created successfully.
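The same connection can also be created with the AWS CLI. The command below is a sketch; the profile name and JSON file are illustrative, and the exact field names inside the profile JSON should be checked against the current `create-connector-profile` reference before relying on them:

```shell
# Create the SAP OData connector profile over PrivateLink (values are placeholders)
aws appflow create-connector-profile \
  --connector-profile-name sap-onprem-odata \
  --connector-type SAPOData \
  --connection-mode Private \
  --connector-profile-config file://sap-odata-profile.json
```

Here `sap-odata-profile.json` would carry the same values entered in the console: the application host URL, the catalog service path, port 443, the client number, logon language, credentials, and the endpoint service name for PrivateLink.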
In summary, many SAP systems are on-premises or hosted on a cloud platform other than AWS, and connectivity to such systems always requires a secure connection as they reside in different networks. In such cases, a private and encrypted connection using the AWS services above can be set up to securely extract data from SAP. This ensures that the SAP data is safely accessed and protects the system from being opened to the internet, thereby reducing the security risk.
I hope that this blog post has been useful. I welcome any feedback you may have, even if it is to correct where I may have erred as I am sure there are experts that have more experience in this area.
There are other useful blogs in the AWS topic area: https://community.sap.com/search/?ct=blog&q=AWS
You may also post queries and questions here: https://community.sap.com/search/?ct=qa&q=AWS
Thanks for reading.