Author: David Stocker

Data Export API: Building a Python Wrapper (Part 1)

This is part of a series on exploring the SAP Analytics Cloud (SAC) data export API. When the series is complete, it will also be available as a single tutorial mission.

Now that we’ve taken a tour of SAC’s Model Export API, we can begin constructing a Python wrapper to consume it in a Python application, without having to manually orchestrate the REST API interaction. For reference, the completed API wrapper is available in the SAP Samples organization on GitHub. In general, you’ll find the code on GitHub more defensively written and more verbose, while the code in the blog post itself is slimmed down to the minimum, to better convey the concept without distraction.
In this post, we’ll lay out:
  1. The boundary conditions (scope) of our wrapper.
  2. Our broad approach to connecting to SAC, managing authentication, and extracting data.
  3. The first lines of code, which get the authentication token.

The Scope and Boundary Conditions:

  • We will only support 2-legged OAuth, not 3-legged. 3-legged OAuth requires a callback URL, which would require us to host a web app to receive it. Since we’re building an API wrapper for Python scripts, which will be used in a command line environment or possibly in a Jupyter notebook, we’ll stick to 2-legged.
  • We want to minimize dependencies, so we’ll only use the most necessary packages. In practice, this means the requests_oauthlib and oauthlib packages, to handle the REST connectivity and OAuth authentication, respectively.
  • OData is very rich, regarding filter handling, lambda functions, etc. We could write a thousand lines of code for rich filter handling. We won’t do that, but instead cover the common (simple) use cases and allow hand crafted OData filter strings if the user wants something more complex.
  • We don’t need to touch every service endpoint. The Namespaces endpoint is of no use to us, at least for now, and is only there for formal OData compatibility. The provider service document gives us a subset of what we can extract from the provider OData EDMX document, so we’ll simply use the latter and skip the former.
  • We’re not trying to create something for long term (> 20 minutes) interactive support and are assuming that the script will run on shorter intervals, so we’ll skip renewing the OAuth token for now.
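The last point means a long-running script could end up holding a stale token. As a minimal sketch, and not part of the wrapper itself, a script could check the token before each call. This assumes the token is a dict carrying an `expires_at` timestamp (seconds since the epoch), which, as far as I know, oauthlib adds when the server reports `expires_in`. The helper name `tokenIsFresh` and the 60 second margin are hypothetical.

```python
import time

def tokenIsFresh(token, marginSeconds=60, now=None):
    """Return True if the token dict is valid for more than marginSeconds."""
    if not token or "expires_at" not in token:
        # No expiry information: treat as stale, so the caller re-authenticates
        return False
    if now is None:
        now = time.time()
    return token["expires_at"] - now > marginSeconds

# Hypothetical token, shaped like an OAuth token response with an expiry stamp
exampleToken = {"access_token": "abc123", "expires_at": time.time() + 1200}
print(tokenIsFresh(exampleToken))
```

If the check fails, the simplest remedy within our scope is to request a fresh token rather than to implement a refresh flow.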

Our general strategy:

1. We’ll create a single object for managing the data connectivity to SAC. This object will acquire the session OAuth token and store it.
2. We’ll touch the Providers endpoint, to enumerate the available models and their provider IDs.
3. We’ll use the Master, Fact and Audit endpoints, to get the respective data.
4. We’ll add OData $filter support.
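To make step 4 a little more concrete, here is a rough sketch of what "simple" filter support could look like. It is an illustration only: the helper names `simpleFilter` and `withFilter` are hypothetical, the column names are made up, and the finished wrapper may build its filters differently. Only string and numeric values are handled.

```python
from urllib.parse import quote

def simpleFilter(field, operator, value):
    """Build a basic OData comparison, such as  Region eq 'EMEA'."""
    allowed = ("eq", "ne", "gt", "ge", "lt", "le")
    if operator not in allowed:
        raise ValueError("Unsupported operator: %s" % operator)
    if isinstance(value, str):
        # OData string literals use single quotes; embedded quotes are doubled
        value = "'%s'" % value.replace("'", "''")
    return "%s %s %s" % (field, operator, value)

def withFilter(url, filterString):
    """Append a $filter query option to a URL that has no query string yet."""
    return url + "?$filter=" + quote(filterString, safe="")

print(simpleFilter("Region", "eq", "EMEA"))
```

A hand crafted filter string can be passed to `withFilter` unchanged, which is how the more complex cases would be covered.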

Prepping your environment

You can use a venv virtual environment or a global environment, as you please. Into this environment, you’ll need to install the oauthlib and requests_oauthlib packages.
pip install oauthlib
pip install requests_oauthlib

Imports

We are going to use requests_oauthlib to manage the HTTP requests and oauthlib to manage the OAuth authentication. We’ll also need two standard Python modules: json, for handling JSON response payloads, and minidom, for parsing the EDMX’s XML structure.
Begin your Python code with the following four imports.
import json
from oauthlib.oauth2 import BackendApplicationClient
from requests_oauthlib import OAuth2Session
from xml.dom import minidom

SACConnection class

Let’s start creating our SACConnection class. We’ll want to include a few things right from the start, in the class’s __init__() method.
  • We don’t want the user to have to copy and paste all of the different API endpoints, because we can generate them. All we need for this is the tenant ID and data center. We’ll generate endpoint URLs for getting the access token and for fetching the list of available providers (models), and create the base URL for the provider specific service endpoints, which we call urlProviderRoot.
  • We want the user to be able to pass in the tenant ID and data center as variables. They can readily be obtained from the browser URL and the user does not need to go into any administration dialogs.
  • We are going to want to list providers. When we know the provider ID, we’ll want to access information such as the description and the service URL. When we don’t know the provider ID, we’ll need a data structure that we can use to look up the ID if we know the descriptive name, or a portion of it. We’ll create empty dictionaries for these.
  • Once we parse the EDMX of a particular provider, we’ll want to store that information as well and we’ll create a dictionary for this.
class SACConnection(object):
    def __init__(self, tenantName, dataCenter):
        self.tenantName = tenantName
        self.dataCenter = dataCenter
        self.connectionNamespace = "sap"
        self.urlAccessToken = "https://" + tenantName + ".authentication." + dataCenter + ".hana.ondemand.com/oauth/token"
        self.urlProviders = "https://" + tenantName + "." + dataCenter + ".sapanalytics.cloud/api/v1/dataexport/administration/Namespaces(NamespaceID='sac')/Providers"
        self.urlProviderRoot = "https://" + tenantName + "." + dataCenter + ".sapanalytics.cloud/api/v1/dataexport/providers/sac"
        self.accessToken = None
        self.providers = {}
        self.providerLookup = {}
        self.modelMetadata = {}
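Since the tenant name and data center come from the browser URL, a small helper could extract them. This is only a sketch, assuming the host name follows the `<tenant>.<datacenter>.sapanalytics.cloud` pattern used in the endpoint URLs above. The function name `parseTenantUrl` and the example URL are hypothetical.

```python
from urllib.parse import urlparse

def parseTenantUrl(browserUrl):
    """Split an SAC browser URL into (tenantName, dataCenter)."""
    host = urlparse(browserUrl).hostname or ""
    suffix = ".sapanalytics.cloud"
    if not host.endswith(suffix):
        raise ValueError("Not an sapanalytics.cloud URL: %s" % browserUrl)
    # What remains in front of the suffix should be <tenant>.<datacenter>
    parts = host[:-len(suffix)].split(".")
    if len(parts) != 2:
        raise ValueError("Expected <tenant>.<datacenter> in: %s" % host)
    return parts[0], parts[1]

# Hypothetical tenant URL, for illustration only
tenantName, dataCenter = parseTenantUrl("https://mytenant.eu10.sapanalytics.cloud/some/page")
print(tenantName, dataCenter)
```

The two values can then be handed straight to the SACConnection constructor.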

Getting the Access Token

To get the access token, we’ll create an oauthlib BackendApplicationClient object and a requests_oauthlib OAuth2Session object. We’ll then use OAuth2Session’s fetch_token() method to acquire the OAuth token from the SAC tenant. We’ll wrap this in a method of SACConnection, which we’ll call getAccessToken().
def getAccessToken(self, clientID, clientSecret):
    self.clientID = clientID
    self.clientSecret = clientSecret

    client = BackendApplicationClient(client_id=clientID)
    self.oauth = OAuth2Session(client=client)
    self.accessToken = self.oauth.fetch_token(token_url=self.urlAccessToken, client_id=clientID, client_secret=clientSecret)
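For orientation, the 2 legged flow is the OAuth 2.0 client credentials grant (RFC 6749, section 4.4). The sketch below only assembles the pieces of such a token request with the standard library, without sending anything. It is not a description of exactly what requests_oauthlib puts on the wire, since the libraries may transmit the client credentials differently. The function name `buildTokenRequest` and the credentials are hypothetical.

```python
import base64
from urllib.parse import urlencode

def buildTokenRequest(clientID, clientSecret):
    """Return (headers, body) for a client credentials token request."""
    # Client ID and secret go into an HTTP Basic Authorization header
    credentials = ("%s:%s" % (clientID, clientSecret)).encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # The grant type tells the server that no end user is involved
    body = urlencode({"grant_type": "client_credentials"})
    return headers, body

headers, body = buildTokenRequest("my-client-id", "my-secret")
print(body)
```

In the wrapper, none of this needs to be written by hand; fetch_token() takes care of the exchange and returns the token as a dictionary.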

Retrieving the list of models

We’ll retrieve our list of models from the Providers endpoint. For each of the providers (models) in the returned JSON, we’ll add an entry to the providers dictionary in the SACConnection object, with the provider name, ID, description and the URL of the provider’s service object. We’ll begin by defining an SACProvider class, which we can use to store this data. The getProviders() method that follows it belongs to SACConnection.
class SACProvider(object):
    def __init__(self, providerID, providerName, description, serviceURL):
        self.providerID = providerID
        self.providerName = providerName
        self.namespace = "sap"
        self.description = description
        self.serviceURL = serviceURL
def getProviders(self):
    response = self.oauth.get(self.urlProviders)
    responseJson = json.loads(response.text)
    for provData in responseJson["value"]:
        providerID = provData["ProviderID"]
        providerName = provData["ProviderName"]
        description = provData["Description"]
        serviceURL = provData["ServiceURL"]
        provider = SACProvider(providerID, providerName, description, serviceURL)

        #Add the provider
        self.providers[providerID] = provider

        #Add the provider to the lookup index.  The end user will have access to the providerName, but not the providerID.
        #providerName might not be unique, so be defensive about it...
        if providerName not in self.providerLookup:
            self.providerLookup[providerName] = providerID
        else:
            #Name already taken: append " (1)", " (2)", ... until we find an unused key
            nNth = 1
            trialName = "%s (%s)" % (providerName, nNth)
            while trialName in self.providerLookup:
                nNth += 1
                trialName = "%s (%s)" % (providerName, nNth)
            self.providerLookup[trialName] = providerID
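To see the duplicate handling in isolation, the same idea can be run against a mocked payload, with no SAC connection. The payload below is invented for illustration; only the `value` list and its `ProviderID` and `ProviderName` keys are taken from the code above, and the function name `buildLookup` is hypothetical.

```python
import json

def buildLookup(responseText):
    """Map provider names to IDs, suffixing repeated names with (1), (2), ..."""
    lookup = {}
    for provData in json.loads(responseText)["value"]:
        name = provData["ProviderName"]
        key = name
        nNth = 1
        # Keep trying suffixed names until we find one that is not yet used
        while key in lookup:
            key = "%s (%s)" % (name, nNth)
            nNth += 1
        lookup[key] = provData["ProviderID"]
    return lookup

# Invented sample data with a repeated model name
sample = json.dumps({"value": [
    {"ProviderID": "id-001", "ProviderName": "Sales"},
    {"ProviderID": "id-002", "ProviderName": "Sales"},
    {"ProviderID": "id-003", "ProviderName": "Sales"},
]})
print(buildLookup(sample))
```

The three models end up under the keys "Sales", "Sales (1)" and "Sales (2)", so none of them hides the others.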
While we’re at it, we can wrap the getAccessToken() and getProviders() methods into a single call. We’ll call this method connect().
def connect(self, clientID, clientSecret):
    #Wrapper to cut down on the number of commands needed to initiate a session
    self.getAccessToken(clientID, clientSecret)
    self.getProviders()
We can’t expect users to know the provider ID, but they will likely know the name. We can add a small method to help find the ID, provided we know the name, or part of it. We’ll iterate over the provider names and use Python’s built-in str.find() method to check each model name for the sought substring. We’ll then return a dict with the matching provider names as keys, coupled with their corresponding provider IDs.
def searchProviders(self, searchstr):
    #Use this method to look up a provider ID, if you know the name of the model
    hits = {}
    for provName in self.providerLookup.keys():
        if provName.find(searchstr) > -1:
            hits[provName] = self.providerLookup[provName]
    return hits
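Note that find() is case sensitive, so a lower case search string will not match a mixed case model name. If that matters, a case-insensitive variant is easy to sketch. This is a standalone illustration rather than part of the wrapper; `searchLookup` is a hypothetical name, and it takes the lookup dictionary as a plain argument so that it can be tried without a connection.

```python
def searchLookup(providerLookup, searchstr):
    """Return {name: providerID} for names containing searchstr, ignoring case."""
    needle = searchstr.casefold()
    return {
        name: providerID
        for name, providerID in providerLookup.items()
        if needle in name.casefold()
    }

# Invented lookup table for illustration
lookup = {"Sales_Plan_2021": "id-001", "Finance_Actuals": "id-002"}
print(searchLookup(lookup, "sales"))
```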
Starting a session would then look like:
sac = SACConnection(<yourtenant>, <datacenter>)
sac.connect(<clientID>, <clientsecret>)
On my tenant, I have a few models with “TechEd2021” in the model name. I’d search for them with:
hits = sac.searchProviders("TechEd2021")
which returns:
{
    'TechEd2021_NatParks_Demo2': 'Cdlvmnd17edjumrnshekknxo8w', 
    'TechEd2021_Demo1': 'C5f5s07ihms4ij9x3g40e2qdc0'
}
Next time, we’ll interrogate the model EDMX and build out the per model metadata.
