Using SAP Data Services 4.2 we will walk through an example of consuming a REST web services via a Web Service REST Datastore.
To get a basic idea of what we will do, you can read through the SAP DS Integrator Guide, specifically the REST portions of these three sections:
Also the processes is essentially the same as consuming as SOAP web service for which Saurav Mitra has posted an excellent article over at dwbi.org
A summary of the steps we will take:
1. Procure the WADL file for the REST web service we want to consume
2. Create the Datastore and function from the WADL file
3. Create our Dataflow that will consume the function (REST web service)
The REST service we will use in our example is Google’s Geocoding API. I’m not advocating using this as a way to mass geocode addresses in Data Services, it simply is a nice open API to use for demoing the REST capabilities.
Here’s an example Call:
In a browser you can see the results as:
Let’s get started!
1. Procuring the WADL
This can be one of the most difficult part of this effort. Sometimes REST service publishers do make a WADL available and sometimes they do not. The WADL is what defines the REST service call for Data Services, similar to the WSDL for a SOAP service. You will very likely have to create your own WADL, or use the help of some tools to do so.
If you want to skip this step you can simply download the attached WADL file to your PC with DS Designer on it, and save the file without the .txt extension (GoogleGeocodeAPI.wadl). Note: if Google changes their API after this blog is posted, you may have to tweak or recreate these. Also, passing in different address types to Google’s API can yield different XML results, meaning the schema we are tying into here isn’t necessarily complete. For the sake of demo we will use some addresses that yield similar results
One free tool that can help us generate the WADL is SoapUI by Smart Bear.
If you download the free version of SoapUI you can
a. Create a new REST Project (File menu –> New REST Project)
b. Enter the service URI when prompted, and click OK
c. In your new project window, you can test the service by clicking the submit button and viewing the result pane.
d. To generate the WADL, right click on the service in the project navigator and choose “Export WADL”
Note: wherever you save it, it will be called “_1.wadl”
e. If you open the file in your favorite XML editor, you’ll notice that it contains information on the parameters to pass to the service (outlined in green), but no details about the response format (outlined in red). Note: the geocode api has more parameters than we show here, but because we only specified these parameters in our sample call in SoapUI, they are the only ones present in the generated WADL.
f. To address the missing response info we need to add a <grammars> section just after the first <doc /> tag
g. Within the grammars section we need to add the schema of the response. To do this, copy the XML Response from the SoapUI test we did, and paste into an XML to XSD generator. For example here is one.
h. Copy the schema output into the grammars section
i. Finally we have to tie the response to the schema. Add the following attributes to the response <representation /> tag:
j. Save the WADL file…. preferably as something other than _1.wadl
2. Creating the Datastore from the WADL file
a. Launch SAP Data Services Designer
b. On the Datastore tab of the Object Library view, right click on a blank area and choose New
c. Give your data store a name, select type: Web Service REST, and in the Web Service URI browse to the WADL file we created in step 1, and click OK to create.
If the create was successful, GREAT JOB! You will see your new Datastore listed. If it was not successful, there is something wrong with your WADL file, designer will indicate the line and column where it had the issue, try to find and correct the related issue and then try to recreate the Datastore. We worked through several xml validation issues until we got it right!
d. Next we need to import the function call. In the Datastore browser, double-click on the f(x) Functions node. This will open the Datastore explorer. Expand the tree until you see the XML node. Right-click the XML node and select Import. (note: this was another area where we had to work through some XML validation issues originally).
e. You should now see the function listed!
3. Creating a Dataflow that will consume the function (REST web service)
Next we will walk through a simple data flow where we consume a CSV file that contains our parameters for the service call (address and sensor), calls the service, and writes the the address along with geo coordinates out to another CSV.
CSV (address, sensor) –> REST service –> (address, latitude, longitude).
a. In designer we’ve created a test project with a batch job and a new dataflow.
b. Next we’ll create a test CSV file to use as the source of our parameters to the REST call. The file as two columns, address and sensor. A copy of this file is attached (googleresttest.txt).
address, sensor "1600 Amphitheatre Parkway, Mountain View, CA",false "3999 West Chester Pike, Newtown Square, PA",false
c. To use our file we’ll create a new flat file format. Here are the settings used to create it
d. Add the Flat file as a source to our data flow, and then add a query in which we will call the function module, and link the two together.
e. Inside the Query we add the function call by right clicking the target schema and choosing “new function call…”
f. Then we select our function
g. Next we see the parameter mapping screen and notice we are unable to map because of the nested structure that the function takes. We’ll take care of that in the next step, just hit finish for now.
You should see this in your target schema of the query.
h. Let’s go back and add an additional query step in between our flat file and our REST call. We will use this to structure, or nest, the data in the format needed by the function. We also need to add a vanilla Row Generator feeding into the nest query, without which we would end up with only one address getting geocoded.
i. In our nest query, right click on the target schema, and choose “New Output Schema…”, and give fill in a new schema name.
j. in the target schema, make sure the PARAMETERS sub-schema is select, then in the source schema, select the address and sensor fields, right click and choose map to output.
k. Next with PARAMETERS still selected ad the Schema Out, we need to hook in the row generator by checking the Row_Generation in the From tab, and unchecking our file format.
l. Now we go back to the REST function call query, where we can see our new incoming structure. Next right click on the function call and choose “Modify Function Call…”. Click on the new PARAMETERS sub-schema and drag it down to the URI_Parameters box. Then click finish.
m. Now let’s add a new Query after the REST function query to unnest the results. Map the GeocodeResonse to the output, then right click the QRY_UNEST and choose “Unnest with Sub-shemas”
n. Now we’ll add another query after the unnest, and map the columns we want to dump to a file. However after you map the columns, you’ll notice if we run a validation that it comes up with an error about each column. I’m not sure if this is a bug in how DS maps unnested columns, but you can correct by removing the original subschema names from the new mapping. The other thing we need to do is mark the select “distinct rows” option because of the format of the response, the unnesting can result in multiple rows per address.
o. Lastly we can take the result and dump it out to a file, by generating a file format from our final query, and making it a target.
p. Now if we execute our test, the resulting file contains:
formatted_address,lat,lng "1600 Amphitheatre Pkwy, Mountain View, CA 94043, USA",37.4224607000,-122.0842702000 "3999 West Chester Pike, Newtown Square, PA 19073, USA",39.9880211000,-75.4153618000
We hope this example proves useful. Again, the Google geocode API has many variations that are not all accounted for in this example, so if you try other addresses, don’t be surprised if it breaks due to the different elements in the response. Also, there are probably other better ways to mass geocode addresses, again we are just using it as an easy example. Thanks!