(Ab)Use Microsoft SSIS 2008 For SAP ETL Operations
Suppose You have a need for loading data into arbitrary tables in SAP (OLTP, not OLAP). Just for clarification: in this scenario it is not about using official external business interfaces (BAPI / Enterprise Services) and loading entire documents. It is about accessing tables directly for whatever need. For obvious reasons (like lack of all the business semantics, data validation and so on) it is not the most popular business scenario around but, nonetheless, worth thinking of.
We are going to process some data from external sources, possibly corrupt, possibly inconsistent, so we will be in dare need of a nice and shiny platfrom for data integration. Optimally a flexible and inexpensive one. How about Microsoft SSIS from MS Business Intelligence portfolio? It is a fairly justified assumption that we will be able to connect it to most thinkable data sources and integrate acquired data into something reasonable quite easily. But how about native SSIS to SAP connectivity?
…Not for the faint hearted
Strange enough, the way between Microsoft and SAP looks much more like virgin ground, really. There is a .NET Connector 2.0.1 available from SAP but it is built against .NET 2.0 and suited for Visual Studio 2003. If You try to obtain this version of VS .NET IDE, Your chances look rather bleak. So much for SAP adoption of .NET. There is also a SAP NetWeaver RFC library offering C/C++ interface but the API offered looks rather lowish. This is a refurbished version of the old good SAP RFC library with no backward compatibility. Not exactly a 10 minute job should You turn to be desperate enough to program against it.
On the other side of the fence there is a Microsoft BizTalk Adapter 3.0 for mySAP Business Suite, included in BizTalk Adapter Pack 2.0. That’s right, BizTalk. But it does include an ADO.NET Data Provider for mySAP, usable in any ADO.NET capable context, including SSIS. As it seems, Microsoft has no ambition to follow SAP changes as regards product naming convetions (mySAP has been a defunct name quite for a while now). The ADO.NET Data Provider for mySAP supports simple SELECT statements and can execute some RFC-enabled function modules. If You try to imagine a more limited set of functionalities, you will have a tough job. The merry news amidst darkness is, there is a version for MS SQL Server 2005 included in the license already. This preliminary version can be downloaded as part of the MS SQL Server 2005 Feature Pack (while the productive version that comes with MS BizTalk Adapter 3.0 for mySAP is priced separately).
How about INSERT support in ADO.NET?
In brief: There is nothing like this in ADO.NET Data Provider for MySAP. Obviously, You could think of programming Your own, generic RFC-enabled function module in SAP that would do that internally and then call it from the ADO.NET using EXEC syntax. But if You need to program something on the SAP side, in SE80, with all the TMS burden, entire solution loses its elegance. And we WANT the solution to be elegant, correct? After all, elegance may be one of very few universal criteria for judging on solution accuracy.
There is, however, an RFC-enabled function module RFC_ABAP_INSTALL_AND_RUN, which enables to compile and execute arbitrary code in SAP WAS remotely. Obviously You still require proper authorisations to do that (like authorisation to run SE38 and development rights) but an RFC-enabled SAP user with some rights is required for the story nonetheless. And, at least, no ABAP developer assistance is needed as the coding is sent to the server, executed and deleted straight away. There is a catch though…
How do I access my data?
Theoretically, all the data to be inserted into SAP tables could be sent over RFC together with the report source code, as a sort of all-in-one package. But do we really want to do this with large amount of data, is it not inefficient? Maybe there is a more performant, more elegant way to do this – like putting the data directly on the SAP WAS in a file, accessible from our generated ABAP report afterwards?
In the end of the day, our scenario looks as follows:
- Process source data in SSIS
- Upload data (e.g. as a flat file) to a location accessible from SAP WAS
- Upload generic ABAP report to SAP WAS using RFC and run it
Now it is time to get the hands dirty.
SSIS to go, please
Process source data in SSIS
In a real world scenario, the data would probably undergo some processing (replace / cleansing / validations and so on) first. To cut it short, we are going to use nice and shiny data straight from a MS SQL DB. The structure of the table should correspond with the structure of the target table in SAP, obviously, as we do not wish any more processing in SAP, just pure INSERTs.
Suppose we are going to load our data in E070. Well, not really, maybe ZE070 for a start, being a 1:1 copy of the E070. An MS SQL pendant for ZE070 would look like this:
Now, in MS SSIS 2008 we are going to access that data and put it on a SAP WAS-accessible file share. A simple Data Flow for that job might look like this:
I am purposefully omitting all the boring technical details in order not to disturb the storyline.
The interesting part is the Control Flow which should accomplish the following:
- Set up environent variables like file name, network share path (from the SAP WAS perspective), data separator not to forget SAP table name to insert.
- Run our Data Flow Task
- Send ABAP code over RFC and execute it. Gathering some messages or at least return codes afterwards would not do harm either.
A control flow to achieve all this would look like the following:
Setting up environment variables for ABAP code might look like this:
Then comes our Data Flow Task as depicted above.
So far we have been doing nothing really exciting but it is about to change now. The third step should be, in fact, an Execute SQL Task component, issuing nicely parametrised call to function module RFC_ABAP_INSTALL_AND_RUN. Unfortunately, to the best of my knowledge, the ADO.NET Data Provider for mySAP cannot be used in this way. Namely, parametrisation of an EXEC query (execution of an RFC-enabled function module in SAP) requires parameters of type Microsoft.Data.SAPClient.SAPParameter. This type of object, however, cannot be used in the Execute SQL Task component. There is a fairly remote possibility that I have overlooked some strikingly simple solution, but after an hour or so of playing with this I ended up with very unsatisfactory results. Entire documentation available on the subject focuses on the usage of the ADO.NET Data Provider for mySAP in the context of an ADO.NET Source component (within Data Flow Task) which is configured very differently so my expectations may reach too far (again). So, are we going to create another Data Flow Task and abuse ADO.NET Source component for such a plain purpose as issuing a relatively simple RFC call?
…And lived happily ever after
Yes, this story has a happy end. We are going to call the RFC programatically in a Script Task component, in a compact and elegant way. Below You will find the most important bits of code. Processing logic comprises the following steps:
- Acquiring SAPConnection object from package context
- Composing query parameters
- Firing the query
- Evaluation of the results
Below You will find a sample of the Script Task component code, which shows how the ABAP source is put together.
ABAP report which is going to insert the data is fully generic. All the specific bits are parametrised on the SSIS side (like file name with data to be imported, WAS directory and so on). In other words, the same Script Task component with no changes whatsoever can be used for loading data to any SAP table, thus effectively mimicking the missing ADO.NET INSERT.
It is worth noting that using ABAP file interface offers one specific, significant advantage out of the box. There is not need to load entire file in an internal table as the ABAP command READ DATASET loads exactly one row at a time (provided that the dataset has been opened IN TEXT MODE). Hence, a file of virtually any length can be processed with very little memory consumption. A nice feature, by all means.
Superfluous as it may seem, just to be on the safe side: this blog expresses solely my personal opinions on interoperability between some of SAP and Microsoft products. By no means does it reflect any official point of view of Capgemini Deutschland GmbH or any of its affiliates.