Load balancing Data Services 4.0 SP2 loading to SAP BW
Lately I’ve had four different customers loading to BW using Data Services. All of them use DS 4.0 SP2, and all of them use an application server group to balance the load among multiple BW app servers using a message server. I’d like to clarify how these customers achieved load balancing between these two systems.
[Note: These instructions are specific to Data Services 4.0 SP2 and SP3. They are not valid for DS 4.0 SP1 or below, nor are they valid for DS 4.1.]
[Note 2: If you landed here and want to learn about using a message server for load balancing between Data Services and SAP ECC or CRM, check out SAP Note 1469421 and Section 3.7.1.2 in the SAP Supplement.]
Problem: DS only connects to one BW app server
Each of my customers installed Data Services to talk to BW using only a specific application server. This is okay for a development or QA environment where you don’t necessarily care whether DS is overloading a single application server in your group.
Their installations looked very much like the following diagram:
Here’s a brief overview:
- There are 2 machines running DS. Each machine runs a job server, and they are configured in a DS server group for batch job load balancing. Each DS machine also has a Server Intelligence Agent (SIA) installed with either BI 4.0 or IPS 4.0. Each SIA runs a CMS. The DS Machine also runs the EIM Adaptive Processing Server (APS) in the SIA node.
- There are 4 machines running BW application servers.
- A DS administrator used the Management Console to configure an RFC Server. For instructions on how to do this, check out Section 3.9 in the Management Console Guide and Sections 3.5.12.5-6 in the SAP Supplement. You should also read the detailed Data Services to BW tutorial by Pierre Erasmus.
Each of these customers launches DS jobs using Process Chains and InfoPackages in BW. The process flows like this:
- BW runs the InfoPackage and gets to the DTP that integrates data from Data Services into the BW DataSource.
- BW connects to the RFC Server hosted by the EIM APS. It runs a remote function call on the RFC Server. This function launches a batch job.
- The batch job is configured to run against a Server Group. At run-time, the batch job will be launched on the job server that has lower CPU and memory utilization.
- The batch job starts on one of the DS job servers. The engine reads the job definition and the required datastores from the DS local repository. The BW datastore contains connection information for a specific application server.
- The engine uses the application server connection information it found in the datastore to connect to a specific BW application server. This is an RFC client connection that uses the RFC library in $LINK_DIR/bin. (No, this connection does not use the RFC Server.)
- After a connection is established between the engine’s RFC client and the BW application server, the engine loads data packages to BW using BAPI function calls. (See why this doesn’t use the DS RFC Server?)
- After loading is complete, the batch job notifies the RFC Server. The RFC Server tells the BW application server that it is complete.
- The InfoPackage continues with the next steps to integrate the data.
In the configuration above, the RFC Server is only connected to one of the BW app servers (PBWMachine2). When an InfoPackage runs, it will always use the RFC Server connected to PBWMachine2. If PBWMachine2 is down or network connectivity is interrupted between PDSMachine1 and PBWMachine2, the InfoPackage will fail because it cannot establish a connection to the RFC Server.
Similarly, only one application server is specified in the datastore. Before the load, the datastore tells the engine to only connect to one specific application server (PBWMachine2). All RFC client connections and BAPI calls are directed only to this server. Compared with having only one RFC Server connection, this situation is actually a much more significant problem since the BAPI loading calls use a significantly higher amount of computing resources.
Solution 1: Use sapnwrfc.ini to specify a message server
The first thing to do is let the engine know about the message server, thus allowing RFC Client connections to load balance among all BW application servers in the group.
There are no built-in datastore settings in DS 4.0 SP2 to tell the engine to connect to a message server. Why? Because there is already a way to do that. You just have to tell the RFC library to use a sapnwrfc.ini file. The RFC library reads the sapnwrfc.ini file and finds which message server should be used. Instead of using the application server hostname specified in the datastore, the RFC library connects to the message server and asks it for a suitable application server from the group. The engine then establishes and maintains a direct connection to that application server.
This configuration looks like this:
The only things that have changed:
- A message server is running in the BW landscape, and a group named PUBLIC points to the application servers 1, 2, 3, and 4.
- The BW Target datastore was changed to set “Use SAPNWRFC.ini = YES” and to set the Destination Name = RFCDestinationA. (This destination name is found inside of the SAPNWRFC.ini file… that is why the Destination Name parameter is required when using SAPNWRFC.ini=YES.)
- A file named “sapnwrfc.ini” was created in $LINK_DIR/bin on each of the Data Services job server machines. This file must be identical between all DS job server machines. Instead of putting it in $LINK_DIR/bin, you may put it in any path and specify this path by environment variable RFC_INI.
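Because the file has to be identical on every job server machine, it can help to compare checksums of the copies. The following is a minimal sketch, assuming the copies are readable from one machine (for example through mounted shares). The function names are hypothetical and are not part of Data Services.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of one sapnwrfc.ini copy."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def find_mismatches(paths):
    """Compare every copy with the first one and return the paths that differ."""
    paths = list(paths)
    if not paths:
        return []
    reference = file_digest(paths[0])
    return [p for p in paths[1:] if file_digest(p) != reference]
```

An empty result from find_mismatches means that all copies match the first path in the list.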
For the purposes of loading data to BW using a message server, your sapnwrfc.ini file must contain the following syntax (see also: Netweaver reference for sapnwrfc.ini):
DEST=<destination ID>
R3NAME=<SID>
MSHOST=<hostname of message server>
PROGRAM_ID=<program ID — usually the same as the destination ID>
MSSERV=<optional – specifies a non-standard service name or port number>
GROUP=<optional – defaults to PUBLIC; otherwise, provide the group name of the application servers>
In this case, the file might look like this:
DEST=RFCDestinationA
R3NAME=PBW
MSHOST=PBWMS
PROGRAM_ID=RFCDestinationA
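A quick sanity check of the file can catch a missing key before a job fails at run time. The sketch below parses the KEY=VALUE lines into one dictionary per DEST block and reports which of the keys listed above as non-optional are missing. It assumes that a new DEST line starts a new destination and that lines starting with "#" are comments. The helper names are hypothetical.

```python
# Keys that the syntax listing above does not mark as optional.
REQUIRED_KEYS = ("DEST", "R3NAME", "MSHOST", "PROGRAM_ID")


def parse_destinations(text):
    """Split sapnwrfc.ini text into one dict per DEST= block."""
    destinations = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: "#" marks a comment
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip anything that is not KEY=VALUE
        key, value = key.strip().upper(), value.strip()
        if key == "DEST":
            current = {}
            destinations.append(current)
        if current is not None:
            current[key] = value
    return destinations


def missing_keys(destination):
    """Return the required keys that are absent or empty in one destination."""
    return [k for k in REQUIRED_KEYS if not destination.get(k)]


SAMPLE = """\
DEST=RFCDestinationA
R3NAME=PBW
MSHOST=PBWMS
PROGRAM_ID=RFCDestinationA
"""

if __name__ == "__main__":
    for dest in parse_destinations(SAMPLE):
        print(dest["DEST"], "missing:", missing_keys(dest) or "nothing")
```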
When a BW load occurs, the process flow now looks like this:
- [Unchanged] BW runs the InfoPackage and gets to the DTP that integrates data from Data Services into the BW DataSource.
- [Unchanged] BW connects to the RFC Server hosted by the EIM APS. It runs a remote function call on the RFC Server. This function launches a batch job.
- [Unchanged] The batch job is configured to run against a Server Group. At run-time, the batch job will be launched on the job server that has lower CPU and memory utilization.
- [Different!] The batch job starts on one of the DS job servers. The engine reads the job definition and the required datastores from the DS local repository. The BW datastore has sapnwrfc.ini enabled and specifies a destination name. The application server name is ignored.
- [Different!] The engine looks up the destination name in $LINK_DIR/bin/sapnwrfc.ini and finds the corresponding message server connection information. The engine contacts the message server and finds the connection information for an available application server. The application server connection is kept for the remainder of the load; the message server is no longer used.
- [Unchanged] After a connection is established between the engine’s RFC client and the BW application server, the engine loads data packages to BW using BAPI function calls.
- [Unchanged] After loading is complete, the batch job notifies the RFC Server. The RFC Server tells the BW application server that it is complete.
- [Unchanged] The InfoPackage continues with the next steps to integrate the data.
The load is now more balanced because the application server identified in step (5.) is used for the intensive computing in step (6.).
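The lookup-then-pin behavior in the flow above can be illustrated with a toy model. This is only a sketch: the real message server uses its own metrics to rank application servers, which this post does not describe, so a simple per-server connection count stands in for them here. The class names and the host names other than PBWMachine2 are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class MessageServer:
    """Toy message server: maps a group name to {app server: current load}."""
    groups: dict

    def pick(self, group="PUBLIC"):
        # Assumption: the least-loaded server wins; the real ranking differs.
        servers = self.groups[group]
        return min(servers, key=servers.get)


@dataclass
class EngineConnection:
    """Toy RFC client connection, pinned to one application server."""
    app_server: str
    packages_sent: int = 0

    def load_package(self):
        # Every data package goes to the pinned server, not the message server.
        self.packages_sent += 1


def open_connection(message_server, group="PUBLIC"):
    """Ask the message server once, then connect directly to its choice."""
    host = message_server.pick(group)
    message_server.groups[group][host] += 1
    return EngineConnection(host)


ms = MessageServer({"PUBLIC": {
    "PBWMachine1": 3, "PBWMachine2": 5, "PBWMachine3": 1, "PBWMachine4": 2,
}})
conn = open_connection(ms)
for _ in range(10):
    conn.load_package()
```

In this model the message server is consulted once per job, and all ten packages go to the server it returned, which mirrors how the engine keeps its connection for the rest of the load.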
Unfortunately, there is still a single point of failure. If connectivity between PBWMachine2 and the RFC Server machine is interrupted, DS jobs cannot be launched and loads cannot complete successfully.
Solution 2: Add more RFC Servers
It would be nice to have the RFC Server connected to a message server, right? Unfortunately, you cannot do that with the current Netweaver RFC architecture. The DS RFC Server runs as a “registered server program”, which means that it must be associated with a specific application server. So you can’t just go into DS Management Console and hack the RFC Server config to point to a message server, nor can you set an RFC Server to use sapnwrfc.ini.
You can, however, create multiple RFC Servers, each of which connects to a different BW application server using a single program ID. This was not possible in DS 4.0 SP1 and below, due to a specific bug (see SAP Note 1747494). In DS 4.0 SP2 and SP3, this bug is fixed, so you can now have multiple RFC Servers, each one talking to a different app server using the same program ID.
In the following figure, the same landscape from the SAPNWRFC.ini example now has multiple RFC Servers configured:
This configuration allows an InfoPackage to choose any RFC Server running on the same Program ID. The process flow now looks like this:
- [Unchanged] BW runs the InfoPackage and gets to the DTP that integrates data from Data Services into the BW DataSource.
- [Different!] BW finds that there are multiple RFC Servers available on the Program ID. It picks a specific RFC Server and runs a remote function call on that specific RFC Server. This function launches a batch job.
- [Unchanged] The batch job is configured to run against a Server Group. At run-time, the batch job will be launched on the job server that has lower CPU and memory utilization.
- [Unchanged] The batch job starts on one of the DS job servers. The engine reads the job definition and the required datastores from the DS local repository. The BW datastore has sapnwrfc.ini enabled and specifies a destination name. The application server name is ignored.
- [Unchanged] The engine looks up the destination name in $LINK_DIR/bin/sapnwrfc.ini and finds the corresponding message server connection information. The engine contacts the message server and finds the connection information for an available application server. The application server connection is kept for the remainder of the load; the message server is no longer used.
- [Unchanged] After a connection is established between the engine’s RFC client and the BW application server, the engine loads data packages to BW using BAPI function calls.
- [Unchanged] After loading is complete, the batch job notifies the RFC Server. The RFC Server tells the BW application server that it is complete. (Note — this is the same RFC Server that was chosen in step (2.) above.)
- [Unchanged] The InfoPackage continues with the next steps to integrate the data.
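The redundancy gained from several RFC Servers on one program ID can be sketched as a small failover model. The post does not say how BW chooses among the registered servers, so this sketch simply takes the first reachable one. The program ID value and the function name are hypothetical.

```python
def pick_rfc_server(registrations, program_id, unreachable=()):
    """Return the first reachable (program_id, app_server) registration.

    registrations: list of (program_id, app_server) tuples, one per RFC Server.
    unreachable:   app servers that are down or cut off from the DS machine.
    """
    for registered_id, app_server in registrations:
        if registered_id == program_id and app_server not in unreachable:
            return registered_id, app_server
    raise ConnectionError(f"no RFC Server reachable for program ID {program_id}")


PROGRAM_ID = "DS_PROGRAM_A"  # hypothetical program ID

# Solution 1 landscape: one RFC Server, registered on PBWMachine2 only.
single = [(PROGRAM_ID, "PBWMachine2")]

# Solution 2 landscape: one RFC Server per BW application server.
multiple = [(PROGRAM_ID, f"PBWMachine{n}") for n in (1, 2, 3, 4)]
```

With PBWMachine2 marked as unreachable, the lookup against `single` raises an error, while the lookup against `multiple` still returns a registration on another application server.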
After these two changes, my customers can now trust that the data loading between Data Services and BW is more reliable and scalable. There are slightly more things to maintain in Management Console (more RFC Servers), on the job servers (need to sync the sapnwrfc.ini files), and in the datastore (activate the SAPNWRFC.ini flag and maintain the destination name). In exchange, you are rewarded with a better, more scalable solution.
Let me know if these instructions work for you. If you see weird errors, search the KBase or file an SAP Support Message to component EIM-DS.
Good job thank you!
We have two BI 4.0 / BO DS instances (running on the same CMS) and two BW application servers. But I see both of my RFC Servers on both BO instances. Is this normal? In the past we had some problems when we had two RFC Servers configured.
How can we configure only one RFC Server for each BOXI instance?
Hi Steven,
Not sure if I understand your landscape. You seem to say that you have already added multiple EIM APS services to your landscape (one on each SIA instance) and both of these APS's run an RFC Server. If you want, you can just go into the CMC under "Servers", and then remove the "RFC Server" from one of the EIM APS services.
However, this should not be necessary. See the section above about adding multiple RFC Servers that connect to multiple BW application servers.
Thanks,
~Scott
Hi Scott,
In your last slide (Slide6.jpg) it seems you have different RFC Servers on PDSMachine1/2. In my environment I see the same two RFC Servers on both DS service machines.
More details about our landscape:
Server1: CMS & Server2: CMS (both in the same cluster)
Server3: BODS & Server4: BODS (connected to same repository and connected to same CMS as above)
Depends on what you are asking.
If you only want the RFC servers running on one of your BOE processing tier machines, go to the CMC under Servers. You should probably see two EIM APS servers, one on each of your BOE nodes. Edit one of the EIM APS services and remove the RFC Server service from it. Restart that APS. Now the RFC Server service is only running on the other EIM APS.
If you have multiple RFC server instances pointing to multiple BW app servers, you must have configured them in Management Console -> Administrator -> SAP Connections -> RFC Servers. Remove the one that you don't want.
Still not sure why you want to do this, though. My framework above explains why you benefit from the redundancy of having multiple RFC Servers across multiple DS machines.
Hi Scott,
Really helpful blog post. Is the load balancing setup going to be the same for extracting from SAP BW?
We have load balancing working from BODS to SAP BW (clustered environment, multiple application servers), but we have issues in the open hub when sending data back to BODS.
In BODS we have an RFC Server group connection to each of the 4 app servers, and we have configured the sapnwrfc.ini file, which load balances successfully back to SAP BW.
When sending the data from the open hub to BODS, the user status of the request is set to R:
Error 899 (RSBO):(OHSP)Data services RFC Server DS_SERVER@appservername set the user status to R.
Is this because BODS has agreed to send the request to SAP BW over a specific server group RFC connection, but the sapnwrfc.ini file then says send it to the message server which load balances the requests in SAP, and the data comes back from SAP BW over a different RFC connection which BODS had previously agreed to use?
I can't seem to find any information about this? Have you come across this?
Many Thanks
Dan
We are also facing the same problem.
How does the registered program id on BW side look? It uses just one app server and gateway, correct?
I am using BW as source (open hub). As soon as the load balancer switches to app2, my job fails.
Tried with 2 RFC Servers as well, but no luck.
e.g BW register program A has app1 and gw00
BODS rfc server 1 has program A app1 and gw00
now when it switches to app2 we get red status.
Please let us know what the problem could be
OpenHub is a special case -- no load balancing is possible. Only one RFC Server can be used.
This limit does not apply to loading data into a BW target.
Hi Scott,
According to SAP Note
1747494 - Data Services and load balanced ECC and BW environment,
when using Data Services 4.2 SP1 or later, is there no longer any need to use the sapnwrfc.ini file?
Please advise.
Thanks
Derek