
Leveraging Load Balancers with SAP IQ

OVERVIEW

There are significant challenges to load balancing DBMS systems.  First and foremost, a DBMS node must have access to all of the data for a particular table, since any node may be asked to read or update that data directly.  DBMSs also require full ACID compliance and transactional integrity.  As data is modified, the DBMS must guarantee that all of the changes relevant to a transaction complete, or none of them do.  When load balancing across a DBMS cluster, then, you must guarantee that all database nodes have access to all data for all tables.

 

SAP IQ was designed so that these traditional limitations are not a factor for proper operations and for data integrity.

 

SAP IQ supports a cluster implementation called a multiplex.  In an SAP IQ multiplex, CPU and memory operate in a shared nothing mode.  The main storage, user data, is shared across all nodes.  This is a blend between a shared nothing and shared everything cluster deployment.

 

When a change is made to data, SAP IQ guarantees that the data is immediately visible to all nodes upon a successful completion (commit) of the transaction.  In the SAP IQ shared architecture the data is exposed to all nodes simultaneously.

 

When clusters are deployed, a server that encounters problems goes offline and is unavailable.  The SAP IQ architecture, though, guarantees that the data is always visible on all nodes so that the loss is only in compute power, not in data availability.

 

Applications that were connected and running on the affected node will immediately fail.  They can be designed with logic in place to handle such an outage.  As the number of applications and 3rd party tools increases, though, this can require a significant amount of application rework, and some systems (3rd party tools) cannot be changed in a timely manner for the business.

 

Enter the TCP/IP based load balancer.

 

A load balancer is a hardware or software device that handles and routes network traffic between clients (database application) and servers (database servers or nodes).  This device presents a “virtual server” address to the outside world.  When users attempt to connect, the connection is then forwarded to the most appropriate real server.  The architecture and layout of the “servers” is shielded from the client, allowing for agility and greater flexibility to change the server processing architecture without having to change the application.

 

At a high level, the process that an application goes through to connect to SAP IQ would be:

  • The client application(s) has access only to the virtual IP or virtual hostname of the load balancer
  • The client sends a request to SAP IQ through the load balancer
  • The load balancer forwards the request to one of the SAP IQ nodes, based on the configuration of the load balancer
  • The request is then received by SAP IQ and processed. SQL statements, for that connection, are sent to the same SAP IQ node for processing.  The connection remains in effect until the client application closes the connection.
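The steps above can be sketched as a toy simulation.  The node names and the round-robin selection here are illustrative assumptions, not SAP IQ or load balancer APIs; a real load balancer does this at the TCP layer:

```python
from itertools import cycle

class VirtualServer:
    """Toy load balancer: each new client connection is forwarded to
    the next node in rotation, and all statements on that connection
    then go to the same node until the client disconnects."""

    def __init__(self, nodes):
        self._rotation = cycle(nodes)

    def connect(self):
        # The client only ever sees the virtual address; the chosen
        # backend node is hidden behind the load balancer.
        return next(self._rotation)

lb = VirtualServer(["iq_node1", "iq_node2", "iq_node3"])
conn_a = lb.connect()  # forwarded to iq_node1
conn_b = lb.connect()  # forwarded to iq_node2
```

The key point the sketch illustrates is stickiness: the node is chosen once, at connect time, and every SQL statement on that connection stays on it.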

 

This process is no different than if the application were to connect directly to an SAP IQ node.  The load balancer is there to interact at a network layer, thus avoiding the need for application changes.  Any TCP/IP based application can reap the benefits of the load balancer.

With an IP load balancer standing between the client application(s) and the databases, acting as a full TCP proxy with knowledge of the state of servers, connections can be reset immediately upon attempting to communicate with a downed server. This can happen when a server goes down in the middle of a communications stream. The load balancer marks the database as down, and when the next request comes from the client, the load balancer resets the connection, forcing the application to a different node upon return.

 

When DBAs and system administrators need to perform maintenance, connections can be gracefully bled off of a single node until there are no more active connections. There is no need to perform a wholesale kill of all connections to a node, but rather mark the node as active but not accepting new database connections.  This allows the existing connections to complete in-flight work and to migrate, over time, to other nodes.   Once the database or system maintenance activities are complete, the administrator can then return the server to the active pool so that the load balancer can resume sending connections to the server.
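The drain-and-restore cycle just described can be sketched as follows.  The node names, connection counters, and least-connections choice are illustrative assumptions, not an SAP IQ or load balancer API:

```python
class NodePool:
    """Toy sketch of graceful draining: a draining node keeps its
    existing connections but receives no new ones."""

    def __init__(self, nodes):
        self.active = set(nodes)              # accepting new connections
        self.draining = set()                 # existing connections only
        self.connections = {n: 0 for n in nodes}

    def drain(self, node):
        # Mark the node active-but-not-accepting; in-flight work continues.
        self.active.discard(node)
        self.draining.add(node)

    def restore(self, node):
        # Maintenance done: return the node to the active pool.
        self.draining.discard(node)
        self.active.add(node)

    def route_new_connection(self):
        # Send new work only to active nodes (fewest connections first).
        node = min(self.active, key=lambda n: self.connections[n])
        self.connections[node] += 1
        return node

pool = NodePool(["iq_node1", "iq_node2"])
pool.drain("iq_node1")
assert pool.route_new_connection() == "iq_node2"
```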

 

For the purposes of this document, hardware and software load balancers will be the main focus.  There are alternatives that will be lightly discussed towards the end.

 

The distinction between “hardware” and “software” load balancers is becoming less and less meaningful.  A hardware load balancer is a piece of purpose-built network equipment with PC-class CPUs, network interfaces with packet processing capabilities, and some software to bind it all together.  A software load balancer is built on open, commodity CPUs and network interface cards (NICs); the software that binds it, however, is one that the enterprise can choose based on the capabilities needed to meet the business needs.  With a hardware load balancer, setup, configuration, and maintenance time are reduced because you are paying a premium for your network vendor to do this work.  A software load balancer is owned and maintained by you, but tailored to your specific needs.

 

Both work equally well with SAP IQ.

 

For a list of the most commonly used software based load balancers, follow this link: https://www.inlab.de/articles/free-and-open-source-load-balancing-software-and-projects.html

LOAD BALANCING WRITES

Typically, data loading is isolated from queries so that loads do not degrade query performance and queries do not degrade load performance.  Data manipulation and loading should not go through a load balancer as we will lose the fine grained control of data loading and specific node placement of the load workload.

 

All data manipulation jobs (load, delete, update, ETL, ELT, etc.) should interact directly with the node, or nodes, that they require.  In order to meet performance, latency, and SLA goals, these jobs must interact directly with IQ.  Lastly, some load options require specialized syntax that can only be run on certain node types.  This is not possible with a load balancer acting as an abstraction layer between the load client and the SAP IQ node.

LOAD BALANCING QUERIES AND READ ACTIVITY

A load balancer is, generally, used with SAP IQ to isolate query workload or to distribute the query workload across multiple SAP IQ servers.  SAP IQ has features that can perform aspects of workload distribution.  They do not, however, fit every situation.

 

SAP IQ uses a feature called logical servers.  Logical servers provide resource provisioning for the SAP IQ multiplex by grouping a subset of computing resources (multiplex nodes) as a logical entity.  When you connect to a logical server and run a query, SAP IQ attempts to distribute the query execution to those nodes that are members of the logical server.  Not all queries can be run in a distributed fashion; such queries are executed entirely on a single machine.  Logical servers do provide a way to move connections to a lesser-used machine.  If users connect to the same node, all connections that can run on that node continue to run on that node.  It is only when the node is saturated that a connection is moved to other nodes in the logical server.

 

The net of this architecture is that there is no built-in load balancer, round robin or otherwise, that will attempt to evenly distribute the workload of multiple connections across different compute nodes.

 

A load balancer adds such a capability to SAP IQ.

Round Robin (simple or next in loop) Load Balancing

At their most basic, load balancers can simply apply round robin logic to distribute the workload across the SAP IQ servers.

 

Take the simple diagram below.  When user 1 connects to SAP IQ, it will be directed to node 1.  User 2 will be routed to node 2.  User 3 will be routed to node 3 and so on.

This type of distribution works for most use cases.  All query activity is directed to the virtual IP/hostname of the load balancer while all write activity is direct to each of the dedicated write nodes.
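For concreteness, this is roughly what such a setup might look like in a software load balancer such as HAProxy.  This is a minimal sketch, not a recommended configuration: the addresses and node names are hypothetical, and HAProxy is just one of many software load balancers (port 2638 is the default IQ client port, as in the dbisql example in this document).

```
# Hypothetical HAProxy fragment: one virtual address fronting three
# IQ reader nodes with simple round-robin distribution of connections.
frontend iq_virtual
    bind *:2638
    mode tcp
    default_backend iq_readers

backend iq_readers
    mode tcp
    balance roundrobin
    server node1 10.0.0.11:2638 check
    server node2 10.0.0.12:2638 check
    server node3 10.0.0.13:2638 check
```

The `check` keyword enables health checks, which is what lets the load balancer mark a downed node and reset client connections as described earlier.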

 

A traditional round-robin, or next in loop, implementation has the downside that all users, applications, queries, and connections are treated equally so that it is possible for two heavy duty queries to run on the same node while other nodes are not being used.  Again, this is not much of an issue for most SAP IQ customers due to the optimizer and how resources are allocated.  In SAP IQ, a single query on a node can consume all resources should the optimizer deem it necessary.  When a second query attaches to the node to execute, the runtime engine quickly shifts resources between both queries so that both share the entire system with equal priority.  When a query finishes, those resources are then distributed across the remaining jobs.  The goal is to keep all CPUs active across as many or as few jobs as are in the system at any point in time and do it dynamically.

 

Those that need finer grained control should look to the next few alternatives to be discussed.

 

There are other algorithms that enhance a simple round robin scheme.  Check the specifications of your hardware or software based load balancer for the specific algorithms supported.

 

The algorithms that are most prevalent with SAP IQ include:

  • Weighted Round Robin — Some servers get a larger share of the overall traffic. Quite useful if your SAP IQ nodes have different CPU and memory specifications.
  • Least connections — The load balancer monitors the number of open connections for each node, and sends to the least busy node.
  • Least traffic — The load balancer monitors the bitrate from each server, and sends the next request to the node that has the least network traffic.
  • Least latency — A simple request broadcast to the backend servers with the new connection going to the first server to answer.
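As an illustration of the first algorithm, weighted round robin can be sketched by repeating heavier nodes in the rotation.  The node names and weights below are hypothetical; real load balancers implement this internally with their own configuration syntax:

```python
from itertools import cycle

def weighted_rotation(weights):
    """Toy weighted round robin: a node with weight 2 appears twice
    per cycle, so it receives twice the share of new connections."""
    expanded = [node for node, w in weights.items() for _ in range(w)]
    return cycle(expanded)

# A bigger IQ node (more CPU/RAM) gets twice the connections.
rotation = weighted_rotation({"big_node": 2, "small_node": 1})
picks = [next(rotation) for _ in range(6)]
# big_node receives two thirds of the new connections
```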

Application Dedicated Resources

In addition to round robin load balancing, customers have asked for an architecture that allows them to isolate the query workload based on application or functional community.  For instance, finance needs X dedicated resources while marketing needs Y.  They use the same data, but want isolated compute power and memory.

 

To accomplish this with a load balancer, we simply need to define multiple virtual IP/hosts on the load balancer.  For instance, finance_host and marketing_host.  Via load balancer features and commands, each of these virtual hosts would then be mapped to a distinctly different set of IQ nodes.  The commands and syntax vary, but every load balancer provides for this type of mapping.
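A toy sketch of this mapping is below.  The virtual hostnames and node names come from the example above; the dictionary-based routing is purely illustrative, since a real load balancer expresses this in its own configuration language:

```python
from itertools import cycle

# Each virtual hostname maps to its own dedicated pool of IQ nodes.
pools = {
    "finance_host":   cycle(["iq_node1", "iq_node2"]),
    "marketing_host": cycle(["iq_node3", "iq_node4"]),
}

def route(virtual_host):
    # A connection to a virtual host is only ever forwarded to
    # nodes in that host's dedicated pool.
    return next(pools[virtual_host])

assert route("finance_host") in ("iq_node1", "iq_node2")
assert route("marketing_host") in ("iq_node3", "iq_node4")
```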

 

In the following diagram, the finance users (red lines and boxes) connect to their dedicated finance_host virtual system.  These connections are then routed to the two IQ nodes that are dedicated to that organization.  The marketing users (blue lines and boxes) connect to their dedicated marketing_host virtual system.  These connections are then routed to the two IQ nodes that are dedicated to that organization.

 

This framework can be expanded to include as many discrete application functional areas as needed for the business.

 

This same architecture can be expanded further to handle the situation where groups need to share resources.  Take the following diagram that builds upon the one above.  Now we’ve added two more groups (bizdev and c-level). In the diagram below, both are denoted by the green lines and green box around the hosts.  These two apps are now mapped to a host from the Finance group and a host from the Marketing group.

 

The purpose is to show that any combination of client mapping to host mapping is possible and that SAP IQ node groupings can overlap one another.

Logical Servers and Load Balancing

As previously mentioned, SAP IQ supports the notion of logical servers that process query workload.  The logical servers can have dedicated hardware or share hardware with other logical servers.

 

When using a load balancer with SAP IQ, it is important to define a single logical server for query activity.  This logical server guarantees that queries only run on a subset of compute nodes, thus isolating queries from write activity.  As the resources for logical servers are dynamic, nodes can be added or removed at various points in time to increase or decrease the compute capacity for query jobs.  This logical server should be confined to just those nodes that will query data and exclude nodes that will be used to load or change data.  At a high level this isolates the workloads so that loads and queries do not vie for the same resources.

 

Often, the use of multiple logical servers with a load balancer can lead to an overly complex environment, with both the load balancer and IQ trying to direct workload to specific machines.  While it is not advised to use more than one query logical server, it is possible.

 

If we refer to the diagram previously used to outline client to load balancer pooled machines, we can illustrate the complexity of multiple logical servers.

Before, the colors (blue, red, and green) represented different resource pools within the load balancer.  Let us add in a layer that then says that these finance, marketing, and other resource pools are also then mapped to 3 different logical server pools.  The diagram does not change since the finance resource pool would be mapped to the finance logical server.  What we have enforced, at this point, is that no finance query will run on any host other than those in the finance logical server (red box).

 

This now allows us to control query placement both in the logical server and inside SAP IQ.  The complexity enters when the definition of “finance” changes in SAP IQ or the load balancer.  Imagine that the finance logical server was expanded to have a 3rd compute node (node 3).  The load balancer is not aware of this additional compute node and will only direct traffic to the two nodes that it knows about (node 1 and node 2).  Node 3 will not receive any direct work from the load balancer.

 

Leveraging both logical servers with load balancers adds to complexity as it forces a tight coupling between load balancer resource pools and SAP IQ logical servers.  This tight coupling levies the requirement that when a change is made to either the load balancer or SAP IQ, the other must be updated.  This is a manual process today and one that must be handled by the network and database administrators when changes are made.

SAP IQ Feature Distributed Query Processing and the Impact on Load Balancers

An SAP IQ feature that bears some explanation is Distributed Query Processing.  Distributed Query Processing (DQP) spreads query processing across multiple servers in a SAP IQ Multiplex.

 

When the SAP IQ query optimizer determines that a query might require more CPU resources than are available on a single node, it will attempt to break the query into parallel “fragments” that can be executed concurrently on other servers in the multiplex cluster. DQP is the process of dividing the query into multiple, independent pieces of work, distributing that work to other nodes in the grid, and collecting and organizing the intermediate results to generate the final result set for the query.

 

It is important to emphasize that if a query does not fully utilize the CPU resources on a single machine, then it will usually not be advantageous to distribute it.

 

DQP will break the query into fragments.  Those fragments are only run within the logical server that the user is assigned to through their login policy.  Should a node that is processing a fragment go offline, DQP handles this by resubmitting the query fragment to another, online node without disruption to the application.

 

 

This is important when it comes to an architecture with a load balancer.  If the user connects to the load balancer and is passed to a node not in their logical server, the connection will automatically be redirected to a random node within their logical server.  This adds time to the connection process.  It is also important to be aware of this feature as it drives the need for a much tighter coupling of the load balancer pools and the SAP IQ logical servers.  For optimal performance and configuration, the load balancer pools must map to the proper physical servers within the user’s logical server.

 

Should there be just a single logical server for all query access, this mapping is quite simple.  It increases in complexity as more and more logical servers are added to SAP IQ as care must be taken to properly map the load balancer pools to logical servers.

LOAD BALANCER ALTERNATIVES

Round Robin DNS

Round Robin DNS is a technique of load distribution, load balancing, or fault tolerance that provisions multiple, redundant Internet Protocol service hosts (e.g., Web servers, FTP servers) by managing the Domain Name System’s (DNS) responses to address requests from client computers according to an appropriate statistical model.

 

In its simplest implementation, Round-robin DNS works by responding to DNS requests not only with a single potential IP address, but with one out of a list of potential IP addresses corresponding to several servers that host identical services. The order in which IP addresses from the list are returned is the basis for the term round robin. With each DNS response, the IP address sequence in the list is permuted. Usually, basic IP clients attempt connections with the first address returned from a DNS query, so that on different connection attempts, clients would receive service from different providers, thus distributing the overall load among servers.
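The rotation just described can be sketched as follows.  The addresses are illustrative, and real DNS servers implement this permutation internally; the sketch only shows why naive clients that take the first answer end up spread across servers:

```python
from collections import deque

class RoundRobinDNS:
    """Toy DNS responder: returns the full address list, rotated one
    position per query, so naive clients (which take the first entry)
    land on different servers over successive lookups."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # permute the list for the next query
        return answer

dns = RoundRobinDNS(["10.0.0.11", "10.0.0.12", "10.0.0.13"])
first_picks = [dns.resolve()[0] for _ in range(3)]
# each successive query steers a naive client to a different server
```

Note that this is exactly the behavior that client-side DNS caching defeats: a client that caches the first answer never sees the rotated lists.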

 

Customers have implemented SAP IQ with round robin DNS with success for query operations.  For the most basic of use cases (spreading query workload across multiple machines), round robin DNS works quite well and is easy to set up and use.  It is not widely used, though, due to the limitations and additional impact that it can have on the network infrastructure.

 

It is important to understand that round robin DNS only covers client/server connections that leverage DNS resolution for hostname lookups.  Should the client use an IP address, it would bypass all round robin logic and thus defeat the purpose.

 

Care should be taken with this approach as systems and applications can handle DNS hostname resolution in different ways.  Most DNS clients will cache the DNS resolution for a period of time so as to reduce the load on the DNS server infrastructure.

 

Java, for instance, will cache the DNS lookup information for the duration of the JVM or for a set period of time. This connection information is then reused for subsequent connections to that same host, thereby destroying any type of round robin connection distribution.

 

The resolution for Java is quite simple as outlined here: http://docs.oracle.com/javase/8/docs/technotes/guides/net/properties.html

networkaddress.cache.ttl

Specified in java.security to indicate the caching policy for successful name lookups from the name service. The value is specified as an integer to indicate the number of seconds to cache the successful lookup.

 

A value of -1 indicates “cache forever”. The default behavior is to cache forever when a security manager is installed, and to cache for an implementation specific period of time, when a security manager is not installed.

 

While we can resolve the issue with DNS caching by tweaking the cache options in Java (with similar features in ODBC), we will likely introduce another issue.  This is a case where the cure may be worse than the disease.  DNS caching is used to reduce the DNS server workload.  By disabling caching, we will now greatly increase the amount of work that the DNS server must do.  This will eventually force the need for an increase in the number of DNS servers.

 

Use this approach with caution and with the knowledge that it may have rippling effects within the network infrastructure.

SAP IQ SHARED NOTHING ARCHITECTURE

SAP IQ no longer supports a full shared nothing architecture.  This feature, when released, was not widely adopted by customers and was eventually removed from the product.

SAP IQ does support a shared nothing architecture with respect to CPU and memory.  The main store, user data, is shared across all nodes.  The shared storage infrastructure allows SAP IQ to overcome some of the issues with a shared nothing architecture, like CPU and RAM hotspots.  When just one node owns a subset of the data, all queries and loads must be sent to that node, thus creating hot spots and contention between queries and loads.  The SAP IQ architecture allows for compute and RAM to be isolated to read and write activities so that load performance won’t degrade as more users query the data.  Conversely, query performance won’t degrade when data loads run because they each have isolated CPU and memory.

SAP IQ LOGICAL SERVERS

SAP IQ has a concept of logical servers as a way to present a series of physical servers to the client without them having to understand the actual topology.

 

Logical servers let you to manage the use of multiplex resources more effectively. Use logical servers to assign different sets of physical multiplex servers to different applications, users, or groups to meet their individual performance requirements.

 

In a multiplex, each connection operates under a single logical server context. When you submit a query to a multiplex server, its execution may be distributed to one or more multiplex servers, depending upon the configuration of the connection’s logical server. To dynamically adjust the resources assigned to a logical server, add or remove multiplex servers from the logical server to meet the changing needs of the applications that it serves.

 

Logical servers can also be set up to distribute connections to other nodes in the server upon connect.  This should not be confused with load balancing, however.  The logical server construct will accept connections on a node in the server pool, and a connection stays on that node for its lifetime.  New connections are only sent to another server when the current server has no available connections remaining.  In essence, a node’s pool of connections must be exhausted before SAP IQ redirects new connections elsewhere.

 

Where load balancing attempts to evenly distribute workload across two or more nodes, logical servers will saturate one node, then move connections to a second node.
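That saturate-then-spill behavior can be contrasted with round robin in a toy sketch.  The fixed per-node connection capacity is an illustrative assumption; SAP IQ’s actual redirection depends on server configuration:

```python
def logical_server_assign(nodes, capacity, n_connections):
    """Toy contrast to round robin: fill (saturate) one node up to its
    connection capacity before spilling over to the next node, as the
    logical server redirection behaves."""
    placement = []
    for i in range(n_connections):
        # Integer division picks the first unfilled node in order.
        placement.append(nodes[min(i // capacity, len(nodes) - 1)])
    return placement

# 5 connections, capacity 3: node1 fills completely before node2 is used.
assert logical_server_assign(["n1", "n2"], 3, 5) == ["n1", "n1", "n1", "n2", "n2"]
```

A round-robin load balancer would instead alternate n1, n2, n1, n2, n1 for the same five connections.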

 

Logical servers do not provide a mechanism for load balancing and should not be considered for load balancing scenarios.

CRITICAL NOTES ABOUT CLIENT CONNECTIVITY

Anytime a load balancer (hardware, software, or DNS round-robin) is used, we must make sure to properly configure the SAP IQ client.  By default, the SAP IQ Java and ODBC drivers perform strong checking between the client and server.  The client must specify the SAP IQ server and database name as part of the connection string.  The server then checks the server name and database name to validate that what the client requested is actually what the server has under its control.  Should this security check fail, the connection will be denied.

 

When the client connects to SAP IQ via a load balancer, the end SAP IQ server is not known.  This is part of the obfuscation that the load balancer brings.  Consequently, all connections to SAP IQ will fail except the few that happen to be routed to the very server they specified.

 

The change is quite simple and can be applied in the Java or ODBC connection string.  The client drivers have parameters that can be passed to the server (username, password, etc.).  One of these optional parameters allows for disabling of the strong security check.  IQ sees that option and does not perform the server name and database name checks.  It defaults back to just checking the username and password (or certificate) for authentication.

 

To change this behavior, the connection string need only specify “VerifyServerName=OFF” or “VerifyServerName=NO” in the TCP/IP settings of the connection string (Verify is also acceptable as the shortened syntax for this option).  To illustrate using Interactive SQL (dbisql):

dbisql -c "links=tcpip{verify=no;host=localhost;port=2638};uid=dba;pwd=sql"

If the option is omitted, this type of error will be seen:

Could not connect to the database.
Connection error: TCPIP requires a server name
SQLCODE=-832, ODBC 3 State="08001"

 

Unfortunately, there is no server side setting to change this behavior.  This will have to be done on the client side application or server side application servers.

 

The VerifyServerName (VERIFY) Communication Parameter documentation states that “Normally, you should not set this option. It is used only for connecting to multiplex secondary servers to balance query loads among these servers.” and “When an IQ client connects to this DSN, the load balancer dispatches the connection to a particular secondary server based on the workload of the machine.”  Read this as: the option is needed if you wish to connect to SAP IQ nodes through some sort of intermediary, such as a load balancer.

 

Details can be found at http://help.sap.com in the current SAP IQ manuals.

Summary

SAP IQ can leverage a number of load balancers, from hardware to software.  The inherent design of SAP IQ allows it to be easily integrated into any load balancing scenario, whether hardware, software, or DNS round-robin.
