How to use NUMA featured hardware on Linux

Nowadays more and more NUMA-enabled hardware can be found at customer sites. If you are using Linux, this article gives you some hints on how to set up your system properly to get the best performance out of your hardware. Documentation about what NUMA means, what it is, and what its pros and cons are can be found in several locations on the internet. I assume you know that you have NUMA hardware and know what this means.

First of all I'd like to start with a short introduction of the theoretical system we have and how we get some information about it from within the Linux OS. After that I give some suggestions on how the SAP system can be configured for your needs. All these suggestions were derived from several SD benchmarks done in the SAP LinuxLab while testing the NUMA architecture.

Theoretical Hardware and setup

Our theoretical hardware could look like this. The machine has 4 system boards, which we from now on call nodes. Each node has two CPUs and two gigabytes of memory. The whole machine therefore has a total of 8 CPUs and 8 GB main memory. On this piece of hardware we run a big 2-tier system with MaxDB and an R/3 Enterprise 4.7 using the following profile:

rdisp/wp_no_dia = 15
rdisp/wp_no_btc = 3
rdisp/wp_no_vb  = 5
rdisp/wp_no_vb2 = 3
rdisp/wp_no_enq = 1
rdisp/wp_no_spo = 2

Of course the biggest problem is obvious. As we have one big system, all CPUs don't only use their local memory but also the memory available on the other nodes. This is *not* good. You probably already know that on a NUMA architecture remote memory accesses are considerably slower than local accesses. As an example for our theoretical hardware, we might have a local memory access time of 100ns and a remote memory access time of 600ns. Our goal should therefore be to use only local memory in order to reduce memory access times.
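If you want to see how "far away" the nodes are from each other, kernels with NUMA support export the firmware's relative distance table through sysfs; a value of 10 means local, larger values mean more remote. The numbers below are just an illustration for our theoretical machine:

root@linux:> cat /sys/devices/system/node/node0/distance
10 20 20 20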

Do I have NUMA hardware?

An easy way to find out is to execute the following command and read the output carefully. This would be the output of our theoretical machine:

root@linux:> numactl --hardware
available: 4 nodes (0-3)
node 0 size: 2048 MB
node 0 free: 1454 MB
node 1 size: 2044 MB
node 1 free: 1932 MB
node 2 size: 2036 MB
node 2 free: 2024 MB
node 3 size: 2044 MB
node 3 free: 2036 MB

We now clearly see that we have a machine with 4 nodes and 2GB memory on each node. If your output doesn't look similar, you probably don't have a NUMA machine. Now that we know the number of available nodes and the amount of memory, it is time to set up the system in a way that suits the NUMA architecture.
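Besides numactl, the numactl package usually also ships a small tool called numastat, which shows per-node allocation statistics. It is a handy way to check later on whether your processes really allocate their memory locally (numa_hit) or had to fall back to other nodes (numa_miss). The numbers below are placeholders only:

root@linux:> numastat
                    node0      node1      node2      node3
numa_hit           104312      98211     101876      99954
numa_miss               0          0          0          0
numa_foreign            0          0          0          0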

How to configure the SAP System

To apply the rule of memory localization, each of the four nodes of our machine has to do work that is completely separate from the work of the other nodes. That's why we put the database on one node, the central instance on another node, and two more dialog instances on the remaining nodes.
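For our theoretical machine this results in the following layout; the reasoning for each node follows below:

node 0: central instance (DVEBMGSxx), shared with the operating system
node 1: dialog instance (Dxx)
node 2: dialog instance (Dxx)
node 3: MaxDB database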

Node 0:

Please take into account that the operating system itself is using memory on node 0. We'd better put the database on node 3, where we have enough free space. The central instance will be placed on node 0, but it is up to you on which node you put the different instances. I would set up the central instance like this:

rdisp/wp_no_dia = 3
rdisp/wp_no_btc = 3
rdisp/wp_no_vb  = 1
rdisp/wp_no_vb2 = 1
rdisp/wp_no_enq = 1
rdisp/wp_no_spo = 2

Please keep in mind to adjust your buffers so that only about 1.5GB will be allocated!
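Which profile parameters you have to touch depends on your memory management settings. As a purely hypothetical sketch (the values are examples, not recommendations), the allocation of the central instance could be limited with parameters like these:

# extended memory pool in MB - example value only
em/initial_size_MB = 1024
# program buffer in KB - example value only
abap/buffersize = 300000
# roll area per work process in bytes - example value only
ztta/roll_area = 2000000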

Node 1 and 2:

For both dialog instances I would use:

rdisp/wp_no_dia = 6
rdisp/wp_no_vb  = 2
rdisp/wp_no_vb2 = 1
rdisp/vb_dispatching = 0
rdisp/vbname = ${HOSTNAME}_${SAPSYSTEMNAME}_${SAPSYSNR}

The buffers of both instances shouldn't exceed around 1.9GB! The "vb" parameters control where update requests will be executed. We do not dispatch them in order to keep the memory access local. Needless to say, you have to replace the three variables above (HOSTNAME, SAPSYSTEMNAME, SAPSYSNR) with the values of your system.
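As an illustration with made-up values: on a host called linuxhost running system C11 with instance number 01, the resolved parameter would read:

rdisp/vbname = linuxhost_C11_01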

Node 3:

The database will run on this node. It is important that the database cache does not exceed the free space on this node. With 2036 MB free space we'll set CACHE_SIZE = 256000 (8KB pages), which is 2000 MB in total. Furthermore we set MAXCPU = 2 because each node has 2 CPUs.
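One way to change these MaxDB parameters is via dbmcli; <SID> and <password> below are placeholders for your own system, and the database has to be restarted before the new values take effect:

sidadm@linux:> dbmcli -d <SID> -u control,<password> param_directput CACHE_SIZE 256000
sidadm@linux:> dbmcli -d <SID> -u control,<password> param_directput MAXCPU 2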

Startup

The MaxDB database must be started first. To do this, execute the following command as sidadm:

sidadm@linux:> numactl -c 3 -m 3 startdb

After that, we start the central instance and the dialog instances with:

sidadm@linux:> numactl -c 0 -m 0 startsap DVEBMGSxx
sidadm@linux:> numactl -c 1 -m 1 startsap Dxx
sidadm@linux:> numactl -c 2 -m 2 startsap Dxx

Now all processes run only on the nodes specified by numactl. If a work process has to be restarted, the new one also runs on that specific node. You can check this by pressing first 'f' and then 'j' while running 'top'. An extra column with the number of the CPU used will appear. You will then see that all DVEBMGSxx processes run on CPUs 0 and 1 (the first two CPUs are by default located on node 0, and so on).
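Alternatively, you can inspect the CPU affinity of a single process with taskset from util-linux (the PID is of course just an example):

root@linux:> taskset -pc 4711
pid 4711's current affinity list: 0,1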

I would be very interested to hear if anyone has already implemented SAP on a Linux/NUMA architecture.
