[Oracle] Myths and common misconceptions about (transparent) huge pages for Oracle databases (on Linux) uncovered
Introduction
In the past years I have worked a lot with mission-critical Oracle databases in highly consolidated or centralized environments and noticed several myths and common misconceptions about memory management for Oracle databases on Linux (mainly SLES and OEL).
This blog covers the basics of the relevant memory management for Oracle databases on Linux and tries to clarify several myths. I will start with the most common ones and maybe extend this blog post with new and interesting details over time. It should become something like a central and sorted collection of the relevant information.
Definition and insights into huge pages and transparent huge pages
Official Linux Documentation
Huge Pages and Transparent Huge Pages
Memory is managed in blocks known as pages. A page is 4096 bytes. 1MB of memory is equal to 256 pages; 1GB of memory is equal to 256,000 pages, etc. CPUs have a built-in memory management unit that contains a list of these pages, with each page referenced through a page table entry.
There are two ways to enable the system to manage large amounts of memory:
- Increase the number of page table entries in the hardware memory management unit
- Increase the page size
The first method is expensive, since the hardware memory management unit in a modern processor only supports hundreds or thousands of page table entries. Additionally, hardware and memory management algorithms that work well with thousands of pages (megabytes of memory) may have difficulty performing well with millions (or even billions) of pages. This results in performance issues: when an application needs to use more memory pages than the memory management unit supports, the system falls back to slower, software-based memory management, which causes the entire system to run more slowly.
Red Hat Enterprise Linux 6 implements the second method via the use of huge pages.
Simply put, huge pages are blocks of memory that come in 2MB and 1GB sizes. The page tables used by the 2MB pages are suitable for managing multiple gigabytes of memory, whereas the page tables of 1GB pages are best for scaling to terabytes of memory.
Huge pages must be assigned at boot time. They are also difficult to manage manually, and often require significant changes to code in order to be used effectively. As such, Red Hat Enterprise Linux 6 also implemented the use of transparent huge pages (THP). THP is an abstraction layer that automates most aspects of creating, managing, and using huge pages.
THP hides much of the complexity in using huge pages from system administrators and developers. As the goal of THP is improving performance, its developers (both from the community and Red Hat) have tested and optimized THP across a wide range of systems, configurations, applications, and workloads. This allows the default settings of THP to improve the performance of most system configurations.
Note that THP can currently only map anonymous memory regions such as heap and stack space.
Huge Translation Lookaside Buffer (HugeTLB)
Physical memory addresses are translated to virtual memory addresses as part of memory management. The mapped relationship of physical to virtual addresses is stored in a data structure known as the page table. Since reading the page table for every address mapping would be time consuming and resource-expensive, there is a cache for recently-used addresses. This cache is called the Translation Lookaside Buffer (TLB).
However, the TLB can only cache so many address mappings. If a requested address mapping is not in the TLB, the page table must still be read to determine the physical to virtual address mapping. This is known as a “TLB miss”. Applications with large memory requirements are more likely to be affected by TLB misses than applications with minimal memory requirements because of the relationship between their memory requirements and the size of the pages used to cache address mappings in the TLB. Since each miss involves reading the page table, it is important to avoid these misses wherever possible.
The Huge Translation Lookaside Buffer (HugeTLB) allows memory to be managed in very large segments so that more address mappings can be cached at one time. This reduces the probability of TLB misses, which in turn improves performance in applications with large memory requirements.
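To see these numbers on a concrete system, a quick check like the following (a minimal sketch, assuming an x86_64 Linux box) prints the default page size and the current huge page configuration:
# Default (regular) page size in bytes - typically 4096 on x86_64
getconf PAGESIZE
# Current huge page pool and the configured huge page size (typically 2048 kB)
grep Huge /proc/meminfo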
*** Side Note: Transparent Huge Pages (THP) support was officially announced with Linux kernel version 2.6.38.
Oracle Documentation addition
HugePages is a feature integrated into the Linux kernel with release 2.6. This feature basically provides an alternative to the 4K page size (16K for IA64) by offering bigger pages.
Regarding HugePages, there are some other similar terms being used, like hugetlb and hugetlbfs. Before proceeding into the details of HugePages, see the definitions below:
- Page Table: A page table is the data structure of a virtual memory system in an operating system to store the mapping between virtual addresses and physical addresses. This means that on a virtual memory system, the memory is accessed by first accessing a page table and then accessing the actual memory location implicitly.
- TLB: A Translation Lookaside Buffer (TLB) is a buffer (or cache) in a CPU that contains parts of the page table. This is a fixed size buffer being used to do virtual address translation faster.
- hugetlb: This is an entry in the TLB that points to a HugePage (a large/big page larger than regular 4K and predefined in size). HugePages are implemented via hugetlb entries, i.e. we can say that a HugePage is handled by a “hugetlb page entry”. The “hugetlb” term is also (and mostly) used synonymously with a HugePage. In this document the term “HugePage” is going to be used but keep in mind that mostly “hugetlb” refers to the same concept.
- hugetlbfs: This is an in-memory filesystem, similar to tmpfs, that was introduced with the 2.6 kernel. Pages allocated on a hugetlbfs-type filesystem are allocated in HugePages (see the small sketch below).
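As a small illustration of the hugetlbfs part (a sketch, assuming root privileges and a kernel built with hugetlbfs support), you can check whether the filesystem type is known to the kernel and mount an instance manually. Note that the Oracle SGA does not need such a mount, because it allocates huge pages through SysV shared memory (SHM_HUGETLB):
# Check that the running kernel knows the hugetlbfs filesystem type
grep hugetlbfs /proc/filesystems
# Mount a hugetlbfs instance manually (root required) - files created in it
# are backed by huge pages
mkdir -p /mnt/hugepages
mount -t hugetlbfs none /mnt/hugepages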
Graphical illustration of regular (normal) and huge pages
When a single process works with a piece of memory, the pages that the process uses are referenced in a local page table for this specific process. The entries in this table also contain references to the system-wide page table, which actually holds the references to the physical memory addresses. So a user mode process (i.e. an Oracle process) follows its local page table to the system page table and from there reaches the actual physical memory. As you can see below, it is also possible (and very common for the Oracle RDBMS due to the shared SGA) that two different OS processes point to the same entry in the system-wide page table.
When HugePages are in play, the usual page tables are still employed. The basic difference is that the entries in both the process page table and the system page table carry attributes about huge pages, so any page in a page table can be a huge page or a regular page. The following diagram illustrates 4096K HugePages, but it would look the same for any huge page size.
I guess this should be enough general information about huge pages and transparent huge pages to understand the concepts and basics. Please check the reference section if you are interested in more detailed information (like performance comparisons).
Why do we care about such memory handling at all and what are the advantages?
Well, as I have previously mentioned, I have worked with Oracle databases in highly consolidated or centralized environments in the past years, and in such environments there is a lot of thinking about “how to utilize the infrastructure and hardware in the best way”. Just imagine a distributed SAP system landscape with a centralized Oracle database infrastructure (like on VMware or whatever). How can you put as many databases as possible on such an infrastructure without them harming each other’s performance? “Classical” database or SQL tuning is important for reducing the I/O, CPU and memory load of course, but you can also tune the operating system to get much better utilization and throughput.
… and so we get to memory management as well. RAM is still the most expensive and limiting hardware resource, so we don’t want to waste it without a valid reason. And with that we have reached the use case of regular, huge and transparent huge pages for Oracle databases.
Jonathan Lewis has already written a blog post about a memory usage issue after a database migration from a 32-bit to a 64-bit operating system and mentioned “huge pages” as the solution for it.
“A client recently upgraded from 32-bit Oracle to 64-bit Oracle because this would allow a larger SGA. At the same time they increased their SGA from about 2GB to 3GB hoping to take more advantage of their 8GB of RAM. The performance of their system did not get better – in fact it got worse.
…
It is important background information to know that they were running a version of Red Hat Linux and that there were typically 330 processes connected to the database using an average of about 4MB of PGA each.
Using small memory pages (4KB) on a 32-bit operating system the memory map for a 2GB SGA would be: 4 bytes for each of 524,288 pages, totalling 2MB per process, for a grand total of 660MB memory space used for mapping when the system has warmed up. So when the system was running at steady state, the total memory directly related to Oracle usage was: 2GB + 660MB + 1.2GB (PGA) = 3.8GB, leaving about 4.2GB for O/S and file system cache.
…
Upgrade to a 64-bit operating system and a 3GB SGA and you need 8 bytes for each page in the memory map and have 786,432 pages, for a total of 6MB per process, for a total of 1,980 MB of maps – an extra 1.3GB of memory lost to maps. Total memory directly related to Oracle usage: 3GB + 1.9GB + 1.2GB (PGA) = 6.1GB, leaving about 1.9GB for O/S and file system cache.
“
This example is about a pretty tiny SGA – now think about databases with a much larger cache size, or a lot of databases with small cache sizes (in a highly consolidated environment), and scale it up – I think you get the point here.
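To make the arithmetic from the quote explicit, here is a small shell sketch (assuming the numbers from above: 4 KB pages, 330 dedicated server processes, 4-byte page table entries on 32-bit and 8-byte entries on 64-bit):
# 32-bit, 2 GB SGA: (2 GB / 4 KB) pages * 4 bytes per entry * 330 processes
echo $(( (2 * 1024 * 1024 * 1024 / 4096) * 4 * 330 / 1024 / 1024 ))   # ~660 MB of page maps
# 64-bit, 3 GB SGA: (3 GB / 4 KB) pages * 8 bytes per entry * 330 processes
echo $(( (3 * 1024 * 1024 * 1024 / 4096) * 8 * 330 / 1024 / 1024 ))   # ~1,980 MB of page maps
With 2 MB huge pages each process has to map 512 times fewer pages, and (as discussed in the comments below) the huge page table is even shared between the processes on Linux.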
Advantages of huge pages
- Larger Page Size and Less Number of Pages: The default page size is 4K whereas the HugeTLB size is 2048K, which means the system has to handle 512 times fewer pages (a quick way to check this on your own system is sketched after this list).
- Reduced Page Table Walking: Since a HugePage covers a greater contiguous virtual address range than a regular sized page, the probability of getting a TLB hit per TLB entry is higher with HugePages than with regular pages. This reduces the number of times page tables are walked to obtain a physical address from a virtual address.
- Less Overhead for Memory Operations: On virtual memory systems (any modern OS) each memory operation is actually two abstract memory operations. With HugePages, since there are fewer pages to work on, the possible bottleneck on page table access is clearly avoided.
- Less Memory Usage: From the Oracle Database perspective, with HugePages the Linux kernel uses less memory to create page tables for maintaining the virtual to physical mappings of the SGA address range, in comparison to regular sized pages. This leaves more memory available for process-private computations or PGA usage.
- No Swapping: We must avoid swapping on the Linux OS at all costs. HugePages are not swappable (whereas regular pages are), therefore there is no page replacement mechanism overhead. HugePages are always pinned in physical memory.
- No ‘kswapd’ Operations: kswapd gets very busy if there is a very large area to be paged (e.g. 13 million page table entries for 50GB of memory) and will use an incredible amount of CPU resources. When HugePages are used, kswapd is not involved in managing them.
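As a quick check related to the first and fourth points (a sketch, assuming an x86_64 system), you can look at how much kernel memory is currently spent on page tables and whether your CPU would also support 1 GB huge pages:
# Kernel memory currently used for page table structures - with a large SGA on
# regular 4 kB pages and many dedicated server processes this value can grow
# into the gigabytes, with huge pages it stays small
grep PageTables /proc/meminfo
# The pse flag indicates 2 MB page support, pdpe1gb indicates 1 GB page support
grep -o -w -e pse -e pdpe1gb /proc/cpuinfo | sort -u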
Myth 1 – We are running a Linux kernel version that supports transparent huge pages, so the Oracle database already uses huge pages for the SGA
This is a common myth in newer Oracle / Linux system landscapes, but unfortunately it is not true at all. Starting with Red Hat 6, OEL 6, SLES 11 SP2 and UEK2 kernels, transparent huge pages are implemented and enabled by default in an attempt to improve memory management, but not every kind of memory is currently supported.
The following information was gathered on Oracle Enterprise Linux 6.2 (2.6.39-100.7.1.el6uek.x86_64) running an Oracle database 11.2.0.3.2. The instance uses manual memory management (no AMM or ASMM) to keep it simple.
[root@OEL11 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
Transparent huge pages are enabled, as “always” is shown in brackets – so let’s verify it with a running Oracle database.
*** Before database startup
[root@OEL11 ~]# cat /proc/meminfo | grep AnonHugePages
AnonHugePages: 0 kB
*** Database is started up with db_cache_size=300M, shared_pool_size=200M,
*** pga_aggregate_target=100M and pre_page_sga=TRUE
SQL> startup
ORACLE instance started.
Total System Global Area 559575040 bytes
*** After database startup
[root@OEL11 ~]# cat /proc/meminfo | grep AnonHugePages
AnonHugePages: 4096 kB
That seems pretty strange, right? The SGA is roughly 500 MB and fully allocated, but only 4 MB of transparent huge pages are currently used. What’s wrong here? Let’s check each database process for its memory usage.
[root@OEL11 ~]# for PRID in $(ps -o pid= -u orat11)
do
THP=$(cat /proc/$PRID/smaps | grep AnonHugePages | awk '{sum+=$2} END {print sum}')
echo "PID: $PRID - AnonHugePages: $THP"
done
PID: 11903 - AnonHugePages: 0
...
PID: 12653 - AnonHugePages: 0
PID: 12655 - AnonHugePages: 4096
PID: 12657 - AnonHugePages: 0
...
PID: 12926 - AnonHugePages: 0
[root@OEL11 ~]# ps -ef | grep 12655
orat11 12655 1 0 15:54 ? 00:00:00 ora_dbw0_T11
So only the DBWR is using the 4 MB of transparent huge pages, which is nothing compared to the SGA size, right?
In reality there is nothing wrong – it works as designed, if we check the kernel documentation for transparent huge pages:
[root@OEL11 ~]# cat /usr/share/doc/kernel-doc-2.6.39/Documentation/vm/transhuge.txt
...
Transparent Hugepage Support is an alternative means of using huge pages for the backing of virtual
memory with huge pages that supports the automatic promotion and demotion of page sizes and
without the shortcomings of hugetlbfs.
Currently it only works for anonymous memory mappings but in the future it can expand over the
pagecache layer starting with tmpfs.
...
.. and here we go .. transparent huge pages are currently supported for anonymous memory (like the PGA heap) only and nothing else. So the SGA (shared memory) still uses the regular page size, and transparent huge pages do not help to reduce the mapping overhead here.
IMPORTANT HINT: Due to known problems, Oracle does not recommend transparent huge pages at all (not even for the PGA heap) – please check the reference section (MOS ID 1557478.1 or SAPnote #1871318) for details about deactivating this feature.
“Because Transparent HugePages are known to cause unexpected node reboots and performance problems with RAC, Oracle strongly advises to disable the use of Transparent HugePages. In addition, Transparent Hugepages may cause problems even in a single-instance database environment with unexpected performance problems or delays. As such, Oracle recommends disabling Transparent HugePages on all Database servers running Oracle.”
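For reference, a minimal sketch of how THP is typically disabled on such systems (please follow the MOS / SAP notes above for the officially supported procedure on your distribution; the runtime change does not survive a reboot):
# Disable THP at runtime (root required, lost after the next reboot)
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
# Persistent variant: add "transparent_hugepage=never" to the kernel boot line
# in the boot loader configuration (e.g. /boot/grub/grub.conf on OEL/RHEL 6)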
Myth 2 – Huge pages are difficult to manage in highly consolidated and critical environments
This myth used to be true (and still is in various cases nowadays), but Oracle has improved the procedure for allocating huge pages with patchset 11.2.0.3.
Let’s clarify the root problem first. Imagine a highly consolidated Oracle database landscape on several physical or virtual hosts. In pre-11.2.0.3 times you had to calculate and define the amount of huge pages for all databases on a particular host upfront. This works pretty well if you have a stable number of instances/databases (with fixed memory sizes), but what if you need to add several new instances/databases to your production server? You can not always adjust the corresponding kernel parameters (and maybe reboot the server) just because of a newly deployed instance/database. Without that adjustment, the new instance would allocate its whole SGA in regular pages (4 kB) if the SGA does not fit into the remaining free huge page area. This can cause nasty paging trouble, as the memory calculation is based on using large pages. Or think about automated database provisioning – of course you could size the huge page area so big that you never run into this problem, but then we would have missed the original goal of using the hardware resources as effectively as possible.
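For the classical (pre-11.2.0.3) approach the huge page pool has to be sized manually. The following is a simplified sketch of the idea behind Oracle’s hugepages_settings.sh script (available via My Oracle Support): with all instances up and running, sum the SysV shared memory segments and divide by the huge page size. Treat it as an illustration only and use the official script for real sizing.
# Run with all database instances started - sums all SysV shared memory
# segments and converts the result into huge pages (simplified!)
HPG_SZ=$(awk '/Hugepagesize/ {print $2}' /proc/meminfo)   # huge page size in kB
NUM_PG=$(ipcs -m | awk -v hpg="$HPG_SZ" '/^0x/ {sum += $5} END {printf "%d\n", (sum / (hpg * 1024)) + 1}')
echo "recommended vm.nr_hugepages = $NUM_PG"
# Persist the value in /etc/sysctl.conf and make sure the memlock limit in
# /etc/security/limits.conf is large enough for the oracle user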
Let’s check out the improvements of Oracle 11.2.0.3 for huge page handling. The following information was gathered on Oracle Enterprise Linux 6.2 (2.6.39-100.7.1.el6uek.x86_64) running an Oracle database 11.2.0.3.2. The instance uses manual memory management (no AMM or ASMM) to keep it simple, and transparent huge pages are disabled.
Initial settings for every parameter setting test
[root@OEL11 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
[root@OEL11 ~]# cat /proc/meminfo | grep Huge
HugePages_Total: 150
HugePages_Free: 150
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Transparent huge pages are disabled, as “never” is shown in brackets, and roughly 300 MB of memory is assigned to the huge page pool and still free.
The SGA of my Oracle instance is still roughly 500 MB – so it would normally not be able to allocate all of its memory from huge pages. Let’s verify the different behaviors of the parameter “use_large_pages”.
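For completeness, the 150-page pool used in these tests can be configured like this (a sketch; on a long-running, fragmented system the kernel may not be able to allocate the full amount at runtime, so a reboot after setting the value is the safe way):
# Reserve 150 x 2 MB huge pages (roughly 300 MB) at runtime ...
sysctl -w vm.nr_hugepages=150
# ... and make the setting persistent across reboots
echo "vm.nr_hugepages=150" >> /etc/sysctl.conf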
Parameter use_large_pages=TRUE (= Default)
Let’s check the default behavior of Oracle 11.2.0.3 first.
*** Database is started up with db_cache_size=300M, shared_pool_size=200M,
*** pga_aggregate_target=100M, pre_page_sga=TRUE and use_large_pages=TRUE
SQL> startup
ORACLE instance started.
Total System Global Area 559575040 bytes
*** Alert Log
****************** Large Pages Information *****************
Total Shared Global Region in Large Pages = 300 MB (55%)
Large Pages used by this instance: 150 (300 MB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB)
Large Pages configured system wide = 150 (300 MB)
Large Page size = 2048 KB
*** After database startup
[root@OEL11 ~]# cat /proc/meminfo | grep Huge
AnonHugePages: 0 kB
HugePages_Total: 150
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
[root@OEL11 ~]# ipcs -a
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x00000000 65537 orat11 640 12582912 25
0x00000000 98306 orat11 640 276824064 25
0x00000000 131075 orat11 640 20971520 25
0x00000000 163844 orat11 640 4194304 25
0x00000000 196613 orat11 640 247463936 25
0x4eb56684 229382 orat11 640 2097152 25
As you can see, Oracle used all the available huge pages first and, after they ran out, allocated the rest in regular pages. Several shared memory segments are created and used as a side effect of this enhancement.
Parameter use_large_pages=ONLY
Let’s check the parameter value “use_large_pages=ONLY” and its behavior if there are not sufficient large pages available at database startup.
*** Database is started up with db_cache_size=300M, shared_pool_size=200M,
*** pga_aggregate_target=100M, pre_page_sga=TRUE and use_large_pages=ONLY
SQL> startup
ORA-27137: unable to allocate large pages to create a shared memory segment
Linux-x86_64 Error: 12: Cannot allocate memory
*** Alert Log
****************** Large Pages Information *****************
Parameter use_large_pages = ONLY
Large Pages unused system wide = 150 (300 MB) (alloc incr 4096 KB)
Large Pages configured system wide = 150 (300 MB)
Large Page size = 2048 KB
ERROR:
Failed to allocate shared global region with large pages, unix errno = 12.
Aborting Instance startup.
ORA-27137: unable to allocate Large Pages to create a shared memory segment
As you can see, we can also force the instance to use large pages for the whole SGA, and the startup fails with an ORA-27137 error if not enough large pages are available. This setting is usually used to avoid an out-of-memory situation caused by a mix of regular and large pages (like with the default behavior).
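If you want to enforce this behavior, keep in mind that use_large_pages is a static parameter, so it has to be set in the spfile and becomes active with the next instance startup (a small sketch using SQL*Plus from the shell):
sqlplus / as sysdba <<EOF
-- static parameter, therefore SCOPE=SPFILE and a restart are required
ALTER SYSTEM SET use_large_pages='ONLY' SCOPE=SPFILE;
SHUTDOWN IMMEDIATE
STARTUP
EOF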
Parameter use_large_pages=AUTO
This option was newly introduced with Oracle 11.2.0.3 – let’s verify its impact if there are not sufficient large pages available at database startup.
*** Database is started up with db_cache_size=300M, shared_pool_size=200M,
*** pga_aggregate_target=100M, pre_page_sga=TRUE and use_large_pages=AUTO
SQL> startup
ORACLE instance started.
Total System Global Area 559575040 bytes
*** Alert Log
DISM started, OS id=1610
****************** Large Pages Information *****************
Parameter use_large_pages = AUTO
Total Shared Global Region in Large Pages = 538 MB (100%)
Large Pages used by this instance: 269 (538 MB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB)
Large Pages configured system wide = 269 (538 MB)
Large Page size = 2048 KB
Time taken to allocate Large Pages = 0.025895 sec
***********************************************************
*** After database startup
[root@OEL11 trace]# cat /proc/meminfo | grep Huge
HugePages_Total: 269
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
[root@OEL11 trace]# ipcs -a
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x6c6c6536 0 root 600 4096 0
0x00000000 360449 orat11 640 12582912 24
0x00000000 393218 orat11 640 549453824 24
0x4eb56684 425987 orat11 640 2097152 24
As you can see, Oracle automatically reconfigured the Linux kernel and (temporarily) increased the amount of huge pages so that the complete SGA fits in. This is possible if you have enough free(able) memory. You will also notice an unusual startup message like “DISM started, OS id=1610” if you look closely at the alert log snippet. DISM is responsible for tasks like increasing the amount of huge pages or increasing process priorities. Root privileges are needed for such tasks – so check the correct permissions (s-bit and owner) of the oradism binary.
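A quick way to verify this (a sketch; to my knowledge the executable implementing DISM on Linux is $ORACLE_HOME/bin/oradism, and the AUTO mode can only grow the huge page pool if that binary is owned by root with the setuid bit set):
# Check owner and setuid bit of the oradism binary
ls -l $ORACLE_HOME/bin/oradism
# Expected (assumption): something like -rwsr-x--- 1 root oinstall ... oradism
# If the s-bit or root ownership is missing, fix it as root:
#   chown root $ORACLE_HOME/bin/oradism && chmod 4750 $ORACLE_HOME/bin/oradism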
Myth 3 – Huge pages can not be used for Oracle instances with ASMM (Automatic Shared Memory Management)
This is the most common misconception that I am confronted with. Personally I am not a fan of automatic shared memory management, but some of my clients use it of course. I guess the root cause of this misconception is the similar naming of the two memory features ASMM and AMM. So let’s check the official documentation about both features and the huge pages restrictions first.
Automatic Shared Memory Management (ASMM)
Automatic Shared Memory Management simplifies SGA memory management. You specify the total amount of SGA memory available to an instance using the SGA_TARGET initialization parameter and Oracle Database automatically distributes this memory among the various SGA components to ensure the most effective memory utilization.
When automatic shared memory management is enabled, the sizes of the different SGA components are flexible and can adapt to the needs of a workload without requiring any additional configuration. The database automatically distributes the available memory among the various components as required, allowing the system to maximize the use of all available SGA memory.
Automatic Memory Management (AMM)
The simplest way to manage instance memory is to allow the Oracle Database instance to automatically manage and tune it for you. To do so (on most platforms), you set only a target memory size initialization parameter (MEMORY_TARGET) and optionally a maximum memory size initialization parameter (MEMORY_MAX_TARGET). The total memory that the instance uses remains relatively constant, based on the value of MEMORY_TARGET, and the instance automatically distributes memory between the system global area (SGA) and the instance program global area (instance PGA). As memory requirements change, the instance dynamically redistributes memory between the SGA and instance PGA.
When automatic memory management is not enabled, you must size both the SGA and instance PGA manually.
Restrictions for HugePages Configurations
- The Automatic Memory Management (AMM) and HugePages are not compatible. With AMM the entire SGA memory is allocated by creating files under /dev/shm. When Oracle Database allocates SGA that way HugePages are not reserved. You must disable AMM on Oracle Database to use HugePages.
- If you are using VLM in a 32-bit environment, then you cannot use HugePages for the Database Buffer cache. HugePages can be used for other parts of SGA like shared_pool, large_pool, and so on. Memory allocation for VLM (buffer cache) is done using shared memory file systems (ramfs/tmpfs/shmfs). HugePages does not get reserved or used by the memory file systems.
- HugePages are not subject to allocation or release after system startup, unless a system administrator changes the HugePages configuration by modifying the number of pages available, or the pool size. If the space required is not reserved in memory during system startup, then HugePages allocation fails.
So basically, both features provide automatic memory management, but ASMM controls the SGA only, whereas AMM controls SGA and PGA together. If you look closely at the restrictions, you will see that only AMM is incompatible with huge pages, while ASMM is not. AMM is not based on “classical” shared memory segments; it is implemented using the /dev/shm filesystem.
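A simple way to see this difference on a running system (a sketch): with AMM the SGA shows up as files in the /dev/shm tmpfs, whereas with manual memory management or ASMM it shows up as SysV shared memory segments that can be backed by huge pages.
# AMM: SGA granules appear as files in the /dev/shm tmpfs (no huge page backing)
df -h /dev/shm
ls -l /dev/shm | head
# Manual memory management / ASMM: SGA appears as SysV shared memory segments
ipcs -m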
Initial settings for ASMM huge pages test
[root@OEL11 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
[root@OEL11 ~]# cat /proc/meminfo | grep Huge
HugePages_Total: 150
HugePages_Free: 150
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
Transparent huge pages are disabled, as “never” is shown in brackets, and roughly 300 MB of memory is assigned to the huge page pool and still free.
Using ASMM and checking the huge pages behavior
*** Database is started up with sga_target=500M, pga_aggregate_target=100M,
*** pre_page_sga=TRUE and use_large_pages=AUTO
SQL> startup
ORACLE instance started.
Total System Global Area 521936896 bytes
*** Alert Log
DISM started, OS id=1400
****************** Large Pages Information *****************
Parameter use_large_pages = AUTO
Total Shared Global Region in Large Pages = 502 MB (100%)
Large Pages used by this instance: 251 (502 MB)
Large Pages unused system wide = 0 (0 KB) (alloc incr 4096 KB)
Large Pages configured system wide = 251 (502 MB)
Large Page size = 2048 KB
Time taken to allocate Large Pages = 0.022167 sec
***********************************************************
*** After database startup
[root@OEL11 trace]# cat /proc/meminfo | grep Huge
HugePages_Total: 251
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
[root@OEL11 trace]# ipcs -a
------ Shared Memory Segments --------
key shmid owner perms bytes nattch status
0x6c6c6536 0 root 600 4096 0
0x00000000 65537 orat11 640 12582912 23
0x00000000 98306 orat11 640 511705088 23
0x4eb56684 131075 orat11 640 2097152 23
As you can see, huge pages and ASMM are fully compatible and even work together with the new automatic huge page extension feature.
Summary
Wow – this blog has already become quite long, but unfortunately this topic is very broad, so we needed to cover a lot of the basics first. I will keep extending this blog as soon as I notice new topics or if you ask for something specific.
If you have any further questions, please feel free to ask or get in contact directly if you need assistance with implementing complex Oracle database landscapes or with troubleshooting Oracle (performance) issues.
References
- MOS ID 361323.1 – HugePages on Linux: What It Is… and What It Is Not…
- MOS ID 1134002.1 – ASMM and LINUX x86-64 Hugepages Support
- MOS ID 1392497.1 – USE_LARGE_PAGES To Enable HugePages In 11.2
- MOS ID 1557478.1 – ALERT: Disable Transparent HugePages on SLES11, RHEL6, OEL6 and UEK2 Kernels
- Oracle (Chris W Beal) – So what the heck is anonymous memory
- Redhat Linux 6 Documentation – Huge Pages and Transparent Huge Pages
- Red Hat (Andrea Arcangeli) – Transparent Hugepage Support (with performance comparison)
- Red Hat (Raghu Udiyar) – Transparent Hugepages in RHEL 6
- SAPnote #1672954 – Oracle 11g: Usage of hugepages on Linux
- SAPnote #1871318 – Linux: Disable Transparent HugePages for Oracle Database
Good One
Reagan
Hi Stefan,
damn, you release your blog fast as hell 😉
Great job, now all information can be found in one place.
What about other operating systems? You just covered Linux. What about AIX, Solaris, Windows etc.? I think some of your blog readers are also interested in these operating systems. It would be great if you could say a few words about them.
As you know I have tested this feature also on AIX and it is working fine when the DB is running standalone. The SAP system can then also run with 64KB pages (1387644 - Using 64 KB Virtual Memory Pages Sizes on AIX), but you must also keep an eye on the AME feature if you use it (1464605 - Active Memory Expansion (AME)):
As AME is not recommended with large page support, AME will disable AIX 64KB pages by default.
IBM says that despite an enabled AME feature it is possible to use 64KB pages, but with a performance loss. There is no statement about how much performance is lost though.
But all in all, huge pages are really cool to boost and tune your system. The bigger the system (in relation to memory), the better it scales with huge pages.
I think it would be nice if you could give some numbers. How many MB of memory could be saved? Give us some examples, so people can judge whether it is worth using.
Don't stop blogging dude 😉
Best Regards,
Jens
Hi Jens,
yes, I am working hard on my blog backlog and it seems to be going pretty well 😉
> What about other operating systems? You just covered Linux
Yes, I covered Linux for now as this platform is the most "up-coming" one and it is available to me all the time. However, as you know, a (small) specialized consultancy like mine can not spend that amount of money on a p790 just for research .. maybe IBM sponsors it in the future 😛
> What about AIX, Solaris and Windows etc. ?
Ok - I'll try my best here, but as already mentioned you have to trust me, as I have none of these platforms available here on my MacBook (*reminder* Finish Oracle database installation on my Solaris 11.1 VM *reminder*) to demonstrate it.
AIX
Yes, huge pages are supported on AIX as well, but they are called large pages on that platform (IBM Documentation). There are some limits (like the AME issue mentioned by you), but in general it is the same. The implementation of AMM and large pages works differently as well (Blog Post). I will try to cover this topic when I have an AIX box with an Oracle database available.
Solaris
Large pages (DISM and ISM / 4 MB) are used by default if you run an Oracle database on newer Solaris versions. There are some little quirks about it, but they are pretty well explained in this white paper. The Oracle marketing machine is pretty much right about the "well engineered Oracle systems" here. There seems to be a valid reason why Tony Stark uses it too 😉
Windows
Honestly, no idea. But you said the bad "W" word 😛
> How many MB of memory could be saved? Tell us some examples, so the people can imagine if it worth to use it.
Ok, I can do a short calculation (based on our "Brain it up and memorize it down" project). I can not remember the exact basic parameters, but I guess we are in the right ballpark here.
Basic parameters
Calculation
UPDATE 25/01/2014: The memory mapping calculation is wrong / too high for huge pages on Linux, as huge page tables are shared. For more information check my comment below about Mark Bobak's blog post.
Regards
Stefan
Hi Stefan,
Great explanations - big thanks for this!
I was just kidding you with the bad "W" word 😛
But I think a lot of people in smaller environments use the "W" OS in connection with VMware, so it would also be interesting for them. So I see you still resist using/administrating Microsoft products 😉 But you know that eventually the time comes when you can't resist such a project 😆
Allow me one last question:
Can you give us a rule of thumb for which scenarios it is worth using large/huge pages? Would you say it is useful at any time, or only in scenarios like Data Guard? Does the usage maybe depend on the amount of RAM or the number of processes?
You know, in our world (manager triggered) the only things that count are what it costs and what our benefit is.
Thanks in advance!
Cheers,
Jens
Hi Jens,
> But I think a lot of people in smaller environments use the "W" OS in connection with VMware, so it would also be interesting for them. So I see you still resist using/administrating Microsoft products
I do not "resist", but such environments are not my focus in general. My clients are usually multi-billion-dollar enterprises with a corresponding (high performance / transactional / availability) Oracle core infrastructure. None of them are using Windows for that (just think about Windows in-place upgrades in HA environments, downtime due to patch days and so on). For small / non-critical environments and dedicated Oracle / SAP solutions it is an option of course, but then you usually do not have to go to that optimization level 😉
> Can you give us a rule of thumb for which scenarios it is worth using large/huge pages?
You can/should use it at any time (if AMM is not mandatory on Linux), but the effect increases in a large (consolidated) infrastructure of course. Just use the formula from above.
The "saved" memory can be used for other things like the database buffer cache, SAP application buffers or maybe for running more instances on the same infrastructure (hardware utilization). You don't need to think about huge pages if you have too much memory left, but this is usually not the case, or you may have sized the infrastructure wrong 😉
> You know, in our world (manager triggered) the only things that count are what it costs and what our benefit is.
I know. The stakeholders are usually pretty happy if they can run their infrastructures more efficiently "for free", just by changing a few parameters. So 100 % benefit at 0 cost (except me 😛 )
Regards
Stefan
Hi guys,
just a short update to my previous calculation.
Mark Bobak has recently published a blog post called "If you’re not using hugepages, you’re doing it wrong!". It states that huge page tables are shared and, in consequence, the processes do not hold their own copies of (huge) page table entries. My previously calculated values were based on an AIX reorganization project, but with Linux and huge pages you need even less memory, as all (Oracle) processes share the huge page table.
Thanks to Mark for pointing that out - I was not aware of that on Linux.
Regards
Stefan
Hi Jens,
for completeness, here is the proof of my previous large page statement for Oracle on Solaris. The following snippet is from an Oracle 12.1.0.1 database on Solaris 11.1 (x86_64). The SGA is roughly 600 MB (shared memory segment ID 7) and uses 2 MB (the Intel / x86 default) large pages by default.
Regards
Stefan
nice blog, thanks for the insight!
Excellent blog Stefan.
Really useful.