SAP HANA – Scale-up or Scale-out Hardware?
I have a lot of customer conversations about SAP HANA hardware. It’s no wonder, given that there are nearly 500 certified appliances on the Certified SAP HANA Hardware Directory at the time of writing.
In addition, it’s possible to certify almost any sane configuration with Enterprise Storage like Violin or EMC VMAX, and it’s possible to use almost any Intel server for non-production use cases, with almost any configuration.
This provides fantastic flexibility: as a customer you can choose the vendor, storage and networking of your choice, and for non-production scenarios it is possible to build systems which are much more cost-effective. The most important thing, though, is to get your production hardware correctly provisioned, because getting it wrong can be an expensive mistake.
There are two ways to scale SAP HANA into large systems – up, or out.
The first thing to remember is that production HANA systems have a fixed RAM-to-CPU ratio: 256GB/socket for analytic use cases, and 768GB/socket for SAP Business Suite. Mainstream Intel systems are available with 4-8 sockets, which means that with today’s hardware a single system tops out at 2TB for analytics (8 x 256GB) and 6TB for Business Suite (8 x 768GB).
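To make the arithmetic concrete, here is a minimal Python sketch of that sizing rule. The per-socket ratios are the ones quoted above; the names `GB_PER_SOCKET` and `max_memory_tb` are my own for illustration, and I am assuming 1TB = 1024GB.

```python
# GB of RAM allowed per CPU socket on production systems (figures from the text).
GB_PER_SOCKET = {"analytics": 256, "business_suite": 768}


def max_memory_tb(sockets: int, use_case: str) -> float:
    """Largest production memory size in TB for a given socket count."""
    if sockets < 1:
        raise ValueError("sockets must be at least 1")
    return sockets * GB_PER_SOCKET[use_case] / 1024  # assumes 1TB = 1024GB


# Mainstream 4- and 8-socket servers.
for sockets in (4, 8):
    analytics = max_memory_tb(sockets, "analytics")
    suite = max_memory_tb(sockets, "business_suite")
    print(f"{sockets} sockets: {analytics:g}TB analytics, {suite:g}TB Business Suite")
```

The same function reproduces the figures for larger socket counts, as long as the ratio stays fixed.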
How does Scale-Up work?
With scale-up, we look to build a single system with as many resources as possible. As mentioned above, the maximum is 8 sockets and 2TB for analytics use cases. These are available from Cisco, Hitachi, HP, IBM, Lenovo, Fujitsu, Huawei and SGI at the time of writing, but the link will update with the latest available systems. Those same vendors make 6TB systems for Business Suite.
There are two vendors who make systems larger than this, and certification is pending; my team has worked on pilot implementations.
HP have what they affectionately call the DragonHawk (what is it with IT vendors and their naming conventions?). The marketers call this the HP ConvergedSystem 900, and it is available with up to 16 sockets and 4TB for analytics, or 12TB for Business Suite. The HP CS900 uses their Superdome 2 architecture, which is 2-socket blades with a NUMA backplane, up to 8 blades.
SGI have their SGI UV300H appliance, available in building blocks of 4 sockets, with up to 8 building blocks giving 32 sockets and 8TB for analytics, or 24TB for Business Suite. They use a proprietary connector called NUMAlink, which allows all CPUs to be a single hop from each other.
Bear in mind that bigger scale-up systems will come, as newer generations of Intel CPUs come around. The refresh cycle is roughly every 3-4 years, with the last refresh happening in 2013.
How does Scale-Out work?
Scale-out systems connect a cluster of smaller SAP HANA systems together into one clustered database. HANA uses a shared-nothing architecture for processing, but it still needs shared storage for data persistence, so that a standby node can take over from a failed node. This is delivered either with a clustered filesystem (Lenovo with IBM GPFS) or a SAN (all the other vendors).
Interestingly, there is a lot of variance in HANA scale-out appliances. Cisco, Hitachi, HP, Huawei, Lenovo, Dell, Fujitsu and IBM have 1TB scale-out appliances. That list drops to Hitachi, Lenovo, Fujitsu and HP for 2TB appliances. IBM have an impressive 56-node cluster (up to 112TB, yes) certified, and all the others are limited to 16 nodes for certified appliances.
HP have a scale up-and-out solution with the ConvergedSystem 900, up to 16-CPU/4TB building blocks (not certified yet). However, the ConvergedSystem 900 has higher average latency than a 4- or 8-socket single-node system, so it is best suited to Business Suite use cases.
Do note that the 16-node limit isn’t a major limitation – any of these vendors will certify an appliance as big as your pocketbook. In most cases, we find customers buy 5-10TB of HANA database in production.
Note that in a scale-out environment, data has to be distributed amongst the nodes. SAP BW does a great job of this, striping big fact tables across multiple nodes, and placing dimension tables together on a single node. It uses one “master” node for configuration tables. All in all, this does an excellent job of dealing with the major disadvantage of scale-out: the cost of inter-node network traffic for temporary datasets.
For custom data-marts, you will have to partition your own data, which isn’t a big deal, but does require a HANA expert. A good HANA consultant can define a suitable partitioning strategy in a very short period of time.
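To give a feel for what distributing a table means, here is a toy Python sketch of hash partitioning: each row of a big fact table is assigned to a worker node based on a hash of its key, so rows spread roughly evenly and the same key always lands on the same node. This is only an illustration of the idea, not how HANA implements it; the function names, the `order_id` field and the choice of CRC32 as the hash are all my own assumptions.

```python
import zlib
from collections import defaultdict


def node_for_key(key: str, worker_nodes: int) -> int:
    """Pick a worker node for a key using a stable hash (CRC32, chosen for illustration)."""
    return zlib.crc32(key.encode("utf-8")) % worker_nodes


def distribute(rows, key_field, worker_nodes):
    """Group rows by the node their partitioning key hashes to."""
    placement = defaultdict(list)
    for row in rows:
        node = node_for_key(str(row[key_field]), worker_nodes)
        placement[node].append(row)
    return dict(placement)


# Hypothetical fact table with 1,000 rows, spread over 3 worker nodes.
fact_rows = [{"order_id": i, "amount": i * 10} for i in range(1000)]
placement = distribute(fact_rows, "order_id", worker_nodes=3)
for node in sorted(placement):
    print(f"node {node}: {len(placement[node])} rows")
```

The design point is the choice of key: if queries usually join or filter on the partitioning key, most of the work stays local to one node and less temporary data has to cross the network.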
Remember that for scale-out, you will need one “hot-spare” node, and for BW you also need a master node, which is used for configuration tables and calculations. In effect, if you buy five 1TB nodes (the minimum I recommend for scale-out), one goes to the hot spare and one to the master, so you only get roughly 3TB of usable database.
The SAP Business Suite is more interesting, because data has to be grouped into sets of tables. This is discussed in SAP Note 1825774, but the short version is that scale-out isn’t supported for the Business Suite.
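The usable-capacity rule of thumb can be written down as a short Python sketch. It assumes, as described above, one hot-spare node and (for BW) one master node that hold no user data; the function name and parameters are mine, and real sizing should come from a proper sizing exercise.

```python
def usable_tb(nodes: int, node_tb: float, hot_spares: int = 1, master_nodes: int = 1) -> float:
    """Rough usable database size for a scale-out cluster.

    Assumes hot-spare and master nodes contribute no usable capacity.
    """
    worker_nodes = nodes - hot_spares - master_nodes
    if worker_nodes < 1:
        raise ValueError("cluster has no worker nodes left for data")
    return worker_nodes * node_tb


# Five 1TB nodes for BW: one hot spare, one master, three workers.
print(usable_tb(nodes=5, node_tb=1))
```

For a scenario without a dedicated master node, pass `master_nodes=0`.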
Should you scale-up, or out? Business Suite
The answer for the SAP Business Suite is simple right now: you have to scale-up. This advice might change in future, but even an 8-socket 6TB system will fit 95% of SAP customers, and the biggest Business Suite installations in the world can fit in an SGI 32-socket system with 24TB, and that’s before considering Simple Finance or Data Aging, both of which decrease memory footprint dramatically.
My advice is to conduct a sizing exercise to decide what size you need today, and to buy this size (assuming you are a mature customer and not greenfield). It’s not necessary in most cases to worry about RAM to expand into, because you will naturally undertake optimization projects, which will reduce your memory footprint as you grow.
Should you scale-up, or out? BW and Analytics
This is a more subtle question. My advice is to scale-up first before considering scale-out. With scale-up, you don’t have any of the expense of GPFS or a SAN, and none of the complexity of managing a cluster.
With BW, you also have the option of the IQ NearLine store, where you can store cold data at very low cost. You should consider implementing BW NLS before considering scale-out, because it is much more cost-effective and will increase HANA performance. With HANA SPS09 there is also Dynamic Tiering for BW, which allows PSA data to be persisted on a warm store, further reducing HANA footprint.
In addition, there is a new feature called the Inverted Index in HANA SPS09, which shrinks tables by up to 40%. This isn’t supported for BW yet (no doubt it will come in a future patch), but it is for data-mart scenarios. In BW 7.4 SP08, there are further features to migrate row-oriented tables to the column store, further reducing footprint.
SAP are continuing to invest in ways to reduce the HANA memory footprint – better to keep on top of these than to scale-out.
If, given all of this, you need BW or Analytics greater than 2TB, then you should scale-out. BW scale-out works extremely well, and scales exceptionally well – better than 16-socket or 32-socket scale-up systems even. Just remember that a 2TB scale-up system can be bought for $100k, but a 4TB (2TB usable) BW system costs $500k, so your costs will increase.
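A quick Python sketch of the cost comparison, using the ballpark prices quoted above (they are rough figures, not quotes from any vendor). The function name is my own.

```python
def cost_per_usable_tb(price_usd: float, usable_tb: float) -> float:
    """Price divided by the capacity you can actually store data in."""
    return price_usd / usable_tb


scale_up = cost_per_usable_tb(100_000, 2)   # 2TB scale-up, all of it usable
scale_out = cost_per_usable_tb(500_000, 2)  # 4TB scale-out BW system, 2TB usable

print(f"scale-up:  ${scale_up:,.0f} per usable TB")
print(f"scale-out: ${scale_out:,.0f} per usable TB")
print(f"scale-out costs {scale_out / scale_up:g}x as much per usable TB")
```

On those figures the smallest scale-out system costs about five times as much per usable TB, although the gap narrows as you add worker nodes, since the spare and master overhead is fixed.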
Don’t consider any of the > 8-socket systems for BW or Analytics, because NUMA overhead is already in effect at 8 sockets (you lose 10-12% of power, or thereabouts). With 16 and 32 sockets, this is amplified slightly, and whilst this is acceptable for Business Suite, it is not necessary for BW.
Does the Hardware Vendor matter?
Various details of the different hardware vendor offerings should have come out in this blog – there are pros and cons to all the hardware vendors, especially for large Business Suite on HANA systems.
That said, especially for SAP BW, we have used all of the hardware vendors, and all of them work great, when set up correctly. The biggest variance we have seen is the quality of implementation by services professionals. A badly designed and maintained HANA system won’t work well! In days past, the hardware vendors might not install them correctly, but that doesn’t happen often any more.
For ultra-high-end use cases, there are specific things that can be done (better networking, SSD storage) and HANA appliances that perform extraordinarily well can be built using Tailored Datacenter Integration, but for standard use cases (1-10 billion rows, 1-10TB), there is no need for this.
If you don’t need more than 2TB in the short to medium term, then don’t buy scale-out. Instead, buy a scale-up system that will meet your requirements today, and replace it with scale-out later on if you need it. The money you will save on your balance sheet for depreciation of the scale-out hardware will pay for the 2TB appliance!
But if you do need > 2TB of HANA, then scale-out is the way forward. It works exceptionally well and you will get near-linear scalability for complex queries.
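The advice in this post boils down to a very small decision rule, sketched here in Python. It is a simplification of my recommendations, assuming the 2TB single-system limit for analytics discussed earlier; the function name and workload labels are made up for illustration, and it ignores footprint-reduction options such as NLS, which you should look at first.

```python
def recommend(workload: str, required_tb: float) -> str:
    """Very rough scale-up vs scale-out recommendation.

    workload: "business_suite", "bw" or "analytics" (illustrative labels).
    """
    if workload == "business_suite":
        # Business Suite has to scale up, whatever the size.
        return "scale-up"
    if workload in ("bw", "analytics"):
        # Scale up first; only go scale-out beyond 2TB.
        return "scale-up" if required_tb <= 2 else "scale-out"
    raise ValueError(f"unknown workload: {workload}")


print(recommend("bw", 1.5))
print(recommend("bw", 6))
print(recommend("business_suite", 12))
```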
HANA Product Management have issued a HANA SPS09 Scalability document, which is worth a read.
I hope this moves your thinking along. Do feel free to ask any questions below!