Who says HANA doesn’t work fast for OLTP?
Sometimes people think that because HANA is a columnar database, it doesn’t run fast for simple OLTP operations. I was just looking at a performance problem with class /IWBEP/CL_MGW_ABS_MODEL, method GET_LAST_MODIFIED.
This had some screwy ABAP, which is incidentally fixed in SAP Note 2023100 (thanks Oliver Rogers for finding the note), and it generated the following SQL:
SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c
FROM “REPOSRC”
WHERE “PROGNAME” LIKE ‘CL_ABAP_COMP_CLASS%’ AND “R3STATE” = ‘A’
ORDER BY “CDAT” ,”UDAT” ,“UTIME”
That SQL is pretty nasty, because it does a wildcard search on a big table. On the non-HANA system it was running in 20 seconds. I did the root cause analysis in the database and found that it was searching the primary clustered index, which was 98% fragmented.
Obviously I rebuilt the index – these are the results.
CPU time = 2750 ms, elapsed time = 2746 ms.
CPU time = 2594 ms, elapsed time = 2605 ms.
CPU time = 2750 ms, elapsed time = 2764 ms.
I realized at this point this was some bad coding, so I found the fix thanks to Oli and we put the change in. That fixed the performance problem.
But then I thought… what happens if you run this bad query on a HANA system? This is just what custom code looks like a lot of the time…
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” LIKE …’
successfully executed in 12 ms 414 µs (server processing time: 11 ms 613 µs)
Fetched 8 row(s) in 0 ms 68 µs (server processing time: 0 ms 0 µs)
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” LIKE …’
successfully executed in 9 ms 778 µs (server processing time: 9 ms 136 µs)
Fetched 8 row(s) in 0 ms 64 µs (server processing time: 0 ms 0 µs)
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” LIKE …’
successfully executed in 12 ms 677 µs (server processing time: 11 ms 830 µs)
Fetched 8 row(s) in 0 ms 56 µs (server processing time: 0 ms 0 µs)
So anyDB is averaging 2705ms, and HANA is averaging 10.86ms, an average speedup of 249x.
You may be saying… OK well that’s for poorly written SQL – what about when it was optimized. Sure, let’s test in that scenario. Here’s the SQL:
SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c
FROM “REPOSRC”
WHERE
“PROGNAME” IN (‘CL_ABAP_COMP_CLASS============CCDEF’, ‘CL_ABAP_COMP_CLASS============CCIMP’, ‘CL_ABAP_COMP_CLASS============CCMAC’, ‘CL_ABAP_COMP_CLASS============CI’, ‘CL_ABAP_COMP_CLASS============CO’, ‘CL_ABAP_COMP_CLASS============CP’, ‘CL_ABAP_COMP_CLASS============CT’, ‘CL_ABAP_COMP_CLASS============CU’)
AND “R3STATE” = ‘A’
ORDER BY “CDAT”, “UDAT”, “UTIME”
So ran it on anyDB, I couldn’t get accurate results from the SQL console so I had to use the ABAP trace to get the numbers. They were 5.504ms, 1.484ms, 4.605ms for an average of 3.86ms. Let’s see how HANA compares.
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” IN …’
successfully executed in 1 ms 977 µs (server processing time: 1 ms 156 µs)
Fetched 8 row(s) in 0 ms 63 µs (server processing time: 0 ms 0 µs)
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” IN …’
successfully executed in 1 ms 946 µs (server processing time: 1 ms 250 µs)
Fetched 8 row(s) in 0 ms 60 µs (server processing time: 0 ms 0 µs)
Statement ‘SELECT “PROGNAME”, “CDAT” AS c ,”UDAT” AS c ,”UTIME” AS c FROM “REPOSRC” WHERE “PROGNAME” IN …’
successfully executed in 2 ms 230 µs (server processing time: 1 ms 127 µs)
Fetched 8 row(s) in 0 ms 59 µs (server processing time: 0 ms 0 µs)
With HANA then, we get an average of 1.18ms for an average speedup of 3.27x.
Conclusions
For poorly constructed OLTP queries at the database level, we can get enormous benefits of running HANA – up to 250x or more. With optimized SQL that hits database indexes on anyDB, that drops to around 3.27x, but SAP have only ever claimed a 2-3x increase of running ERP on HANA for transactional workloads.
And remember if you move to the sERP suite, you’ll see another 2-3x because the data structures are simpler. That’s going to mean response times of 5-10x faster than on anyDB.
I don’t know about, you, but that feels significant to me.
Yes, I know I didn’t do concurrency, inserts, updates and all that jazz. This was just a quick test run with 15 minutes of spare time. Hope it is informative. It’s also worth noting that with program changes, I was in this case able to get acceptable performance using anyDB for our timesheet app. The only area where performance is a problem is for the WBS element search, which is a wildcard search again.
For those searches, HANA rocks. For customer, product – anything with a free text search, HANA is going to kill anyDB.
P.S. This was all run on HANA Rev.81
P.P.S Why not run this on your DB and let us know how it runs?
Thanks John,
Interesting blog and significant results indeed.
Hi John,
I understand that claims about speed increase are tested and I am not saying that it is not true, but I find your comparison in this case a little bit incomplete. From what information you provide, only the query is the same. We have no information about HW specifications etc. Moreover, I think it should be compared to its in-memory counterparts, not just any DB.
Yeah, HANA rocks but still, from this information, you can not pretend anything at all.
This is a little ditty and not a large-scale comparison. I'd sure like to do such a thing, but only had 15 minutes to spare and thought the results were interesting enough to write about.
The hardware was certainly different. The MSSQL has a much faster CPU (Ivy Bridge vs Westmere) but HANA has a lot more cores and RAM.
250x is still significant whatever the hardware.
Good information. I'm also starting to get into the OLTP read comparisons via some work with the HANA Application Accelerator. I am doing a lot of nitty gritty performance comparisons and am finding a lot of interesting results.
For example, in the pricing area where condition technique is key - there are a lot of non-key, non-index searches happening. In those cases, I am seeing HANA outperform anyDB by a minimum of 2x, with no code changes required.
I would agree with the conclusion that well constructed code should perform similarly on both systems, but that is rarely the case, even with standard code. @Shkelzen - rarely will you have an ERP system running on a non-HANA in-memory DB (do those even exist? 🙂 ), so think John was just trying to show a quick and dirty example. Maybe if you ar interested you can do some detailed head to head testing and report back some results?
Regards,
Justin
Hi Justin,
I know that there is no certified ERP system running on a non-HANA in-memory DB (some people claim those exist 😈 ).
Just to be clear, once more, I enjoy reading John and I am not pretending that what he claims is wrong.
We had a BW customer who put anyDB on a HANA-specification box. They believed that having a cache big enough for the whole DB would mean that anyDB performed the same as HANA.
8 months later, they migrated from anyDB to HANA and saw 30-100x increase in BW performance. Of course range selects are the jewel in HANA's crown, so BW sees incredible increases.
What surprised me was I didn't expect to see 250x increase in a metadata query.
Justin, I reckon it would be possible to use BLU shadow tables in ERP to test performance of anyIMDB... but I haven't had time to do this yet.
Hi John,
If SAP HANA is so fast with OLTP, why has SAP yet to publish benchmark results for it's own SD benchmark - http://global.sap.com/solutions/benchmark/sd2tier.epx? It's getting old to hear that SAP HANA is "different" and enables a completely different way to thing about databases and that the old benchmarks no longer matter. I'd love to see a heads up OLTP type benchmark run where you can really contrast OLTP performance in a reproducible environment.
If you could provide the data and code environment, I'd love to run SAP HANA against say SQL Server 2014 In-Memory OLTP on an identical AWS configuration. 🙂
Also, SAP HANA really needs to take a look at optimizing stored procedure execution. During some testing I did on build 76, stored procedures executed almost twice as slow as running the query that was part of the stored procedure. I had to get rid of all my stored procedures to get my app to run at top speed.
Regards,
Bill
Bill, aside from the benchmarking comment, I think it would be enlightening to set up a performance test between a recent HANA version and the IMDB SQL version you mention. I can help with the data and HANA side of things, what do you say?
Just to clarify, we would agree upon some OLTP type and analytic type queries and run them directly on the respective db. Thus we can remove noise from the app server side.
Regards,
Justin
Hi Justin,
As John mentioned below, publishing benchmarks - even in a blog like this could land someone it hot water, but you can bet the vendors are always internally comparing each other. They just can't publish results. This is why you have cross vendor "standards" like TPC and the SAP SD benchmark which require careful reviews of the results. Just as John suggests, SAP isn't ready to throw HANA into the fray. At least SAP Sybase ASE shows up to the table.
Regards,
Bill
You know, you get no-nonsense from me Bill.
I've not run the benchmark but I believe it's because:
1) SD doesn't run well on HANA
2) SD doesn't accurately represent how customers actually use systems in 2014
3) HANA does run well for how customers really use systems in 2014
SAP are in the process of creating a new benchmark which I understand will include mixed-workload OLTAP queries.
There is a BW-EML benchmark which does mimic how customers use OLAP systems in 2014. Unfortunately it has some design flaws which mean it can be gamed (for instance you can run loads on one area and queries on the other, rather than forcing both at the same time).
Benchmarking is much better done on real customer data with real scenarios. I have done a lot of it, but none of it is possible to share in the public domain.
Also, the license agreements with Microsoft/IBM/Oracle/SAP specifically don't allow benchmarking. That's why I referred to the database above as anyDB, so it can't be recognized.
Hi John,
I get it - I'm in the same business as you. 🙂 Using anyDB gets risky, but I wanted to take you up on your P.S.S. to run this on an "anyDB" and see what the results look like. (still would too).
Regarding your customer use of systems in 2014, we should all be running MongoDB or Cassandra if you believe all the schema-less (flexible schema) vendors.
I totally agree with you that benchmarking customer data with real scenarios is best. That's the great thing about vendor POC shoot outs. Too bad the results are seldom published.
Regards,
Bill
Right... I have a whole library of these from all sides in my document 🙂
The problem with PoCs is they never compare apples to apples. What a good PoC does is to validate a set of assertions, that are required to progress from a technology or business standpoint.
Trying to do a truly fair PoC is nearly impossible. In one recent one, we found HANA was 80-120x faster than anyDB for complex queries. Someone (rightly) pointed out that the HANA hardware was more powerful, so how could we compare.
You certainly can't say that because the HANA box was 2x as powerful, we should renormalize by 2x. HANA may not be 2x slower with half the hardware, and anyDB may not be 2x faster with double the hardware.
My point... you can't compare apples to apples, but however you look at it, HANA is incredible.
Sorry, I missed what you requested. It's REPOSRC... you can use any NetWeaver 7.4-based ABAP DB and run the SQL above.
Hi John,
I did test the query with a AnySql DB from Redwood on 7.40 Abap stack, as expected the results matches the one you get.
Hana is around 4 time faster... but the box I'm running the test on is a cheap basic Windows virtualized system with 32G of RAM, 4 Vcpu and a Iscsi attached SAN.
Would be interesting to know on which hardware you did run your test.
I guess it is at least 4 times more expensive...
I fully agree it's difficult to compare, but for that kind of simple query I'm afraid that the performance gain won't match the cost.
Further more the DB time is supposed to be only 40% of the response time on an ERP system.
Best regards
SELECT PROGNAME, UDAT, UTIME FROM SAPSR3.REPOSRC WHERE "PROGNAME" LIKE 'CL_ABAP_COMP_CLASS%' AND "R3STATE" = 'A' ORDER BY "CDAT","UDAT" ,"UTIME";
Elapsed: 00:00:09.98
SELECT /*+ INDEX(REPOSRC "REPOSRC~0") */ PROGNAME, UDAT, UTIME FROM SAPSR3.REPOSRC WHERE "PROGNAME" LIKE 'CL_ABAP_COMP_CLASS%' AND "R3STATE" = 'A' ORDER BY "CDAT", "UDAT", "UTIME";
Elapsed: 00:00:00.31 // 00:00:00.10
SELECT PROGNAME, CDAT, UDAT, UTIME FROM SAPSR3.REPOSRC WHERE "PROGNAME" in ('CL_ABAP_COMP_CLASS============CCDEF', 'CL_ABAP_COMP_CLASS============CCIMP', 'CL_ABAP_COMP_CLASS============CCMAC', 'CL_ABAP_COMP_CLASS============CI', 'CL_ABAP_COMP_CLASS============CO', 'CL_ABAP_COMP_CLASS============CP', 'CL_ABAP_COMP_CLASS============CT', 'CL_ABAP_COMP_CLASS============CU') AND "R3STATE" = 'A' ORDER BY "CDAT", "UDAT", "UTIME";
Elapsed: 00:00:00.04 // 00:00:00.01
Good article.. thanks for sharing