Fast Is Not A Number
Let me preface this post with a few disclaimers. I’m an SAP Mentor, an SCN forum moderator, an ASUG volunteer, and I work for a company that runs SAP. What follows is my personal opinion, or rant if you will, and represents none of the preceding responsibilities. I’m no SAP HANA expert, though I do have experience in capacity planning at the enterprise level. When the Business Suite on HANA was recently announced, my question was: what are the numbers? Before I evolve that question further, some background on benchmarking, and on the hype that surrounds new technology.
Measurements and comparisons occur constantly. When I was growing up, it was the era of “muscle cars” and the benchmark number for automobiles was “Zero to 60”. How fast could a vehicle go from a dead stop to what was then the general interstate highway speed limit? The implication was that the faster your car could accelerate, the more easily and safely you could maneuver in traffic. But it was really a child’s game (“My car is faster than your car. Neener neener.“). Then the first gas crisis hit, and the benchmark number was MPG – how far does your vehicle travel with a finite amount of fuel? To get back to HANA for just a second, if MPG is the metric, would it make sense to ask what the MPG of an electric vehicle is? Probably not, though distance traveled per money spent, or something like that, would certainly allow comparisons of varied components.
The first computer benchmarking tests I recall learning about were back in the day of the Radio Shack TRS-80: if not the first computer anyone could purchase, then close to it. I had one (before 1980) and learned to program in BASIC on it (though I had learned FORTRAN way before that). I was at a computer club meeting, and yes, there were so few of us we needed to form clubs. One of the members asked how long the BASIC interpreter took to dimension variables. I recall being a bit perplexed by the question; I had thought this operation was, well, so fast as to be instantaneous. But, as it turned out, it wasn’t instant: allocating a lot of memory was quite time-consuming. The classic statement (including required line numbers) would look like:
20 DIM H(35)
(Page 37: http://bitsavers.trailing-edge.com/pdf/dartmouth/BASIC_4th_Edition_Jan68.pdf )
Adding more memory to a program than was needed would add time to the program execution. A simple benchmark program of one dimension (DIM) statement would indicate how quickly the processor, using the instruction set provided, would do a certain amount of work. As I recall, the scalability was linear: if it took 10 seconds to allocate 100 array entries, it would take 100 seconds for 1,000 entries.
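That computer-club experiment is easy to re-create today. The sketch below (in Python rather than BASIC, purely for convenience; none of this code comes from the original machine) times how long it takes to allocate arrays of increasing size, which is essentially what DIM did. On a modern interpreter the cost should still scale roughly linearly: ten times the entries, roughly ten times the time.

```python
import time

def time_allocation(n):
    """Time the allocation of an n-element numeric array,
    a modern analogue of BASIC's DIM statement."""
    start = time.perf_counter()
    arr = [0.0] * n  # allocate and zero-fill n slots
    elapsed = time.perf_counter() - start
    assert len(arr) == n
    return elapsed

# Allocation cost should grow roughly linearly with n.
for n in (100_000, 1_000_000, 10_000_000):
    print(f"{n:>10,} entries: {time_allocation(n):.4f} s")
```

Timings jitter on a multitasking OS, so a real benchmark would repeat each measurement several times and take the minimum.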
I won’t bore the reader with more “tales from the old days”. I’ll just say I’ve collected, run, and recorded many benchmark programs, from the Sieve of Eratosthenes to ABAP counters (see: SLEEPing on the JOB). Sometimes what I find is just what the literature claims (yup, it goes from 0 to 60 in 7 seconds, or it gets 40 miles per gallon), and sometimes I find things that the account rep left out of their pitch.
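For readers who never met it, the Sieve of Eratosthenes makes a good micro-benchmark precisely because it is tiny, CPU-bound, and trivially portable between languages. A minimal Python version (a generic sketch, not the code from any particular benchmark suite) looks like this:

```python
def sieve(limit):
    """Sieve of Eratosthenes: return all primes up to and including limit."""
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            # Mark every multiple of p from p*p upward as composite.
            is_prime[p * p :: p] = [False] * len(range(p * p, limit + 1, p))
    return [n for n, prime in enumerate(is_prime) if prime]

print(sieve(30))  # → [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Timing repeated calls at increasing limits gives a crude but repeatable measure of raw interpreter or compiler speed, which is exactly how such toy benchmarks were used back then.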
What I can’t run is an SD benchmark. Presumably, nearly everyone in the SAP world knows what this is, and what SAP’s ratings are. If you don’t, take a gander at http://www.sap.com/solutions/benchmark . The earliest results on the two-tier benchmark page, from 1996, show 12,000 to 76,000 fully processed line items per hour. For nearly 20 years now, these tests have shown what improving hardware and software can do. Results are still being certified, and as I write this, the most recent values are from a week ago; the champion at order processing clocked in at nearly 14 million line items per hour. By my watch, that is an increase of 3 orders of magnitude.
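That arithmetic is easy to verify. Using the two figures quoted above (the 1996 low end and the recent record, both from the benchmark pages), a quick calculation confirms the roughly thousand-fold, i.e. three-orders-of-magnitude, improvement:

```python
import math

early = 12_000        # line items/hour, low end of the 1996 two-tier results
recent = 14_000_000   # line items/hour, recent record cited above

ratio = recent / early
print(f"improvement: about {ratio:,.0f}x, "
      f"or {math.log10(ratio):.1f} orders of magnitude")
# prints: improvement: about 1,167x, or 3.1 orders of magnitude
```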
The reason I can’t run this test is that a benchmark test result, I’ve been told, could budget out at close to $1,000,000, counting hardware put on the shelf instead of shipping to a customer, and, primarily, the expert(s) time to run iterations in the prescribed manner to obtain official SAP sign-off. What you can’t see on the SD benchmark certifications are the tricks used to increase the numbers, allowed under the rules, but not something a typical customer could or should do.
I have asked for SD benchmark numbers for “Suite on HANA” since it was announced, in private, and in public:
Does anyone have data from an SAP SD Benchmark for HANA (now that it’s all running there, right)? See: http://www.sap.com/solutions/benchmark/sd2tier.epx … https://twitter.com/jspath55/status/291939620659269632
I’ve gotten pushback, or no answer. I am a little surprised, though I guess I shouldn’t be.
Here’s what I think may be the reasons this classic benchmark hasn’t been made public:
- The test results are worse than most existing hardware and database combinations.
- The thing just doesn’t scale the way it should.
- The mantra is “it’s really really fast” and you don’t need to put a number on “fast”.
I can wait for the hardware vendors to get their performance tuning teams all over this, like the pit crews at car races. In the meantime, here is why I think the numbers should be available:
- As it says in the blog title, “fast is not a number”. However the platform performs, slap a (repeatable) number on it.
- People who do capacity planning know one number doesn’t tell the story, any more than a miles per gallon rating on a car makes sense if you don’t fit behind the wheel.
- Though I’ve been asked “what do I expect the numbers to be?”, any scientist worth their pocket protector would tell you the data are the data. I don’t expect the results to be high, low, or exactly the same as an existing “legacy” R/3 system. Test it and share the results.
- SMART – measurable, repeatable, etc. If the number today is 2 million line items per hour, that gives the hardware and software mechanics a plateau from which to climb upwards.
Hopefully, some numbers (other than the BW-like ones that don’t matter to a transaction system) will be forthcoming.
References:
- The Transaction Processing Performance Council – http://www.tpc.org/
- The “Byte” benchmark – http://code.google.com/p/byte-unixbench/
- Stream benchmark – http://www.cs.virginia.edu/stream/
- HBench-OS – http://www.eecs.harvard.edu/~vino/perf/hbench/
- Whetstone – http://www.roylongbottom.org.uk/whetstone.htm
And:
http://www.sap.com/campaigns/benchmark/appbm_bweml.epx
(Note this benchmark has one result only)
(image below linked from the sap.com/benchmark site – I liked the juxtaposition of charts and roadside billboard)
Fair enough, Jim. In fact, I completely agree on the point of view taken in your post and also on the demand for comparable performance metrics.
Personally, what bothers me even more is the (seemingly) willingness of so many HANA-involved people to forget about what may be called "good practices of software creation".
Test suites for the new development environment? Nope.
Actual proofs of correctness? Nope.
Half-way intelligent IDEs that do more than syntax highlighting or keyword proposals? Sadly: nope.
And nobody seems to care... 😯 .
On the plus side, many shortcomings have been fixed over the last year (e.g. the documentation has been improved a great deal), and a lot of improvements are already "in the pipe" (to use sales slang for once).
As always, your post was well worth the time to read - thanks for that.
- Lars
Hi Lars,
When I saw your list of bullet points, the first thing I thought was "HANA - Excel for the Enterprise"....
Seriously, I would think CIOs etc. will want to see benchmarks of various vendors' appliances and their configurations before buying HANA, to calculate the best bang for their buck.
hth
Ok - let's be a bit fairer here again... the points I mentioned are unfortunately true in general for DB-centric development on all the major DBMS platforms.
So, it's not just HANA, but that doesn't improve the situation, does it?
- Lars
Hi Lars,
I agree with your thoughts. However, it doesn't seem fair to compare HANA with DBs developed 20+ years ago. Those DB vendors developed their products using the best practices of s/w creation available then.
Today, in order to woo developers, HANA should, in my opinion, be comparable to other products developed recently in the market.
Remember, old DB products already have more developers than there is demand for, whereas one of the SAP-HANA team's objectives, as I understand it, is to make the SAP-HANA environment more developer-friendly.
Best regards,
Bala
Hi Bala
Not sure where we missed each other's point...
Anyhow: the development environments of the main DBMS vendors surely are not twenty years old.
All of these products have had very recent and heavy development cycles.
So, it's definitely fair to say: this is the state of things as of today.
And the fact that HANA is new in the market doesn't really affect my argument.
What I'd like to see is database-centric development treated as seriously as all other development approaches are.
Where are the wide-spread uses of schema-evolution mechanisms on the DB level? ABAP with its DDIC has had that for ages.
Where are the code analyzers that point out where a common mistake has been made? Standard in Java and C++ for years.
It's not difficult to extend this list, but the bottom line is: database development has always been something of a stepchild. It worked somehow, but rarely well, and was very often pretty homegrown (anybody have an easy click-and-go Git-to-database development integration?).
- Lars
Martin,
Excel for the Enterprise sounds so much better than Apple for... Excel is already on everyone's desktops (yes, there are such machines still in use), laptops, and now even on WinPhones. Apple stays at home, goes to some meetings, and that's about it.
For the topic at hand, Jim raises excellent points, and I tend to think that hard numbers are both the most convincing and the most disputed points of customer satisfaction. I'm afraid we will end up comparing apples and oranges, as there are no two installations alike with all the hardware, network, software, and more software.
So, I'm still waiting for my BSEG-in-Excel benchmark.
Regards,
greg
I think my first, less than serious, point has touched a few nerves 🙂
FWIW, I was not thinking about HANA as a database when I wrote that line, I was thinking about BI and / or BW (on HANA or any other platform).
hth
Lars Breddemann and Bala Prabahar - First, thanks Lars for the feedback that you enjoy reading my posts. Besides getting something "off my chest", entertaining and/or engaging the audience are my target objectives. I can't match Thorsten Franz for his eloquence and depth, but I think I'm in a different plane anyway.
"It doesn't seem fair to compare HANA with [...]" - I disagree. I can compare it with just about any platform, and point out similarities and differences. When the marketing hyperspace engines seem to say HANA can do anything, it's fair game to bring up salient arguments.
Development platforms are certainly key to making a platform enterprise ready. My friends in the security world are asking similar questions, as are those in the change control (or version control) business. I'm limiting this current argument to a straightforward request for benchmark numbers that I could not run myself. If I had lots of time, I'd be getting access to a system and running a bunch of legacy code. Instead, I'm expecting the hardware wizards to do that.
Hi Jim,
I would like to limit this conversation to benchmark numbers, except to share a quick thought on "pointing out similarities & differences." HANA and other, disk-based RDBMSs are different. HANA recommends running the business logic in the DB, whereas disk-based RDBMSs recommend running it in the application layer. And SAP provides a solid development environment (ABAP) in the application layer for disk-based RDBMSs. We'll do more in the HANA DB layer than in traditional databases. That requires a better development environment in HANA than in other databases.
Best regards,
Bala
I stumbled across this, and while an old post, it is also spectacularly wrong yet uncorrected.
Using stored procedures and functions instead of chatty client-server interaction will speed up any RDBMS, often by orders of magnitude. This news is 20 years old.
The sole difference here is that SAP only offer this approach for HANA. That is nothing to do with rival database capabilities, so it is clearly about competitive advantage for SAP.
What about scalability?
What about it? If you implement stored procedures and functions in row-store databases, they will likely scale better than HANA for OLTP workloads and equivalent resources. For analytic workloads, a column store would have the clear advantage.
PV
P - I wanted to clarify a few things.
When SAP first designed R/3, they decided that they wanted to be hardware, OS and database agnostic. The market was moving quickly and they wanted to offer customers choice and the ability to move.
In the interim 20 years, databases have become more sophisticated and the stored procedure languages have become more sophisticated. They provide an incremental benefit compared to application server processing, but SAP still has not supported SPs extensively, because they need to be written for each RDBMS.
Along came HANA. For the right process, if you push down into a SP, you get 100x, 1,000x, even 10,000x faster operation: 2, 3 or even 4 orders of magnitude. This allows radical changes to processes.
So, SAP started to write SPs for HANA, to support the sort of new functions which are available. They have committed to allow these to run on other databases, if they can support it, and in fact there is a DBSL abstraction layer in Kernel 7.41 which allows exactly this - see ABAP Language News for Release 7.40, SP05.
Forget about row, column, OLTP and OLAP and think about how businesses operate - they don't make such discriminations, and neither does HANA 🙂
Jim,
Good post but I think that we need to consider a couple of things in order to be fair to all.
- Suite on HANA for the moment is maxed out at a 4 TB system, since Suite on HANA does not support scale-out solutions yet.
- I would say it is almost certain that the performance differences in a benchmark will not be too significant, since all vendors have to build their hardware to SAP specs (same processors, same clocking, same amount of memory per processor/core, etc.).
- Differences will probably show up more in real-life scenarios where you do not do only one sort of operation but many different ones in parallel. Benchmarks are on the same level of truth as the "MPG" measurements of the car manufacturers; I have never seen anybody achieve the MPG stated in the specs....
What indeed will be very, very interesting to see is a comparison of Suite on HANA vs. a "traditional" server with the same number of CPU cores as the corresponding HANA machine, but running on storage appliances like Texas Memory or Violin Memory.
It all remains to be seen. The good news is there is something really new and exciting every day!
Carsten Nitschke
Well, I could claim that I do, but I'd need to prove it. As I said in my post, I am aware of certain "tricks" that are allowed to meet the SD benchmark (say, a huge number of interface cards) that real customers would hardly be able to duplicate. But, as with MPG, higher ratings versus lower ratings tend to manifest in actual use, so my hybrid car gets better mileage than my previous non-hybrid.
I find it extremely difficult to believe sales pitches with no real-world numbers. I'd not only want to see the comparative literature, I'd want to take the thing for a test drive. If I get great mileage, but can't merge onto the highway, I'll pass.
Jim,
Real-life figures in a sales pitch? Hmmm, it reminds me of the famous Yodobashi case where SAP HANA increased performance by 100k times. Not sure how realistic that is; it might be similar to comparing a Trabant (2-cylinder, 2-stroke engine, reaching 0-100 km/h only with a tailwind and some luck) to a Bugatti Veyron.
I have seen some real-life examples in tests with customers where the actual performance increase on BW was more in the 20-80x range than anywhere else. If that helps.
We will see where real life takes us, and also whether the value perceived by the customers is the value that we are positioning. Speed, in the case of Suite on HANA, might not be the fundamental value. In that case I am with John: the value will be more in the fact that you can do real-time analytics on the data while it is being processed, which in fact is a very important value driver and very different from today's world.
Liking analogies:
- A Porsche Cayenne is a fast car, but made more for offroad driving and traveling.
- A Porsche 911 Turbo is a very fast car, great for the racetrack, but not comfortable for traveling and not really made for offroad.
What I want to say is: it is difficult to have just one medicine to cure all diseases.
Hi Jim,
> The test results are worse than most existing hardware and database combinations.
Well, I would not call it "worse"... I would call it "not as advertised".
I think this blog is pretty meaningful and states it well:
http://scn.sap.com/community/hana-in-memory/blog/2013/01/13/and-pushdown-for-all-news-about-abaps-adventurous-relationship-with-the-database
The marketing machine failed 😉 ... and the mentioned future solutions are just smoke and mirrors (in my opinion). I am not quite sure about database platforms other than Oracle, but it runs much faster on non-HANA database platforms as well (e.g. just think about PL/SQL commit-time optimization when processing large amounts of data, or the PL/SQL array interface) if you shift some application parts/logic to the database layer.
OLTP workload is completely different from OLAP, but SAP has announced HANA as the egg-laying wool-milk-sow.
I am looking forward to seeing some real-life implementations of HANA and not just the usual "slide-polished stuff". Straightforward implementation stories, and customers with real rather than "SAP-tuned" comparisons, would be much appreciated.
Regards
Stefan
Stefan - Ah, an "Eierlegendewollmilchsau", or, what I would call a "Shmoo (R)":
http://www.lil-abner.com/shmoo.html
😀
Jim
I know I was one of the ones that pushed back on you, for the following reasons:
- SoH enables things which aren't possible on other DB platforms
- SoH has the ability to do in-line real-time analytics on any measure which you can't do on any other DB platform
So in short, doing the SD benchmark will definitely show SoH at its worst. Of all the things it improves, SD will get the least because it measures pure transactional performance.
But with that said, the SD results tell you some important stuff and there are some limitations of HANA which might (or might not) be problematic. The biggest of these is that a single-node system has a maximum of 80 cores or 133k SAPS. The biggest IBM p-Series mainframe can crank out about 1m SAPS.
With HANA you can scale out to multiple nodes, but I don't think this will improve SD performance - it only improves performance when you can parallelize a query, and for small queries, adding nodes adds latency.
So, at least in my theory, HANA has an upper bound on SD database performance set by the maximum number of cores in a single node - about 25k benchmark SD users, or 2.5m line items an hour, using a regular RDBMS on the same hardware.
That said, in the few 3-tier benchmarks that exist (and HANA is always 3-tier), there are typically 10-20 application servers attached to one database server. Assuming the same is true for HANA, this SD write performance may not be an issue in practice.
Anyhow my point is this: HANA's value proposition needs to be "we're as good as anyone else at SD, and we do lots of really great stuff they can't do as well".
And that's why I think you have a point: the SD benchmark matters, because customers need to know it's at least as good as what they have for the basics.
John:
Thanks for your feedback, expanding on prior terse messages. You're definitely more the expert on HANA than I, and your extrapolations of existing data points to likely benchmark results are about what I would expect.
I can't argue with that (other than to say "things aren't a number" either). SAP is throwing many of these capabilities around. I would expect this to be true, but for a decision maker, it will be challenging to put those into the typical scoring matrix I've seen ("Enter 1 to 5 for the relative ease of transforming lead into gold." :- ).
I can't argue with that either, and welcome the chance to hyper-accelerate many edge-case reports that I know about. There are at least two "buts" here though (so I guess I can argue with it): one being "but we already have BW on HANA to do that" and the other "but what will all those new reports do to my transactions?". I know the answer to the latter is likely to be "not much"; I must play devil's advocate to prevent a revenue system from becoming a boat anchor.
Once more, that would be a valid concern for bigger customers, and for those customers who dream of being big (which should be all of them). NUMA (see http://en.wikipedia.org/wiki/Non-Uniform_Memory_Access ) has been a valid concern for years. Hopefully, someone is working on metrics showing how much latency there is on specific reports or transactions, for scalability projections.
Maybe, maybe not. I'd hate to assume "no problem" and then be "that guy that ruined the cake" when practice doesn't match theory. My main point in this post is that we expect to see performance numbers, just as we expect to see acceleration and mileage numbers for a vehicle evaluation. Whether or not missing numbers should make one suspicious, they do make for a blank spot on any technical evaluation.
Jim / Community - On behalf of SAP, let me comment on a few things. You raise a very fair ask. As you can imagine, we have been working flat out for the past year and were waiting for ramp-up to define and do this work. The results we have seen internally, not only for the SD benchmark but in general across all 23 processes, range from lower than a standard database, to comparable, to dramatically better, and it all depends on the optimizations done and the type of workload. Also, as Bala, John and others have stated, both the SD benchmark and older database architectures are dated, and we need to evaluate whether they serve the modern customer's requirements in the best way. We want to engage with the mentors/community over the next month to scope the work needed here, and then publish the benchmark results as we exit the ramp-up phase.
Amit Sinha - Thank you for responding, on behalf of SAP. I know I am asking tough, and maybe tricky questions, and I respect your desire to communicate what may be a difficult answer to a difficult question. One can only speculate the possible replies others may wish to make, and the non-answers others may propose.
As the fictional detective Hercule Poirot was wont to say: "Fooey". The SD benchmark measures fully processed line items per hour. Perhaps some business models in enterprise software don't sell many items, but a majority do, I say. Measuring the ability to perform a basic business function is, well, basic.
Saying older databases are dated is tautological, and beside the point. We want to see measurements of business transaction speed, not the rapidity of archive log switches (though I also want to know about recoverability, but that's for another blog post).
None of those are numbers, either. How much lower? How much better?
I like to think of myself as modern, and while I posted this blog on my own initiative, I would think my company would want to know answers to the question I pose.
Let me digress into software architecture briefly. Having worked with R/3 for many years, I know about, but cannot profess to fully understand, the Verbucher process, translated literally from German as "bookkeeper" but more loosely as "update program." The operation of the VB tables has not fundamentally changed in that time, probably not for a long time before that, and, I am guessing, won't change in Business Suite on HANA. I say this due to the deep embedding of the software model in all aspects of the millions of lines of ABAP code that support the successors to R/3.
4.6C
http://help.sap.com/saphelp_46c/helpdata/DE/e5/de872635cd11d3acb00000e83539c3/frameset.htm
(image linked from SAP online documentation in help.sap.com)
English version:
http://help.sap.com/saphelp_bw30a/helpdata/en/e5/de870535cd11d3acb00000e83539c3/frameset.htm
http://help.sap.com/saphelp_nw04/helpdata/en/e5/de870535cd11d3acb00000e83539c3/content.htm
Update requests, which form the key to logical units of work, are part of many SAP ABAP stack systems.
The point of illustrating the software's logical units of work is to show that whatever platform the code runs on, it's highly probable that the fundamental building blocks of the underlying logic flow are not going to change, and that tests of that software comparing transaction rates from previous to future environments are sensible.
Customers would want assurance that line items aren't duplicated or lost in a scale-out, ramped-up environment, and would want to look at the numeric results of that growth curve.
Jim
Hi Jim,
I hope you don't mind me responding to your comment addressed to Amit. As stated before, this is a great blog with lovely comments from the members of this community. And I'm learning SAP-HANA. I don't know enough about SAP-HANA to speak with certainty. I thought I'll share my thoughts to your comment based on what I know.
Not sure I agree with this. I'll give you an analogy:
In the '90s, when I wanted to buy tickets to go to India (probably true for domestic travel too), I would call a travel agent and tell him/her the details of my trip (travel dates, names, from/to cities, etc.). She would check her DB and offer a few options and the cost. I didn't know what was possible or not possible. Either my choices were limited or I didn't know what to ask. So the transaction used to take 2-3 hours. Later I would receive my tickets by mail.
Fast forward to 2013: the time it takes to finalize my travel plans is longer than it used to be in the '90s. This is due to the number of choices I have. I spend several days on the internet looking not only for the best deals but also for ways to break my journey somewhere in the middle (Europe via the east coast, or Japan/California via the west coast). Again, I'm not just looking for places of interest but for cost-effective deals too. I'm not relying on my travel agent to tell me what's possible or what's not; I can't even explain to my travel agent exactly what I'm looking for. I truly feel empowered by what the internet offers; as a side effect, buying my tickets takes much longer. Sure, at a very low level, the logic flow of buying a ticket has not changed: 1) decide travel dates/passengers/places, etc., 2) provide travel details, 3) find out how much it will cost, 4) pay, and 5) receive the ticket.
However, how I decide (step 1) has changed, so the whole transaction takes longer now.
I know you're talking in terms of transaction speed by a computer using SAP-HANA. All I'm trying to say is that we shouldn't just be looking at transaction speed but at what else SAP-HANA offers. Maybe there is no difference in transaction speed, but what if SAP-HANA empowers business people by providing data "when they want it & how they want it" (is this marketing hype?). I'm sure you would agree that when we try to implement something new, we should look at everything (yes, using numbers, and sometimes user experiences) that the solution offers. Just as you said, sure, SAP-HANA may be very fast, but if line items are duplicated or lost, does speed matter? Similarly, for the sake of argument, let us say SAP-HANA is not faster than disk-based databases, but what if it "empowers" (not a number) business people to do their jobs more efficiently?
For the reasons I stated, I agree with Amit's statement. I've been hearing a lot about SAP-HANA. Some of that could be marketing, some could be facts now, some could become real in the near future, etc.
What does "modern customer requirement" mean? I don't know what Amit meant, but in my world, the "internet travel planner" is the modern one. Similarly, is SAP-HANA going to change customer requirements? Based on what I hear/read, yes, it is going to change them. How? No one probably knows (if my interpretation of Amit's "we need to evaluate" is correct).
Best regards,
Bala
Good blog post, Jim. The point is valid: the AP/AR/SD/SRM users will want to know whether their daily transactions and batch uploads will move faster and whether the financial close will be quicker. I used to be involved in getting hardware certified by SAP. These days, like you, I am involved in purchasing hardware for running SAP. The SAPS number is the basis for sizing an environment for transactional ECC as well as for analytics in BW. This is especially true in a migration to a different platform. The sizing covers CPU, memory, and the I/O subsystem, with the choice of database as a data point. Benchmarks for flash memory would fall into this category. HANA seems different.
If SoH is a product materially different, in its underlying code base and the functionality offered, from the traditional Business Suite, then I can see the need to develop different benchmarks. This will become important as the number of vendors offering SoH, and the use cases in different permutations/combinations, increase. So they are coming at the end of ramp-up, then.