Native HANA apps, language containers and HANA Cloud: Part 1 of 2
Yes, I know, it’s a rather ominous title for a blog post, but my goal is to provide a different view on the recent HANA announcements.
Business travel is always a great way to work on a new blog; lots of time spent at airports and in airplanes. Last week, I was at TechEd Las Vegas, where I helped the SAP HANA Startup Focus Program (SFP) answer questions about HANA and presented about HANA as a platform. It’s amazing to see how the SFP has grown over the last few weeks. When we started the program in March, I helped the first batch of 10 companies get ready to demonstrate their HANA-based solutions at SAPphire in Orlando. Since then, I’ve moved back to my role in HANA product management to define platform features for partner and ISV enablement. Kaustav, Ben, Hardik, David and the rest of the team have taken the fire and turned it into a volcano!
From TechEd, I went straight to visit one of our North American customers, one that has trusted SAP with its business for many years. They are just in the process of transforming their business and are evaluating HANA as an Agile Data Mart to quickly discover new patterns in ERP processes. HANA appealed to them because their new business model is such a dramatic shift that it’s essential for them to determine quickly when supply chain, production processes, global transportation, trade compliance and market development are not optimally synchronized. They wanted the flexibility to quickly discover how core metrics in their business were changing during this transformation. Rather than building ERP-based reports, they will instead derive KPI metrics from the core ERP data replicated into HANA to monitor their business. They plan to initially provide HANA’s modeling environment to their business analysts, but will gradually build new agile applications on top of HANA in order to quickly adjust to new business conditions and best practices.
It was a little strange to move so quickly from startups, which are just trying to kick-start their business on HANA, to a very mature enterprise using HANA for an agile business transformation. But it’s pretty damn exciting. At the core is HANA as a platform for a new type of business processes and applications, which brings me to the topic of this blog. There have been many announcements at TechEd regarding HANA as a platform that can be operated on premise as well as in a public or private cloud. I had many people ask me questions, so I decided to provide my own spin on these new technologies and deployment models. It’s obviously work in progress, and I’m sure more will come at TechEd/SAPphire Madrid. But for now, here is my take, which I didn’t try to synchronize with marketing, so we’ll see if I get my hands slapped because I’m not “on message” 😉
My hope is (beyond not getting my hands slapped) to get lots of questions from you, because we will be working on a public slide deck that will explain all this, so your questions will give us a sense of what to focus on.
The announcements were two-fold: First, there was a lot of talk about programming languages and language containers. HANA Application Services comes with JavaScript support, and HANA deployed in the NetWeaver Cloud comes with a Java engine and, as a result, obviously with support for every Java VM-based language like Groovy. Second, Vishal announced support for HANA on Amazon Web Services. In other words, NetWeaver Cloud is both a deployment model as well as a language offering for HANA, whereas AWS stays silent on what programming language to use. That’s not a surprise, because AWS is pure infrastructure, a machine and an operating system in a virtualized environment, whereas NetWeaver Cloud is both infrastructure as well as middleware.
HANA on NetWeaver Cloud also comes with its own Java programming environment based on Eclipse, whereas HANA on AWS offers HANA development tools and suggests HANA Application Services (aka XS) as a programming model – which is, however, not mandatory, because you can certainly install additional middleware on the AWS instance to communicate with HANA instead of using HANA Application Services.
That’s already a lot to digest, but now I want to throw another topic into the mix: What actually is a “native HANA” application? I tried to write about both the language and the cloud infrastructure aspects of HANA in one blog, but decided that it would be too much. As a result, part 1 of this blog will deal with language options and explain the concept of “native” applications, and in part 2, I will focus on explaining the cloud deployment options for HANA.
It is certainly easy to see the point that an application that runs entirely on HANA, and that requires nothing but HANA, is “native”. Effectively, this means that an application using HANA Application Services is by definition “native” – it doesn’t require an external Web server or middleware to connect to a Web browser, because it has a Web server built in. Web requests are dispatched straight to a JavaScript engine, which then lets you build the application logic (make database calls, generate HTML5, etc.).
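To make this flow concrete, here is a rough sketch of what such a request handler could look like. Since the actual XS API documentation isn’t public yet, the `xs` object below is a stub I made up purely to stand in for the real server-provided API – only the flow matters here: request comes in, SQL runs against the engine, HTML5 goes out.

```javascript
// Sketch of the XS request flow. The "xs" object is a made-up stand-in for
// the real (not yet documented) XS API; in a real service, executeQuery
// would hit the HANA engine directly.
var xs = {
  db: {
    executeQuery: function (sql) {
      // Stubbed result; a real XS service would return rows from HANA.
      return [{ REGION: "EMEA", REVENUE: 1250000 }];
    }
  },
  response: {
    contentType: null,
    body: null,
    setBody: function (html) { this.body = html; }
  }
};

// The request handler: application logic lives right next to the data.
function handleRequest() {
  var rows = xs.db.executeQuery(
    "SELECT region, SUM(revenue) AS revenue FROM sales GROUP BY region");
  var html = "<ul>";
  rows.forEach(function (r) {
    html += "<li>" + r.REGION + ": " + r.REVENUE + "</li>";
  });
  html += "</ul>";
  xs.response.contentType = "text/html";
  xs.response.setBody(html);
}

handleRequest();
console.log(xs.response.body); // → <ul><li>EMEA: 1250000</li></ul>
```

Again: the names here are my invention for illustration; the point is that no external middleware sits between the browser and the database.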
However, this Web server/JavaScript infrastructure is technically separate from the actual HANA in-memory database kernel, and there are of course other access methods to the HANA engine than XS, like JDBC and ODBC. SQL code executed in the HANA engine by means of JDBC from a Java middleware runs with the same performance as if it were executed from within XS. There is just one big disadvantage of doing it through Java: Every SQL call through JDBC requires a round trip between the Java middleware and the HANA server, so if larger result sets are copied, performance is lost.
Yet, that doesn’t mean that Java applications will always run slower through JDBC than on XS – it simply means that as a Java application developer, you need to avoid round trips, and the way you do this is by moving as much of the data processing logic as possible to where the data actually sits – into the database, by means of SQL Script or the modeling capabilities of HANA. That has multiple advantages: First, the round trips are avoided, but more importantly, the SQL Script application code that ties multiple SQL statements together is executed as much as possible in parallel on different processor cores of the host computer. Any procedural statements within SQL Script are also compiled into machine code rather than interpreted at runtime, so it’s fast.
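Back-of-the-envelope, the round-trip argument looks like this. All numbers below are invented for illustration, not measurements – the shape of the arithmetic is what matters:

```javascript
// Toy model of why round trips dominate: assume a fixed per-call cost for
// each client<->server JDBC round trip. (All numbers invented.)
var latencyPerCallMs = 1.0;   // cost of one round trip to the HANA server
var rowsToAggregate = 100000; // rows the algorithm needs to look at
var rowsPerFetch = 1000;      // fetch size when pulling raw rows into Java

// Chatty approach: pull the raw rows into the Java middleware and
// aggregate there -- one round trip per fetch.
var chattyCalls = Math.ceil(rowsToAggregate / rowsPerFetch);
var chattyCostMs = chattyCalls * latencyPerCallMs;

// Pushed-down approach: one call to a server-side SQL Script procedure
// that aggregates inside the database and returns only the result.
var pushedDownCostMs = 1 * latencyPerCallMs;

console.log("chatty:", chattyCostMs, "ms in round trips");       // → chatty: 100 ms in round trips
console.log("pushed down:", pushedDownCostMs, "ms in round trips"); // → pushed down: 1 ms in round trips
```

The 100:1 ratio here is entirely an artifact of my made-up parameters, but it illustrates why the data-centric logic belongs in the database, independent of how fast each individual SQL statement executes.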
In other words, while a Java application accessing HANA is certainly not native to HANA, because it always requires additional middleware that runs outside of the HANA engine, a Java application can most certainly use a number of native elements of HANA that will make it much faster. But it comes at a price: It requires re-coding the Java application and moving more processing logic from Java into HANA. You can’t have it all: If you want super-fast performance for data-centric applications, the logic needs to shift out of Java and into HANA. For anybody who has ever worked with stored procedures in conventional disk-based databases, this is probably a well-understood concept.
As an extreme case, the Java application only does the HTML rendering – the rest all moves into HANA. You could say that this Java application is then “almost native” 😉
You could further ask whether any “almost native” application should become “fully native” by removing Java and utilizing XS for rendering instead. There are certainly additional advantages, like the fact that you wouldn’t need a Java middleware any more. Also, the XS JavaScript engine is very tightly integrated with the HANA engine, so the additional overhead of going through JDBC is avoided. XS is basically a very lean infrastructure, tuned for extremely high performance and low latency.
So, should it be a design goal for every application to become HANA native? My personal view: No. Java is there for a reason. It has different properties. It has maximum openness. It has a huge developer community. Even though SAP HANA may have aspirations to rival the Java community at some point, forcing every Java developer to immediately convert into a HANA developer would simply be the wrong approach. It also wouldn’t make sense, and the best way to explain why it doesn’t make sense is from the perspective of the startups we have worked with.
A lot of the startups we have worked with came with a huge Java code base. If we had required each startup to completely rewrite their application from the ground up to become a native HANA application, we might not have dozens but only a few startups working with HANA right now. At the same time, we delivered a very important message to the startups right from the very beginning: Don’t assume that just because HANA runs in memory, connecting an existing Java application to HANA as a database will make the application run faster. Yes, it will likely run faster, by a factor of 2 or 5, but it won’t get the dramatic performance improvements we were seeking in the program. In fact, we preferred startups that had a problem that couldn’t be solved with conventional disk-based database technologies. They were the most motivated to rewrite the parts of their application where moving data-centric code into HANA would yield performance improvements by a factor of 100 or 1,000.
At the same time, an application is more than performance-critical algorithms. It may have an administrative user interface. A workflow engine. Things that just don’t need performance improvements, and that were “good enough” the way they were written. Those parts could simply move their data into HANA, but leave the Java code untouched. The only issue we have encountered is that lots of startups have used Hibernate as a persistence technology, and unfortunately, there is currently no Hibernate database driver for HANA.
One final thought before I conclude part 1 of this blog: There is also another reason why not every application can become a native HANA application at this point: It is just a fact that Java as a language is more expressive and allows you to implement more complex algorithms than SQL Script. SQL Script is what it says it is: a scripting language. It doesn’t have classes. It doesn’t do dynamic creation of objects, etc. Even the combination of JavaScript and SQL Script together in XS may in certain cases fall short of what a developer can do with Java. JavaScript is a full programming language, but it is also weakly typed. You can implement object-oriented concepts on top of the core JavaScript engine, but the performance will never match that of a JIT-compiled, strongly typed, truly object-oriented language like Java.
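A trivial example of the weak typing I mean – this is perfectly valid JavaScript, whereas a Java compiler would reject the mixed-type expression at build time:

```javascript
// Weak typing in JavaScript: "+" silently switches between numeric addition
// and string concatenation depending on operand types.
var sum = 1 + 1;      // numeric addition
var concat = "1" + 1; // the number is implicitly coerced to a string
console.log(typeof sum, sum);       // → number 2
console.log(typeof concat, concat); // → string 11
```

Harmless here, but in a large code base these silent coercions are exactly the class of errors a strongly typed language rules out up front.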
This creates a certain dilemma: Keeping too much logic in Java will never get you to optimal performance results in HANA. At least when it comes to data-centric algorithms, there will be just too many round trips between HANA and the Java algorithm. You can move certain data aggregation functions into HANA and keep the core algorithm in Java, but that’s not ideal. But moving code from Java into JavaScript/SQL Script will likely hit the limits of the JavaScript interpreter. I don’t have proof of that, because XS is very new, but since I’ve written significant amounts of JavaScript code in my past, I don’t think it could be any other way.
I know what you will say, because several people have asked me that already: Why not bring a Java engine into HANA? To that I will only respond with this: It’s not just the language; it would also require a tight integration of the Java Virtual Machine and the HANA engine, so that the JVM would have knowledge of how HANA stores its database tables. For that to happen, Java would have to be open and free, because under the current licensing regime such a change would not be possible.
Since this dilemma exists, what HANA has done instead is to move as many standard data mining algorithms as possible into HANA itself, thus allowing application developers to use libraries such as the HANA Predictive Analysis Library. But for custom algorithms, SAP can obviously not offer an out-of-the-box solution in HANA. They will have to stay in Java, thus cannot be executed natively in HANA, and therefore don’t yield optimal results.
I hope that you now have a good sense of the concept of native HANA applications, and the pros and cons of the various languages with which native and non-native HANA applications can be built.
With that, I leave you with hopefully lots of questions to ask before we continue with part 2 of this blog, which deals with the cloud infrastructure options supported by HANA, and their various properties. You will see that the background on programming languages and language containers will also significantly help you understand the infrastructure options.
Great blog with lots of important details. We need more information / blogs like this one.
My last comment was about my 5 star rating...
🙁 If you had finished this blog sooner, my blog yesterday about HANA XS and NetWeaver Cloud probably wouldn't have been necessary. This blog clears up many of the open points that I discuss in my blog.
I like and agree with your arguments - I especially enjoyed the details in your comparison of Javascript on HANA XS vs Java on NetWeaver Cloud. Ideal would be some sort of a decision tree based on your analysis that assists developers to decide which platform they should use.
The problem is going to be converting this text-heavy blog into a slidedeck that you can use in Madrid.
I'm eagerly awaiting your next blog concerning cloud infrastructure issues and HANA.
D.
Yeah, I know - I just saw your blog from yesterday, but need to read it in more detail. Too tired right now. It must be incredibly hard to parse out a set of announcements and puzzle together what is going on under the covers. But from what I read, there were some statements and comments in the blog that showed that you and the folks got it. I think Ethan said something about XS being 2 1/2 tiers - and I think he is right.
It's funny, because I had the same thought when I wrote the text. We need some sort of decision tree. The truth is, I am a horrible slide creator. When I try to produce slides just from what I have in my head, it usually ends up in disaster. But when I take my time and string the logical arguments together in writing first, the slides are just a nice visualization of what I had written down.
So, hopefully, others will find my text equally logical, because then I can just proceed to drawing a bunch of nice pictures.
The second part of the blog is already written. I just want to have it settle in my mind a bit and give people some time to digest part 1 first -- then I'll post it. Maybe Friday or over the weekend.
-Michael
True but that is what I really enjoy and why I like to blog. It is similar to a puzzle in a computer game - you are given a few pieces of information and you have to try and solve it. This challenge probably isn't desirable for SAP - messaging / marketing should always be clear for customers but in this age where change is constant, it happens often.
What is missing in your blog is a discussion of the business relevance of the platform selection. If I'm doing medical predictive analysis, then I need platform X, etc. Or are technical factors (I just have Java developers, etc.) the only reason why one platform would be selected?
D.
This blog has cleared a lot of things up for me... And a decision tree for devs and customers alike would be fantastic. 😉
This is a great blog Michael and it hits a ton of points where there is a huge amount of confusion right now. Thanks for writing it! I think SAP should get you involved in vetting press releases 🙂 Maybe they would start to make a little more sense that way. I know, that's probably your worst nightmare, but I think those press releases probably caused a lot of confusion (and therefore damage to SAP's ability to sell this product).
One thing that surprised me here was the discussion of XS as Javascript-only. Unfortunately there is no documentation available that I can find. I'm waiting for the release of SP05 and accompanying docs. However, several people I talked to at TechEd told me that XS (or at least HANA with SP05) included a Java application server. This didn't make sense to me, but I heard it often enough that I started to believe it. If XS is really only a Javascript/REST engine for executing SQL and SQLScript, then I'm much more comfortable with the direction SAP is going here.
One thing I continue to find problematic is the inclusion of Netweaver Cloud under the moniker "HANA Application Services" as part of "HANA Cloud". I like your explanation of the use of a non-HANA Java application server with a HANA database above, and Netweaver Cloud's architecture strikes me as a very similar setup. It seems to me that the Netweaver Cloud PaaS is way more than HANA, and HANA is way more than a database for the Netweaver Cloud PaaS. (Compare to Google's AppEngine and Google's BigTable/Spanner, which are in a very similar relationship.) Why subsume one under the other by naming it all "HANA Cloud"?
Thanks,
Ethan
Thanks Ethan! No, it's not that I hate writing press releases. I've done it before during my open source & standards days. I enjoy writing & language. Unfortunately, for corporate press releases the branding often seems to take precedence over technical reality, and if you take pride in the latter, it's indeed better to stay away from the relatively abstract press release process.
I was also afraid that subsuming NetWeaver Cloud under "HANA Cloud" would be confusing, but what in the end took priority - from what I understand from our marketing folks - was the desire to put HANA into the center of the announcement, and position NW Cloud as the extension platform when consuming HANA through open languages like Java. That's one way to look at it, but the problem is that NW Cloud has had such attention already in SAP community circles that it became confusing. And one of the major issues with it you have identified right there: That NW Cloud PaaS is way more than HANA, and HANA is way more than a database for the NW Cloud PaaS.
Yes, XS documentation is not available, partly because it is only available as a package in XS. So, in order to read the documentation, you need to have an XS install, and while the AWS instances contain a snapshot of XS, they don't contain the documentation package. Once SP5 is released, this will get better.
Either way, I guarantee you that XS does *not* contain a Java application server, and I'm sure you know that beyond technical reasons, the restricted licensing options for Java would put up a huge impediment to doing so. But it's also obvious why people thought so: If you subsume NW Cloud in the HANA Cloud, and if NetWeaver Cloud is even included in a box under the moniker "HANA Application Services", thus suggesting that Java would be offered as a language somehow *integrated* into HANA, it's no wonder. Sigh.
That every Java person would *love* to have Java inside of HANA, and deeply integrated, including some sort of native access from a Java language extension into the HANA table memory space goes without saying. But as I said, I don't see it happening.
Cheers,
Michael
Thanks Michael for posting this as it clarifies a lot cloudy discussions 😀
To me it seems like a good rule of thumb, if one has existing Java apps or Java is the language of choice, could be -
To have the DAO layer (at present it could be some kind of ORM like Hibernate or custom code) stripped away from Java and moved into HANA, while keeping an eye on the volume of data transfer between the app and HANA.
The question then boils down to how to do this - JDBC, the XS layer serving data 🙂, or native REST interfaces (do they exist?). I don't have an answer to this and will be looking for it.
Hi Pankaj,
the volume of data transfer will be one consideration in terms of where you put the processing logic. It's doubtful that HANA will play out its advantage over disk-based row stores if your application just does single selects or pulls out a couple of records from the database per request. HANA will shine when you do a lot of aggregations on the server, and you have a lot of data to aggregate.
Not sure what you mean by stripping the DAO layer away from Java and moving it into HANA. Are you suggesting replacing JDBC access with a REST-based interface? Sure, you can do that, but I'm not sure it really addresses the problem: you don't want to pull too much data into Java to process it there, and should instead move most of the processing logic into HANA. Whether the data transfer from HANA into Java happens through JDBC or REST seems secondary to me.
Yes, there are OData based REST interfaces available in XS.
Hi Michael,
excellent blog and hat-tip for writing it! I see no reason why marketing should slap your fingers as the way I see it you just did them a huge service by writing this post!
While there's little to add to your blog, I still have a few minor remarks to make. First, I believe a simple rule of thumb would be that data-centric applications are a better fit for HANA native apps exposed via XS, while scenarios that contain business logic (which may not be pushed down to the database level) or apps that touch more than just one system (we referred to those as Composites) may be better suited to be developed on NW Cloud. Hope you agree with that simplified approach as a general approximation.
One last comment: NW Cloud is only a platform that sits on top of the infrastructure. In fact, we have clearly decoupled the PaaS from the IaaS layer. For SAP's cloud strategy, IaaS is not a technology we want to differentiate on, hence we team up with partners such as AWS. For the time being, we run on the infrastructure set up for ByD, yet we have always designed it such that we are able to run on other infrastructure as well.
Again, kudos for this blog - looking forward to part 2.
Cheers,
Matthias
Hi Matthias,
thank you - glad you liked it!
I would hope so too that marketing sees that I'm doing them a favor. But as I already noted to Ethan, my post also exposes that marketing positioning should not be confused with precise technical reality. Not that these guys don't know that themselves, but I also intentionally went "off message" in order to explain what the precise technical reality is ...
As far as data-centric applications vs composite apps: You can certainly also pull data from different systems into HANA like into a data warehouse, and then build the composite app on top of HANA. So, I'm not sure if that's the real difference. The real issue is how much data we are talking about. I doubt you would build a composite app by pulling gigabytes of data from different systems into Java first, then processing it there. You would rather do that in HANA. But if you would pull maybe a few megabytes of data from one system and integrate it with another few megabytes of data from another system, then I would think that pulling this tiny amount into HANA would be complete overkill.
Yes, I also thought that NW Cloud (or rather: the Java PaaS parts of NW Cloud) could be moved to another IaaS layer, like AWS. However, I'm not sure how difficult this would be, and whether there are concrete plans to do so. Last I heard, there were no concrete plans at the moment to support AWS as an IaaS layer for NW Cloud. Also, how tightly are the virtualization APIs used to spin up a new Java PaaS instance bundled with the current virtualization software of the ByD cloud? Could the same thing be done with AWS, especially since AWS has its own set of APIs? I suppose it's feasible, but I'm not sure how much effort it would be.
Cheers,
Michael
Dear Michael,
Really Informative article. Big Thanks !
Regards
Aby
Hi Michael,
Excellent information, I learned a lot of new things about HANA... keep posting more HANA blogs....
Thanks,
syamallu
Good Information!
Nice blog - thanks! 🙂