It’s that time of year again, and there is a new release of SAP HANA. If you’re into that kind of thing, you can download all of the reference material here. There are 30 PDF files containing 150MB of detailed information explaining what’s going on. Last year, SAP were kind enough to release SAP HANA SP05 during the Thanksgiving holiday weekend, leaving plenty of time for those of us with nothing better to do to read up on it. They were a week later this year, so I’ve had to catch up fast!
My sense of what happened during the planning phase for SAP HANA SP07 is that the teams got together and asked “what shall we achieve?”. The consensus was: let’s make everything better. Let’s look at what it brings.
Better SQL optimization and performance
This is huge for me. The SQL optimizer now does a much better job of guessing which engine to send a query to. I’ve not tested this exhaustively, but if you are using SQL, this can provide up to a 100x improvement in the performance of OLAP-style queries, and SQL is now only about 10% slower than an equivalent HANA model. The model is still faster, because its optimization is built in advance.
In addition, COUNT DISTINCT is now 50% faster than I found previously. This is very nice, because this was a pain point for many customers.
All in all this is an important set of improvements to the core engine.
New SQL Features
There aren’t a ton of new SQL features, and I was hoping for more ANSI SQL compliance but there’s some interesting stuff:
– Ability to replicate tables between nodes to improve join colocation
– Handful of new SQL functions for working days, currency conversion, grouping, SHA Hashing, Binary Conversions
– New BINTEXT data type and various conversion and setting functions
Unfortunately recursive queries and CTEs still aren’t supported, which is a shame.
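To make the working-days functions from the list above a little more concrete, here is a minimal Python sketch of the kind of calculation such a function performs. It is not HANA's implementation or syntax: the function name `working_days_between`, the Monday-to-Friday week and the `holidays` parameter are all my own assumptions for illustration.

```python
from datetime import date, timedelta


def working_days_between(start, end, holidays=frozenset()):
    """Count working days in the half-open range [start, end).

    Assumption for this sketch: a working day is Monday to Friday
    and not listed in `holidays`. Real factory calendars are richer.
    """
    if end < start:
        raise ValueError("end must not be before start")
    count = 0
    day = start
    while day < end:
        # weekday() returns 0 for Monday through 6 for Sunday
        if day.weekday() < 5 and day not in holidays:
            count += 1
        day += timedelta(days=1)
    return count


if __name__ == "__main__":
    # One full week starting on a Monday, with one hypothetical holiday
    print(working_days_between(date(2013, 12, 2), date(2013, 12, 9),
                               holidays={date(2013, 12, 4)}))
```

The point of having this in the database is that the calendar logic runs next to the data, rather than in every application that needs it.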
There is a new debugger and editor but I don’t see any new SQLScript functionality, which is disappointing. I was hoping for some new functionality especially around UDFs and passing of arrays, but the reference guide looks unchanged since SP6. Hopefully this will be a focus for SP8!
Improved Developer Experience
SAP has majored on this in SP7, though I think there is more wood to cut for SP8. Thankfully, the regi configuration is now gone, though regi still appears to be there in the background for certain tasks. This should pave the way to the Mac OS X version of HANA Studio (yay!). I tested checking out some projects and it happens 5-10x faster than before, which is very nice.
The 3 views: Project Explorer, Repositories and Systems View all still exist, which is a shame, but hopefully we’ll see those consolidated in SP8. Either way, from a developer perspective, this all feels much better.
The web-based IDE has been improved somewhat, although it still falls short of parity with SAP HANA Studio. Some serious work needs to happen here to make HANA a cloud development platform and, crucially, HANA Models are still only visible in XML.
Core Data Services gets some new functionality for the definition of relationships and views, but CDS is still very young. I suspect that the unification of the developer experience between the 3 different views, CDS, the HANA Analytic and Calc Views will be a core effort for SP8. Getting this right is the absolute key to an amazing developer experience for Native HANA.
HANA Studio has been nicely smoothed around the edges and I’d be very happy to develop large apps with a large number of developers working concurrently. It feels extremely solid and it is much better integrated than other development environments. It feels like developer experience has been put at the top of the agenda. Deleting stuff is easy, for example!
Most of the effort in XS seems to have gone into making the experience more unified, with some additional features like validation and associations. I suspect that most of the work in SP7 went on behind the scenes, paving the way for innovation in SP8. We will see.
There aren’t major changes from a modeling perspective, and a few features I saw in beta phase didn’t make it to the release. But, it feels faster because some things happen in the background, and they have worked on usability. Plus, they fixed a nasty bug in Data Preview where the dataset was always fully materialized. In practice, this makes working with large datasets much easier than ever before, which is excellent news.
I also seem to see improved performance for filter pushdowns, which again makes working with large datasets easier. Anyone who saw my series of blogs on Global Warming will know that I had some fun getting amazing performance with billions of rows. My weather dataset seems to perform noticeably better with SP7.
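For anyone unfamiliar with why filter pushdown matters, here is a toy Python sketch of the idea. It says nothing about how the HANA engines actually do it: the `scan` function, the row layout and the "rows materialized" counter are all assumptions I made up to show the principle that filtering at the source means far fewer rows travel up the stack.

```python
def scan(rows, predicate=None):
    """Simulated table scan. If a predicate is supplied, it is applied
    at the source, so only matching rows are handed to the caller."""
    for row in rows:
        if predicate is None or predicate(row):
            yield row


def query_without_pushdown(rows, predicate):
    """Materialize every row first, then filter in the upper layer."""
    materialized = list(scan(rows))
    result = [row for row in materialized if predicate(row)]
    return result, len(materialized)


def query_with_pushdown(rows, predicate):
    """Hand the predicate to the scan, so only matches are materialized."""
    materialized = list(scan(rows, predicate))
    return materialized, len(materialized)


if __name__ == "__main__":
    # Hypothetical weather readings: 1,000 rows across 10 stations
    readings = [{"station": i % 10, "temp": i} for i in range(1000)]
    only_station_3 = lambda row: row["station"] == 3
    print(query_without_pushdown(readings, only_station_3)[1])
    print(query_with_pushdown(readings, only_station_3)[1])
```

Both queries return the same rows; the difference is how many rows had to be materialized on the way, which is where the time goes on billion-row datasets.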
I’d have liked to have seen new aggregation types for AVG, COUNT DISTINCT, Weighted AVG but I guess those have to wait.
Instead, we have some much wanted bits of functionality like copy/paste, keyboard navigation and where-used. Yay. Decision Trees now open in a new modeler, which I haven’t had a chance to use yet.
New: The Spatial Engine
Beta support for spatial objects was introduced in HANA SP6 and this is now fully supported in SP7. I haven’t had a chance yet to delve into the Spatial Engine in detail, but it looks like the same functionality as SP6 based upon the reference guide. There is a lack of real-world examples for the Spatial Engine, but it promises to be extremely powerful because you can in-line spatial objects into regular tables and then use ESRI geospatial data for analysis against regular relational database tables.
There’s no other platform on the planet that allows this as far as I know.
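In the absence of real-world examples, here is a rough Python sketch of the kind of question this enables: which rows of an ordinary table fall inside a geographic region? It uses a standard ray-casting point-in-polygon test, which is my choice for illustration and not a description of how the HANA Spatial Engine works. The `stores` table, the square region and the function names are all hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: cast a ray to the right of (x, y) and count
    how many polygon edges it crosses. An odd count means inside.
    Points exactly on an edge are not handled specially here."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y matter
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def revenue_in_region(stores, region):
    """Sum revenue for stores whose location lies inside the region."""
    return sum(s["revenue"] for s in stores
               if point_in_polygon(s["x"], s["y"], region))


if __name__ == "__main__":
    region = [(0, 0), (4, 0), (4, 4), (0, 4)]  # hypothetical sales territory
    stores = [
        {"name": "A", "x": 2, "y": 2, "revenue": 100},
        {"name": "B", "x": 5, "y": 2, "revenue": 250},
        {"name": "C", "x": 1, "y": 3, "revenue": 40},
    ]
    print(revenue_in_region(stores, region))
```

The attraction of doing this in the database is that the geometry sits in the same table as the business data, so the filter and the aggregation happen in one place.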
Massively Improved: Text & Search
I haven’t had a chance to look at this in detail yet but I’m surprised by the improvements in the text engine. It is now possible to customize the text engine to do the analysis you need and there are substantial enhancements to full text search functionality. In addition there are major new Fuzzy Search improvements.
No additional core languages are supported, but focus has been put on Russian, Japanese and Simplified Chinese for Social Media analysis.
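If you haven't met fuzzy search before, the idea is to return matches that are close to the search term rather than identical to it. The Python sketch below uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure. That is my assumption for illustration only: HANA's fuzzy search uses its own scoring, and the `fuzzy_search` function and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_search(term, candidates, threshold=0.8):
    """Return candidates whose similarity to `term` meets the threshold,
    best matches first."""
    scored = [(similarity(term, c), c) for c in candidates]
    matches = [(score, c) for score, c in scored if score >= threshold]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in matches]


if __name__ == "__main__":
    cities = ["Walldorf", "Waldorf", "Berlin"]
    # A search with a typo still finds the intended city
    print(fuzzy_search("waldorf", cities))
```

Lowering the threshold trades precision for recall, which is the same dial you end up tuning in any fuzzy search, whatever the scoring behind it.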
Improved: Predictive Analysis Library
There are 7 new algorithms: Statistics (Univariate Statistics, Multivariate Statistics, Chi-squared Test for Fitness, Chi-squared Test for Independent, and Variance Equal Test), Partition, Support Vector Machine (SVM), Forecast Smoothing, Substitute Missing Values, Affinity Propagation and Agglomerate Hierarchical Clustering.
Multiple Linear Regression, Logistic Regression, Apriori and C4.5 Decision Tree and CHAID Decision Tree have some much-needed improvements like support of p-value for coefficient and missing value handling.
I have no doubt that SAP’s KXEN acquisition has allowed a very focussed activity on what improvements were required here.
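As a small illustration of what missing value handling means in practice, here is a Python sketch that replaces missing entries with the mean of the values that are present. Mean substitution is just one common strategy that I picked for the example. I am not claiming it is what the PAL functions do by default, and `substitute_missing` is a hypothetical name, not a PAL call.

```python
from statistics import mean


def substitute_missing(values):
    """Replace None entries with the mean of the non-missing values.

    Assumption for this sketch: missing values are represented as None
    and at least one real value is present.
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute a column with no values at all")
    fill = mean(present)
    return [fill if v is None else v for v in values]


if __name__ == "__main__":
    print(substitute_missing([1.0, None, 3.0]))
```

Without a step like this, algorithms such as regression either reject the input or silently drop rows, which is why built-in handling is such a welcome addition.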
Massively Improved: Smart Data Access
Smart Data Access, HANA’s federation engine, was first released in HANA SP6 as an initial version. With SP7, SDA looks massively improved. It is now one of the jewels in HANA’s crown.
– Support for Calculation Views including filter push-downs
– Oracle and MSSQL Support as well as generic ODBC support, for any database
– Support for Insert/Update/Delete – previously only SELECT was possible
– Caching with Hive
So this brings quite serious data federation scenarios to life.
What’s really needed in Smart Data Access is temperature control – the moving of cold data out of HANA and into the remote data source. For EDW scenarios, you could bind together HANA and IQ and have a very cost-effective solution. Very exciting potential development for SP8.
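To show what I mean by temperature control, here is a purely hypothetical Python sketch that splits rows into hot data (kept in memory) and cold data (candidates to move to the remote source) based on age. Nothing like this exists in SDA as far as I know. The `posting_date` field, the cutoff rule and the function name are my own assumptions.

```python
from datetime import date, timedelta


def split_by_temperature(rows, today, max_age_days):
    """Partition rows into (hot, cold) by the age of `posting_date`.

    Rows newer than or equal to the cutoff stay hot; older rows are
    cold and would be candidates for the remote data source.
    """
    cutoff = today - timedelta(days=max_age_days)
    hot = [r for r in rows if r["posting_date"] >= cutoff]
    cold = [r for r in rows if r["posting_date"] < cutoff]
    return hot, cold


if __name__ == "__main__":
    rows = [
        {"id": 1, "posting_date": date(2013, 11, 30)},
        {"id": 2, "posting_date": date(2011, 1, 15)},
        {"id": 3, "posting_date": date(2013, 6, 1)},
    ]
    hot, cold = split_by_temperature(rows, today=date(2013, 12, 9),
                                     max_age_days=365)
    print([r["id"] for r in hot], [r["id"] for r in cold])
```

In a real EDW the hard part is not the split itself but keeping queries transparent across both stores, which is exactly where a federation engine should help.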
In my mind, HANA became truly enterprise ready in SP6, and that’s shown by the fact that SAP runs its own ERP system for all employees on the HANA platform. However, there were still improvements to be made and some of these have come in SP7.
– Improved Monitoring, Alerting and Tracing
– Death of the Statistics Server (leading to better resource usage)
– Storage Snapshots
– Automatic Backups for new Scale-Out Nodes and better support for Backup/Restore in HANA Studio
– Multi-tier System Replication, SSL, Zero Downtime and Compressed Log Transfer
This requires a whole blog to itself but it looks like almost any enterprise scenario is now supported. There is still work to do for SP8 – automated failover, encryption of log volumes and backups, active-active HA, query-able and writable standby databases, SNMP monitoring and a few other bits.
As I’ve written this blog I’ve been reminded what an amazing, deep and wide application platform SAP HANA has become. I’ve only had 2 days using SP7 so far, so this is just an initial impression, but it feels like an extremely polished release, the best release so far. What I like the most is that there has been a clear focus on quality first rather than features first. For instance, there were rumors of a graph engine, but the graph engine doesn’t appear in SP7 as far as I can see.
There is plenty of wood to cut for SP8 and I’m sure the HANA team is taking a brief breath of air, before they get on with the planning for the next release. Hopefully this blog helps them with a few places to focus 🙂