Former Member

SAP’s HANA: Removing the Shackles from Terrible Data Design

I’ve been talking a lot at work lately about Agile BI. The solution I’ve been trying to “sell” internally is to use Sybase Replication Server to make real-time copies of our transactional systems into a Sybase IQ-powered operational data store, with minimal data cleanup (for more of my thoughts, see Sybase Love Fest Part 1 – Data Solutions). I’d build the lion’s share of the semantic layer right on top of this copy and only use traditional ETL for the most complex and least performant use cases. This makes BI “agile” because you don’t have to wait for time-intensive ETL to be developed, and you can whip up a smoking-fast reporting copy of any database in a few days. We’ll call that scenario HANA-light, and I believe it is the near-term solution for most non-ginormous enterprises.

The pros of HANA-light are that we wouldn’t need to report directly off of the transactional database (since we have an up-to-the-minute copy), so we won’t impact its performance; the reporting should be way faster (since it is stored in IQ); and we won’t need very much ETL (which is a necessary but slowly-developed evil). The only real con I can think of (besides additional cost) is that the semantic layer will be much tougher to build on top of the transactional schema than on a more traditional data warehouse schema of almost any type.
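To give a feel for what "semantic layer right on top of the copy" means, here is a minimal sketch. It uses Python's built-in sqlite3 purely as a stand-in for the replicated copy (it is not Sybase IQ or HANA), and every table, column, and function name in it is made up for illustration. The reporting view hides the joins across the transactional tables so that report writers only see business terms.

```python
import sqlite3


def build_reporting_view(conn):
    """Create a tiny transactional-style schema plus a flattened reporting view.

    All names here are hypothetical; a real replicated schema would be far messier.
    """
    conn.executescript("""
        CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE order_header (order_id INTEGER PRIMARY KEY,
                                   customer_id INTEGER, order_date TEXT);
        CREATE TABLE order_line (order_id INTEGER, line_no INTEGER,
                                 product TEXT, qty INTEGER, unit_price REAL);

        INSERT INTO customer VALUES (1, 'Acme');
        INSERT INTO customer VALUES (2, 'Globex');
        INSERT INTO order_header VALUES (10, 1, '2011-06-01');
        INSERT INTO order_header VALUES (11, 2, '2011-06-02');
        INSERT INTO order_line VALUES (10, 1, 'widget', 2, 5.0);
        INSERT INTO order_line VALUES (10, 2, 'gadget', 1, 20.0);
        INSERT INTO order_line VALUES (11, 1, 'widget', 4, 5.0);

        -- The "semantic layer": one view that does the joins once.
        CREATE VIEW sales_by_customer AS
        SELECT c.name AS customer, SUM(l.qty * l.unit_price) AS revenue
        FROM order_line l
        JOIN order_header h ON h.order_id = l.order_id
        JOIN customer c ON c.customer_id = h.customer_id
        GROUP BY c.name;
    """)


def revenue_by_customer(conn):
    """Query the view the way a report would, without knowing the joins."""
    rows = conn.execute(
        "SELECT customer, revenue FROM sales_by_customer ORDER BY customer"
    )
    return dict(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_reporting_view(conn)
    print(revenue_by_customer(conn))
```

With three tidy tables this is easy. The pain comes when the transactional schema has hundreds of tables and the same fact lives in several of them.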

Why are transactional data models really hard to report off of, you say? Because application developers build (currently out of necessity) really crappy data models. Data is stored a hundred times in a hundred ways so that the user experience is fast, but master data management is a painful process. Data is stored in painfully normalized (or denormalized, typically whichever makes the least sense from an analytics perspective) structures, which makes ugly, inefficient, and often not-reliably-correct multipass queries necessary when trying to actually analyze the data. Finally, data is stored in [please feel free to insert your own “trying to report off of a transactional database” horror story in here]. The bottom line is typically that app guys/girls need to design their database in such a way that reporting folks throw up in their mouths a little bit when they see the schema, because no matter what, the application needs to respond FAST.

(And why do we in business intelligence allow application developers to get away with this? Well, quite frankly we are red-headed stepchildren, to use the parlance of our time. The common convention is that getting the data in is the most important thing, and the data monkeys will always figure out some way to get it out. With existing databases, that is a very valid point. Also, app developers are sort of like the bass players of the IT world: they always look cool and mysterious and get all of the groupies. Sorry, I think I’m spiraling here.)

Fortunately, because HANA can query just about anything super-fast, and because HANA (being just a database) will soon be used to build applications and not just data marts, application developers are no longer forced to create stupid data models. If the BI data modelers are brought in earlier in development, not only would agile BI be a given (even less ETL would be required, building the semantic layer would be a snap, and BI would already know the layout), but think about how clean the master data management would be. Each entity or lookup value would have one record (although admittedly slowly-changing dimensions or some other method would be needed to allow them to change over time), it could all be managed from one location, and we would KNOW that there would be no confusion!
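To make the "one entity, one place to manage it, but still able to change over time" idea concrete, here is a minimal sketch of one common approach, a Type 2 slowly-changing dimension. Again it uses Python's built-in sqlite3 purely as a stand-in (not HANA), and the names customer_dim, set_city, and city_on are hypothetical. Each change closes the current version and opens a new one, so there is exactly one current row per entity and history is still queryable.

```python
import sqlite3


def create_customer_dim(conn):
    """Create a hypothetical customer dimension with validity dates."""
    conn.execute("""
        CREATE TABLE customer_dim (
            customer_id INTEGER NOT NULL,  -- business key for the entity
            city        TEXT NOT NULL,
            valid_from  TEXT NOT NULL,     -- ISO date, inclusive
            valid_to    TEXT               -- ISO date, exclusive; NULL = current
        )
    """)


def set_city(conn, customer_id, city, as_of):
    """Close the current version (if any) and open a new one starting at as_of."""
    conn.execute(
        "UPDATE customer_dim SET valid_to = ? "
        "WHERE customer_id = ? AND valid_to IS NULL",
        (as_of, customer_id),
    )
    conn.execute(
        "INSERT INTO customer_dim (customer_id, city, valid_from, valid_to) "
        "VALUES (?, ?, ?, NULL)",
        (customer_id, city, as_of),
    )


def city_on(conn, customer_id, day):
    """Return the city that was valid on the given ISO date, or None."""
    row = conn.execute(
        "SELECT city FROM customer_dim "
        "WHERE customer_id = ? AND valid_from <= ? "
        "AND (valid_to IS NULL OR ? < valid_to)",
        (customer_id, day, day),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_customer_dim(conn)
    set_city(conn, 1, "Boston", "2011-01-01")
    set_city(conn, 1, "Austin", "2011-06-01")
    print(city_on(conn, 1, "2011-03-15"))
    print(city_on(conn, 1, "2011-07-01"))
```

The design choice worth noticing is that a correction happens in one table, and every report that joins to it picks up the fix without anything needing to propagate.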

(Will this make apps harder to build? Maybe a little, but the best application developers will adjust very quickly. See how I just set up application developers to sound like they aren’t very good if they disagree with me? Nice, huh?)

(Also, will this make the data in the better data model more right? Of course not, but it should make it significantly easier to get right in the first place, and to correct when it gets out of sync. If I’ve got to fix bad data, I’d rather do it once and not even need to have it propagate throughout the system.)

I haven’t always been an enormous HANA fan (especially when it is talked about at the SAPPHIRE – It is what we thought it was) but I do think it actually gives us a unique opportunity to not only change the way we design an application ecosystem, but also to step back and design it correctly. Now that could, if even partially-realized, be a game-changer. So thank you HANA, for totally removing the biggest barrier-to-entry for having a truly elegant transactional data model.

And I’m sure application developers would love to have the opportunity to make their applications elegant on the screen, in the code, AND in the database.

      16 Comments
      Chris Paine
      Hi,
      you know, I'd love to take advantage of using a columnar DB in the ABAP code I write, but quite honestly I have no idea how to do that. Have you seen any resources for those poor devs that need to live up to your wonderful standards of adopting this new framework? With HANA having hit a few InnoJams, there must be something out there by now? Any hints 🙂
      Cheers,
      Chris
      PS love your blog style.
      Former Member
      The beautiful thing is that you shouldn't have to treat this any differently from any other database. Select From Where and Insert Update Delete should all be the same (although the syntax will probably vary a little). As pointed out on Twitter yesterday by some (and documented by others below), the documentation is a little lacking.

      http://www.zdnet.com/blog/howlett/hana-is-hereinnit/3221

      Bala Prabahar
      As I understand it, HANA 1.0 doesn't support ABAP. It supports SQL and MDX; however, I don't know if it supports DML (Insert, Update and Delete) statements. They (DML statements) shouldn't be supported IMO. You don't want to have two facts, one in ERP and another in HANA 1.0.
      Again, as I understand it, you replicate data from other source systems (R/3, BW etc.) and then run reports off of HANA 1.0 in "Real, real time". Whether this ("real, real time") would create chaos in the business world or would add value is something I don't know. Or probably it depends.
      Former Member
      Bala,

      I believe you are correct: HANA 1.0 is mostly about pulling data OUT. However, the data does have to get in there somehow. Going forward, the plan is for HANA to be THE application development platform for SAP, so writing to it will have to be supported.

      Chris Paine
      Hi Jamie,
      whether I use (I hope this will come at some point) ABAP or direct SQL or more stored-procedure-like SQL code in HANA itself, you're right, developers need to rethink how they code and how they build data models so that they work better with a columnar DBMS.

      From my research (read: big G search engine), it _does_ matter how you code - throw a few "select *" statements in there and you'll be better off using an indexed table-based DBMS; a columnar DBMS will actually take longer to retrieve the data.

      Problem is, I certainly can't find any SAP resources on what makes good coding, or good data structures/design, for HANA. Even looking more generally at how best to code SQL for columnar DBMSs, there isn't that much.

      So don't go dumping on us poor devs, ya hear! 😉 If someone could please enlighten us as to what it means to do good HANA-compatible design, I'll start churning it out (more to the point, I'm really interested to learn). Or is it, as Hasso mentioned in his keynote at SAPPHIRENOW, still a learning process that we're all going to be dragged into, and at the moment we just "don't know"?
      Cheers,
      Chris

      Former Member
      If SAP is really planning on using HANA as both the transactional AND analytical engine of an application, it must act as both a columnar AND row-based database. I guess we'll have to wait and see just how much compromise that means (and how terrible it will be at both).
      Marilyn Pratt
      Well I got the "skinny" on a real HANA implementation this week when I went to a Sustainability Meetup in NYC. Saw something called "RAPTOR".
      I'm in the process of engaging with the development team now to see if I can get a demo to you all.  Stay tuned.
      Former Member
      This HANA stuff sounds so NON-SAP that it would be more like a 180° turn in the SAP world. For the first time that I can remember, they will make their ecosystem easier and more elegant, removing lots of suddenly obsolete complexities. This all doesn't sound like SAP at all. When can I buy it?
      Former Member
      You can buy HANA today, but you won't be able to buy the new SAP built on HANA anytime soon. And, to be perfectly frank, SAP will have a lot of legacy systems to support and upgrade paths to manage, so I wouldn't expect their ginormous data model to change anytime soon. I think these opportunities for "elegant databases" will manifest themselves on totally new software built inside and outside of SAP on top of the HANA platform.
      Former Member
      Hi Jamie,

      Nice work on a snappy, thought-provoking blog. Of course, we're all in the same boat (AppDev and BI, that is): we all want a beautifully designed, elegant and simple data model that works well for all conceivable use cases. Of course we rarely get that, because the real world gets in the way, and ultimately we all have to deliver something that works, even if it doesn't look very pretty.

      In my experience, increases in performance are always matched by equal (or greater) increases in expectations from users. So regardless of the speed of the platform, there will continue to be a requirement to optimise the data model for specific use cases. And no doubt these optimisations will still be considered "terrible" by some.

      The shackles might be coming off, but they've been replaced by an ankle bracelet.  Better definitely, but we're still not completely free.

      Cheers,
      Jon

      Former Member
      Of course the real world will always ruin everything, but we should at least strive for something elegant throughout, right?
      Former Member
      Of course, but if you've got two use cases that demand different data models, then you either design for one (and be "terrible" for the other) or compromise (and be "a bit terrible" for both). As always the choices will come down to the developers and the user communities they serve. Hopefully the performance boost from a platform like HANA is enough to encourage developers to review their design thinking, take advantage of the new opportunities, and start closing the gap.
      Former Member
      This post was obviously a bit tongue-in-cheek, but I do think that currently app developers are forced to use data models that really suck for reporting, and as technical people who appreciate order and efficiency I believe that once they have a platform responsive enough to allow them elegant data models they will jump on it. Bringing in analytics folks early in the process is a great first step, and I believe it can be successful because I've never seen a great developer who didn't constantly want to learn a better way to do things.
      Former Member
      That is an impressive introduction about SAP HANA. Hope I will have the chance to meet her in the real implementation projects.

      Regards,
      Andy
      SAP Geek's Blog

      Former Member
      Hey Jamie, a really fascinating article here, and your points about data model design for OLTP versus OLAP are well made.
      If I’ve got your argument right you are hoping that HANA will do away with these problems and leave the BI guys/gals with a “better” base from which to work their magic.

      Now I’m no expert in HANA or indeed in what SAP’s plans are in this area, but what I do know, because we have technology in this area, is that the SAP “data model” is something to behold. Truly it is a work of art, and of a scale and complexity that puts it into a hall of fame that only it can inhabit.

      We spend our lives helping BI developers interpret this OLTP optimised data model (I’ve blogged on this here http://silwoodtechnology.wordpress.com/)

      I guess the reality is that SAP are not going to change this data model in any short order for HANA, but if anybody does know more about this I’d be interested to know.

      Former Member
      Graham,

      I doubt SAP will be changing their core data model anytime soon, but each new application they build provides an opportunity to optimize not only the performance but also the elegance of the system. A couple of very small changes that HANA could enable could save their customers literally millions of dollars in workarounds, and SAP is definitely clever enough to capture some of that value.