Skip to Content

I’ve been developing code in SAP HANA for nearly 3 years now, and at the end of 2013, there is now a huge influx of developers. This is good, because it means that mass adoption is here. But it also means that we see a lot of the same mistakes being made. I did some analysis of the questions asked in HANA development forums and thought I’d pick out the most common mistakes that newbie HANA developers make and give my advice.

The main thing I’ve noted is that every developer comes to SAP HANA with a set of misconceptions about HANA and a set of experience of some other technology. HANA is a little different to other application platforms and you need to change your thinking. I hope this helps you in your journey.

Never use row-based tables

If you are an Oracle, IBM, or Microsoft person then you are institutionalized into thinking that you put OLTP data in the row-store and OLAP in the column-store. This is not true with SAP HANA! You must create all tables in the column store and that can be used for both transactional and analytic scenarios.

HANA does have a row store and you can create row-oriented tables for some very specific scenarios:

– Transient data like queues, where you insert and delete a lot and the data never persists

– Configuration tables which are never joined, where you select individual entire rows

– When you are advised to by a SAP support personnel

But in general, never used oriented tables, especially for OLTP/Transactional scenarios. HANA is optimized to use the column store for combined transactional and analytical scenarios.

Never create indexes

Again if you come from the traditional RDBMS space, you will see that HANA allows the CREATE INDEX command.

However, when you create a table in HANA (and I’m simplifying), it is in fact creating a set of sorted, compressed and linked indexes. As a result, secondary indexes almost never improve performance. In fact, I’ve never come across a scenario where an index improved performance of an analytic query, where a large volume of data is aggregated.

There is one scenario when a secondary index can improve performance: when you have a query which selects a very small amount of data from a very large table (or group of joined tables). In this instance, creating a secondary index on all your sort columns can allow HANA to find the data faster. But this is a very specific situation – the simple advice is, never create indexes.

Don’t use the SAP HANA Modeler Perspective

HANA has 3 developer perspectives, the SAP HANA Systems View, Modeler, and Developer Perspective. Take the time to read the developer guide and setup the Developer perspective. This will bring you the ability to put all your development artifacts including tables, information views, stored procedures, plus OData and HTML artifacts if you need them. You get change management, version management, the ability to test inactive objects, code completion and a bunch of other things.

Please don’t create models directing in the SAP HANA Content Repository any more.

Don’t use SQLScript unless you have to

SAP HANA provides a powerful stored procedure language, but its power is a bad thing for new SAP HANA Developers. It allows you to write very inefficient code which doesn’t parallelize.

Most of the scenarios I see on the forums that developers are coding could be better done with Information Views like Attribute Views, Analytic Views and Calculation Views. And in SAP HANA SP07, Information Views are faster than SQLScript in almost every scenario. Plus, Information Views are easier for others to understand, remodel and change.

There are scenarios where you need SQLScript, but it shouldn’t be viewed as a general-purpose solution to modeling problems.

If you have to use SQLScript, don’t use Cursors or Dynamic SQL

If you are a PL/SQL or T-SQL developer than you will be familiar with Cursors and Dynamic SQL. I see a lot of questions in the forums related to performance problems with these. Avoid using them at all costs – this is detailed in the SQLScript Reference Guide.

There are a number of things that push you out of multi-threaded mode in SQLScript: Local Scalar Variables, Loops, Cursors and Dynamic SQL. All of these constructs will cause you performance problems if you use them, so you need to avoid them.

Especially don’t use Dynamic SQL to generate column names for INSERT or SELECT statements! Instead, create an Information View for SELECT statements, or write a Python-based loader to load tables.

In many cases you can change a loop into a single SELECT or INSERT statement. In other cases, you can nest SQLScript procedures to improve performance. I’m thinking that this area needs a blog of its own.

Avoid JOINs on Large Data Volumes

Joins on very large tables (>100m rows) can be inefficient. If you have two large fact tables then never join them – performance will always be a problem.

Instead, you can normalize the data in Analytic Views and create a Union with Constant Values in a Calculation View. Werner Steyn describes this nicely in his Advanced Data Modeling guide. You can expect a very large performance increate for complex queries by using this mechanism.

Consider use of a second monitor

Most laptop screens are poorly adjusted for use of Eclipse – this isn’t a HANA Studio-specific problem. The best screen is a 27″ 1920×1080 screen like this one from HP. Connect your laptop to a second screen at work for the best development experience. If you have a regular laptop screen that is 1366×768 resolution, you will really struggle to develop well.

Update your HANA environment regularly

This is true for all your HANA components – DB, Studio and Client. There are continuous innovations coming every 6 months with major Service Packs, and smaller revisions every 2 weeks. It’s well worth patching frequently to get the best developer experience. Take a read of this blog for more details.

Final Words

This isn’t designed to be an exhaustive guide, but rather a compilation of the common mistakes that I see developers making when they’re starting out with HANA. If you’re new to HANA then please take the time to read them and think about what it means to your developer methodology.

Have I missed any obvious ones?

To report this post you need to login first.

87 Comments

You must be Logged on to comment or reply to a post.

  1. Tammy Powlas

    Perhaps an obvious one – when modeling in HANA, a big screen may be necessary to see everything.

    This is a similar tip I got from Brian Durning, Dell, who advised on an ASUG webcast to have a large monitor if you are developing in the Information Design Tool.

    (0) 
    1. John Appleby Post author

      Yep it’s not a bad idea. It is a side effect of having to model in Eclipse. At Inside Track this year I had 800×600 which was a disaster.

      Most Eclipse developers in fact use a second monitor.

      (0) 
      1. Gareth Ryan

        If you are using Eclipse/NWDS in any form, its all about resolution, resolution, resolution.

        In my home office I use 24″ and 22″ monitors side by side but they are both set to max resolution (1920×1200 I think) so that I can get everything on screen.

        I managed to get a high-spec’d laptop from my company with a 17″ screen and 1920×1200 resolution for the same reason, although it makes things quite small and a bit of an eye strain!  Some of my colleagues work with NWDS on 15″ laptops and it is beyond painful.

        In short, remember resolution is everything, not just size 😉

        Gareth.

        (0) 
        1. Jon-Paul Boyd

          Anyone working with HANA Studio should have a decent spec PC AND a sizeable monitor.  Quite often project team members working on client site get what they are given – poor client spec PCs with centrino cores and 2GB (4GB if you are lucky), OK for Word but terrible for HANA Studio, with everything from propogating columns up the chain to scrolling through graphical unions taking an age.  Such a false economy.

          (0) 
          1. John Appleby Post author

            JP this is often a problem when you are a consultant/contractor at an end-customer. Customers regularly provide hourly wage workers with poor tools, meaning they take longer than expected to complete tasks, which costs money.

            Sad but true.

            (0) 
  2. Sean Holland

    hi John, looking at some of the ABAP on HANA SAP demos, they predominantly leverage SQLScript for logic pushdown and the taking similar functionality out of the ABAP layer and pushing it down to the db SQLScript procedures. The use of local variables and the mix of CE functions plus common SQL statements for instance is highlighted in this one

    http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/library/uuid/90d39eb4-259d-3010-f5ba-aa4c13b38314?QuickLink=index&overridelayout=true&58385785467697

    Then again, this may be just giving folks an example of what can be done rather than a best practice guide..

    I’m also curious on this comment

    And in SAP HANA SP07, Information Views are faster than SQLScript in almost every scenario


    i would have expected two Calc Views, one graphical and one sql script with equivalent steps would be executed identically (e.g. projections/joins etc)? Is this not the case? Has something specifically changed in SP07 that we can test out?

    Thanks,

    Sean.

    (0) 
    1. John Appleby Post author

      Hi Sean,

      Well, first, I didn’t write this, but rules were meant to be broken. The Suite on HANA guys broke most of those rules 🙂

      If you know what you’re doing with SQLScript then it can be great, and it’s OK to mix SQL and CE Functions up with local scalar variables, if you understand exactly how the code will execute. I don’t recommend it because it requires a pretty advanced knowledge of SQLScript to get right. But programmers from other disciplines start with this rather than using the easier and more reliable information models.

      Ah, your latter point is highly interesting. It is much more subtle than this. Because of Suite on HANA and HANA Live (in SP6 and SP7), SAP focussed a lot of development effort into the Graphical Calc View modeler. It now generates Calculation Plans, which are often more efficient than CE Functions. Just try it out using a join between two big tables and a few fixed filters – you will see that the Graphical Modeler outperforms the Script modeler.

      There are other subtleties like the SP7 COUNT DISTINCT in SQL is pushed into the Calc Engine as bwPop (the lowest level function), but you can’t model this in the Graphical Modeler or CE Functions – only in SQL.

      John

      (0) 
      1. Sean Holland

        Thanks for the quick reply John..

        Do you see more sql statements like COUNT DISTINCT being handled more efficiently by the calc engine and removing the need for the equivalant CE functions? I haven’t seen any new CE functions come out in the last couple of releases.. e.g. A CE_MINUS would have been nice at times 🙂 ..

        We’ve predominatly used SQLScript procedures as wrapper or entry programs for reports, where it’s given us flexibility to handle different input parameters (e.g. you can run a shipping report by a different document type, material number(s) or sales order(s) etc) Previous revisions did not like mixing SQL with CE functions, so we typically kept them seperate in the hope we’d maintain parallel execution for independent portions of the code. But it has been fairly painful trying to get a good profiler/trace tool, PlanViz isn’t readible if it’s anyway complicated or there’s a number of sub steps/components.

        So you say a lot of focus has been in generating efficent calculation plans for graphical calc views? We’re currently waiting to see some of the HANA live models and how useful they may be out of the box.

        Do you see some future convergence between the AT, AN & CV models and the need to specify what type of view you require up front?

        Thanks again for all your useful blogs and for the day that’s in it, Happy New Year!

        (0) 
        1. John Appleby Post author

          Well what I’d like to see (and whilst I have some influence, I don’t determine the product direction) is a convergence to a single model infrastructure based on Core Data Services, which has a graphical and script-based view which are interchangeable and operate the same way.

          CE functions have been left a bit to lay in SP06 and SP07 due to the push to get Suite on HANA working. We need some new ones like CE_COUNTDISTINCT, CE_AVG and CE_WEIGHTEDAVG, as well as those functions being made available in the Analytic & Calc View modelers.

          Yes you’re right that it’s not uncommon to use SQLScript for complex reports, and if you know what you’re doing that’s all good. It’s just a bad place for new developers to start (hence this blog 🙂 )

          You can mix SQL and CE just fine, so long as you don’t move a lot of data around between CE and SQL. But, the statements aren’t optimized together into one Calc Plan.

          Yes, the Graphical Modeler is definitely more efficient than SQLScript in every case I’ve tried in SP7. You just have to look at the PlanViz to see.

          In SP7 the models already largely converged. In my testing, there is in most cases now no need for Attribute Views or Analytic views unless you want them for reusability or supportability. The Calc View modeler can in almost every case process a complex scenario against tables.

          The other major thing in SP7 is that raw SQL is in many cases 100x faster.

          Happy New Year!

          John

          (0) 
      2. Jody Hesch

        Hey John/Sean –

        Just FYI it looks like COUNT(DISTINCT()) is supported in Analytic Views in SP7 (see p. 54 of the latest HANA Modeling Guide – I’ll have to check this once the customer decides to upgrade. Also, it’s been supported for a while as a Counter defined in the output node of graphical Calculation Views.

        Also, with regards to execution of SQL vs CE functions vs graphical Calculation Views – it’s worth pointing out a few things, I think:

        1) Firstly, CE stands for “Calculation Engine”. 🙂   Before the graphical modeler was available, all we had to work with for CalcViews were CE functions.

        2) CE functions and graphical CalcViews both generate execution plans – and in fact, these plans are “instantiated” with specific optimizations depending on the client query. For example, only those fields that are requested are processed, meaning HANA can leverage field pruning and join pruning in many cases to remove unnecessary work when possible. (There’s a setting to change this to maintain more “relational” processing, I believe I recall Ravi pointing this out in his blog on Exception Aggregation, leveraging the “Keep Flag” feature).

        3) The nature of optimization differs between SQL-based SQLScript and Calc Engine operators (whether it be graphical or CE functions). As such, there is now an option (since SP5?) to have graphical CalcViews execute in the “SQL Engine”. (You’ll see this in the semantic node). Leveraging this option *may* give better performance, but may also give different results (since the CalcEngine itself does not operate in a strictly *relational* manner given optimizations like field pruning referenced above). 

        4) As John noted (I can’t get these darn hash tags to work), lots of functionality (and it seems optimizations) are falling by the wayside with respect to CE functions, given even more reason to stick with graphical CalcViews.

        5) Also, as John noted, the two optimizers don’t mix well regarding SQL and CE functions. There are times when mixing the two is required – but if it can be avoided, I’d recommend it, as performance can drastically suffer.

        Also, Sean (5 points to whoever can remedy my incompetence with hash tags) – while there isn’t a specific CE function for other set operations like MINUS and INTERSECT, there are clever ways to implement such functionality in Calculation Views. Here’s an example I did with CE functions that could more easily be done graphically.

        (0) 
        1. Jody Hesch

          Also, just to throw a bone out there –

          Folks curious about CalcEngine Execution Plans for Calculation Views may be interested in poking around System Views:

          M_CE_CALCSCENARIOS

          M_CE_CALCSCENARIOS_OVERVIEW

          M_CE_CALCVIEW_DEPENDENCIES

          M_CE_DEBUG_INFOS

          M_CE_DEBUG_JSONS

          M_CE_DEBUG_NODE_MAPPING

          M_CE_PLE_CALCSCENARIOS

          All of these are documented in the SAP HANA SQL and System Views Reference. For example, under M_CE_DEBUG_JSONS there’s a column “TYPE” that describes whether a particular scenario is “original, instantiated, optimized, extrace”.

          In theory, one is able to extract the CalcScenario JSON for a graphical CalcView, compare to a CalcView with CE functions, and find out where the differences. In practice, I have no idea how far one can go with this. I myself haven’t yet nerded out on the CalcView System Views above… 🙂  

          (0) 
        2. Sean Holland

          Thanks for the note back Jody.

          I took a quick look at the post on the different approaches for achieving MINUS set operator functionality and recognize that we implemented the CE left outer join in our sql script procedures, but thanks for showing the alternative and faster CE UNION & AGGREGATION approach.

          I’d always be interested in the internal workings on what engine is being hit and where the time is going for each approach, I know Time is not always the best indicator for performance or efficient execution and there is PlanViz & tracing tools available, we plan to get on SPS07 shortly, so didn’t want to spend too much time doing analysis on the older revision with all the changes/improvements that have been made in the latest release.

          Maybe there’s already a blog out there that has a more detailed explanation of the execution paths for similar sql/sql script/calc views? If not, any volunteers out there?

          Thanks again,

          Sean.

          (0) 
          1. Jody Hesch

            Hey Sean,

            Sure thing. Also, I’d highly recommend becoming familiar with VizPlan and other performance analysis tools and techniques (i.e. tracing) found in the Performance Guide at help.sap.com/hana_appliance. I’m sure they’ll continue to evolve but they won’t be going away. In addition to the official SAP guide you may also be interested in checking out blogs and documents put together here on SCN – for example, this one and this one by Lars Breddemann.

            (0) 
  3. Antony Jerald J

    Hi John,

    Thanks for sharing with us!!!  It’s really helpful.  Could you please give me clarify if you dont mind on the following?

    In hana what are all the tables are in column store and in row store?  If we talk about BW on hana, which tables(for eg; PSA, changelog tables, etc.,)leverages column store and row store features?

    I learned that in Hana most database operation corresponding to read will happen in column store and for tables which requires write, uses column store.  Correct me if I’m wrong.

    Kindly give me clarity on my above queries which makes me confused often :-(.

    Regards,

    Antony Jerald.

    (0) 
    1. John Appleby Post author

      For BW and Suite on HANA scenarios, configuration tables and qRFC queue tables are in the row store – just the same as I describe above. So tables like RS*, DD*. All master data and transactional data is columnar oriented.

      So PSA, DSO, Cube, Master Data – all column oriented.

      The choice of row store or column store has nothing to do with reads or writes, it has to do with the column store being inefficient for a few specific scenarios like configuration tables and queues.

      John

      (0) 
      1. Antony Jerald J

        Thanks a lot John 🙂

        So whether it’s a read or write operation it happens in column store except qRFC and configuration tables!!!!

        One more query:  How do we distinguish between row store tables and column store tables?  (that is, by looking at any table, how can we come to a conclusion that it’s a row store table or column store table?)

        Is there any specific settings which shows its a table of type column store or row store?  If so where can we find that in hana studio?

        Kindly suggest.

        Regards,

        Antony Jerald.

        (0) 
        1. John Appleby Post author

          You specify when you create the table

          CREATE TABLE

          CREATE COLUMN TABLE

          In HANA studio open the table definition. It says ROW or COLUMN on the top right

          (0) 
  4. Ashok Babu Kumili

    Hello John Appleby,

    Perfect document to understand the Pros and Cons of using variuos options in “SAP Hana Projects”.  Very nicely written articles. Thanks for the examples used.. I learnt the key views. It has been summarized very nicely

    (0) 
  5. Jelena Perfiljeva

    Thank you for sharing, John, really liked this summary. As the developers come from all kinds of backgrounds, it’s important to remember that some “development baggage” may need to be left behind. Could’ve never imagined life without secondary indexes. 🙂

    (0) 
  6. Justin Molenaur

    Cannot agree more with the new monitor enough…

    Also – I feel like the developer perspective is not widely known/explained to this point. I know when I went to a HANA code jam earlier this year it was shown to me and I felt like it was a foreign application! Now it makes a little more sense however.

    Regards,

    Justin

    (0) 
    1. John Appleby Post author

      Very true. In SP06 the developer perspective was also not well integrated. In SP07 it is 100x easier to use.

      They really need to get rid of the Modeler perspective and enforce use of the Developer perspective. I suspect this will happen through SP08 and SP09.

      (0) 
  7. Brenton O'Callaghan

    As usual – very good post John!

    I would add one more which is slightly more on the logistics perspective – keep your development environment up to date. The team are developing HANA studio constantly so any feedback, bugs or general improvements are released pretty regularly so its worth checking for a new version of the studio every now and again.

    (0) 
      1. John Appleby Post author

        Yes, you can use a newer Studio on an older database, provided they are the same SP level.

        This is documented somewhere, but I can’t find where!

        (0) 
        1. Justin Molenaur

          Confirmed on Johns comment, but I am also using Studio rev70 with HDB rev62 and it is working great even across SP levels. Previously was on Studio rev 68 against HDB 62 and it was fine as well.

          Additionally in the jump from 68 to 70 for Studio, I am seeing great performance and GUI enhancements.

          Regards,

          Justin

          (0) 
          1. John Appleby Post author

            Yes, you can indeed use Rev.70 against SP6 based systems. If you use it against SP5 based systems you will get problems. I use the rule of thumb of using the same SP.

            Indeed the Rev.70 Studio has a massive focus on performance especially for XS artifacts.

            (0) 
            1. Darryl Griffiths

              Is there a compatibility guide in the same way that SAP GUI is backwards compatible?

              Or are people setting themselves up for issues if they decide that it’s OK to run a later version, then “boom” it doesn’t work for a specific release.

              I’m only asking because I feel that this is a pretty complex piece of kit and if it breaks because of something that a newer Studio version has done, then SAP can simply say, “you shouldn’t be running that version of the Studio” and the developer(s) will be looking a little flushed.

              I feel it needs some rules regarding the compatibility.

              Something to help ensure that people get the best experience, an the safest use.

              Are SAP not concerned with this too?  Especially now that future ABAP development will be using a similar technological setup.

              (0) 
  8. sumit bisht

    Hi John!

    Thanks for providing the information.

    A word on eclipse studio – while working on Linux 64 bit, there is not much of a scarcity of screen estate and by reducing the font size, more information is easily displayable.

    Apart from occasional freezes, I found this platform more easy to install – considering the fact that both studio & client are built in as a single software.

    Cheers!

    (0) 
  9. Phani Rayasam

    Golden rules are really awesome.

    These are valuable  findings everyone keep in mind  for successful execution of the project.

    Thanks,

    Phani Rayasam

    (0) 
  10. Krishna Tangudu

    Hello John Appleby

    Regarding “Especially don’t use Dynamic SQL to generate column names for INSERT or SELECT statements! Instead, create an Information View for SELECT statements, or write a Python-based loader to load tables.

    We are using  cursor to get the column names dynamically using a table named “TABLE_COLUMNS” so that you need not mention all the column names but instead you can get the dynamically based on the input of “Table Name”.

    And then executing the statement using “EXEC”.

    Can you help me with an alternate way for this John as you are mentioning?

    Awaiting your response.

    Regards,

    Krishna Tangudu

    (0) 
    1. Krishna Tangudu

      Adding to the above comment:

      Have tried to avoid cursors by using Arrays as mentioned in the below blog:

      SAP HANA: Generic Procedure using Arrays to Load delta data into Tables

      But i was wondering as mentioned in this blog:

      Especially don’t use Dynamic SQL to generate column names for INSERT or SELECT statements! Instead, create an Information View for SELECT statements, or write a Python-based loader to load tables.


      Can you let me know what do you mean by creating information Views? Does it mean to create “Column Views” for all the different possible tables i have and call them when i need to frame the insert or select statement?


      Regards,

      Krishna Tangudu

      (0) 
  11. Andres Rodriguez Guapacha

    Hi. My background is mainly in ABAP, but I am learning HANA and BI concepts… now it seems that I have a mess in my head.

    If I am developing an application with both transactional (OLTP) and analytical (OLAP) parts, how do I model my database? I know I will use the column-store, but do I use an normalised model or a dimensional model? Or is the table model normalised but then the dimensional model is created using views?

    Hehehe.. thanks for any help.    

    (0) 
    1. John Appleby Post author

      That is an EXCELLENT question. You use an Entity Relationship Model. Each entity (customer, sales order header, sales order line item, product) should exist as a table. They can be inserted into as such, and each piece of master data should have a primary key.

      Then you build HANA Information Views on top, in which provide you join your entities into one unified virtual OLAP View.

      SAP River makes this very easy to achieve in a semi-automated fashion.

      (0) 
  12. Suseelan Hari

    Hi John,

    Oh is it! I happy to know that you are developing SAP HANA for 3 years. Thats great. If I have any SAP HANA queries definitely I will contact you for your help. You have mentioned about the best screen and it should be 27″ 1920×1080 screen like this one from HP. Whether you prefer any other screen other than HP? If yes, please let me know. I am raising this question to understand about that. Thanks for sharing 6 golden rules of SAP HANA.

    Regards,

    Hari Suseelan

    (0) 
    1. Ravindra Channe

      Hi Hari,

      I am fully assured that all your HANA queries will be addressed by John. He has great knowledge on HANA which is quite evident from his valuable contribution to this forum.

      Regarding the screen size, I think bigger the better. But if you cannot have an external desktop, then minimizing all other panes like Properties, Navigation pane, logs etc will help gain some space and keeping only the active working pane on the desktop will be useful.

      This is what I adopt as my work place has flexible seating and I would hate to carry external desktop everyday home after work and bring it back next day. 🙂

      Regards,

      Ravi

      (0) 
    2. John Appleby Post author

      I’m vendor-agnostic, actually I use a 27″ iMac, so any good screen will do. In the SAP office there are some very nice 27″ Fujitsu screens at hot desks.

      Ravi – I’m the same way, I use a 15″ MacBook Pro as my primary system and often minimize panes. Good pro tip!

      (0) 
  13. Ravindra Channe

    I’ve been developing code in SAP HANA for nearly 3 years now

    🙂 , give me an year and I will be there with you (provided you stop working on HANA right now. 😉 ).

    Very informative and as always, the discussion on the blog is equally valuable.

    Regards,

    Ravi

    (0) 
    1. John Appleby Post author

      Ha right, like I’m standing still 😉

      On a serious note I don’t pretend to be more knowledgeable than many of the awesome HANA people kicking around, especially the awesome HANA Distinguished Engineers.

      (0) 
    2. Mirthu Devaraj

      Hi Ravi,

          

           I am new to SAP Hana.. I understand in HANA there is no ISNUMERIC().. Could you please assist me with User defined  function that is euivalent to ISNUMERIC()

      Thanks

      Mirthu

      (0) 
      1. Gareth Ryan

        Hi Mirthu,

        You need to open a new discussion of your own, rather than asking questions here in the comments of a blog post.

        Cheers,

        Gareth.

        (0) 
  14. Manoj Kumar Reddy Akkamreddygari

    Hi John,

        Thanks for a very good information.

        I am working on migration of SQL database to SAP Hana. The insert, delete and update procedures are slow, may be 6 times slower than procedures written in T-SQL.The insert and delete procedures contains simple insert and delete statements. I have tried with many options like partioning of column tables, delta merge but nothing helped me to increase the performance of DML statements in SAP HANA. I am not able to identify where we have to make necessary changes so that we can increase performance in SAP HANA. Can you pleae help me out in this.

    Regards,

    Manoj

    (0) 
  15. Sergey Shablykin

    Hi, John! Thank you for sharing your experience!

    There is a note 0001719282 which recommend to use partitioned tables to control that partition size will not be more than 250 mln records.This is much more than you 100 mln recs.

    Only one question: does you rule about table size and joins is relevant for partitioned tables?

    best regards,

    Sergey

    (0) 
    1. John Appleby Post author

      This is a very application-specific note and honestly, I don’t fully agree with it 🙂

      Also I didn’t say partitions shouldn’t be >100m, but rather that you should avoid joining on more than 100m rows in one step. This means a results set of >100m rows that is materialized.

      More to the point, TLOG flattens the data in HANA so joins are not required. It is a very specific application, so please don’t take your design guidelines from it.

      (0) 
  16. Tulika Adhikary

    Hi John
      Its really a gud article on HANA performance. Need one small suggestion.. I am quit new to HANA and looking for some book oriented on HANA for ABAP technical consultants. Can you pls help me on this 

    (0) 
  17. Pedro Cardoso

    Great article John.  Glad I came across this – very interesting and valuable insights.


    On the row store vs. column store guidance, if I can validate with you one use case I had landed on that favored row-store tables…it seems like you made an inference that supports my thinking but I would like to make sure.


    I have a requirement where I will be staging a fairly significant amount of data (>300M row tables) in HANA and be using a series of complex Data Services data flows that will pull data in from multiple (column) store tables and subsequently perform a series of transformations (inserts/updates) into the staging area in order to build a number of target tables that will ultimately get copied into proper column-store tables.


    The thinking I had was that for the ‘staging area’, which is transient in nature, and where join/update activity is going to be intensive, that the row based storage would result in better performance, as I would be able to benefit from secondary indexes and the “updates” wouldn’t be as expensive in a row-based table compared to column-based.


    Thoughts? Also wondering what type of performance I should expect….am thinking it will be less than optimal (from a HANA perspective), but orders of magnitude faster than if this “data transformation” was occurring back in a traditional RDBMS like Oracle or MSSQL…


    Thanks in advance..


    Pedro

    (0) 
    1. John Appleby Post author

      Well this is easy to test both ways. Just change the DDL to remove COLUMN.

      But the column store should perform much better and use much less RAM. Try it out and see for yourself!

      (0) 
  18. Ning Tong

    Hello John,

    A wonderful guide also for HANA performance tuning.

    So at most case, there are only indexes created by HANA on primary keys of tables?

    Thanks & Best Regards,

    Tong Ning

    (0) 
    1. John Appleby Post author

      Yes, in most cases that’s right. Also there are helper indexes created on join predicates of information models.

      And finally, in OLAP scenarios, it can be useful to define secondary indexes on multiple columns. This provides a sorted index which allows faster location of individual records. But it does not help for aggregations.

      (0) 
  19. David Paul Dornseifer

    How true, agree on all of your points especially the cursor and performance related stuff. The sad thing is just that there is sometimes no alternative, e.g. if you have to run a proc over a dynamic result set. Hope this issue will be addressed by a little bit more complete array implementation in SP09

    (0) 
  20. P B

    Thanks John for sharing this useful info. Can you share some document on version control in HANA? We have env where multiple developers are working in same code. How can implement version control functionality using SAP HANA inbuilt feature so there is no conflict of code. Really appreciate your response on this.

    (0) 
  21. Markus Zoeller

    Hi John,

    do you have a recommendation or guideline how to identify potential unneeded “secondary indices” under HANA in order to improve performance and reduce data footprint ?

    Thanks and regards

    Markus

    (0) 
  22. Markus Zoeller

    Hi John,

    the link for “Advanced Data Modelling” does not work any more. Could you please sent the link or make an update on it ? I would be interested in this because in SAP Bank Analyzer we specially often face this issue.

    Thanks and regards

    Markus

    (0) 

Leave a Reply