Skip to Content

Just the other day, one of our developers was having problems getting our California AWS instance back up and running. The day before, we started moving our DW flat files to the instance using the SFTP plug-in. Needless to say, the stock AWS 120 GB drive configuration wasn’t large enough to accommodation all the data files and the HANA database. What made this “strange” is that we have another instance in Oregon with the same flat files and database.

What’s the difference you ask – simple – with our California instance, the tables are row based. In Oregon, the tables are column based! A quick check of the tables showed that we were getting a 12.5 times compression with the column tables – that makes a big difference. We were trying to evaluate query performance of row vs column based tables in similar environments – that will be a topic for another blog.

Unfortunately, our shop is mainly a Windows Server shop and we’re not used to the Linux command line tools. We were also in a state where the HANA instance would not load and HANA Studio could only run in diagnostics mode – that is – you can’t see the alerts.

After some digging around – we learned about the df -lh command. Sure enough the /dev/md0 device with houses the /sap home directory was 100% full. After deleting our flat files – we got back down to 74% usage, so after stopping and starting the HANA instance using Linux – see http://scn.sap.com/community/developer-center/hana/blog/2012/11/26/stop-and-start-rules-for-hana-on-aws everything started up great.

In HANA Studio, we brought up the Administration page and sure enough in the Alerts page, there was the HIGH priority disk full alert. After following the steps in the Alert Details dialog and then going back to the Over view page and clicking on the Disk Full Events link and clearing the events, our instance is in a happy state.

We then went back to the Administration guide http://help.sap.com/hana/hana_admin_en.pdf and learned how to setup alerts – starting at page 70.

Lessons learned:

1. ROW tables don’t have any compression and COLUMN tables do – 12.5x with one of our sample data sets.

2. Alerts do matter – pay attention to them. Next trick is to figure out how to configure HANA and AWS to send emails.

3. Learn the basic Linux commands – that will be another blog post too 🙂

If you like this post – please rate it 🙂

You can follow me on twitter at http://twitter.com/billramo

Regards,

Bill Ramos

Data Architect, Advaiya Inc.

To report this post you need to login first.

6 Comments

You must be Logged on to comment or reply to a post.

    1. Bill Ramos Post author

      Hi Vladislav,

          I’m grad you enjoyed the post. The simple answer to the conversion is – use the ALTER TABLE syntax – http://help.sap.com/hana/html/sql_alter_table.html with the <table_conversion_clause>. For example:

      ALTER TABLE orders

          ALTER TYPE COLUMN;

      Here is the full list of options from the help topic

      <table_conversion_clause>

       <table_conversion_clause> ::= [ALTER TYPE] {ROW [THREADS <number_of_threads>] | COLUMN [THREADS <number_of_threads> [BATCH <batch_size>]]} 

      Converts the table storage from ROW to COLUMN or from COLUMN to ROW.

       ROW 

      Converts the table to ROW storage.

       COLUMN 

      Converts the table to COUMN storage.

       THREADS <number_of_threads> <number_of_threads> ::= <unsigned_integer> 

      Specifies how many parallel execution threads should be used for the table conversion. The optimal value for the number of threads is the number of available CPU cores. If THREADS is not provided the default value of the number of CPU cores specified in the indexserver.ini file will be used.

       BATCH <batch_size> <batch_size> ::= <unsigned_integer> 

      Specifies the number of rows to be inserted in a batch. If BATCH is not specified the default value of 2,000,000 will be used. Inserts into column tables will be immediately committed after every <batch_size> records have been inserted. BATCH option can be used only when a table is converted from ROW to COLUMN storage.

      What you really want to know is what types of entities are better implemented as ROW tables versus COLUMN tables. That’s for another blog – stay tuned 🙂

      Regards,

      Bill

      (0) 
  1. Arne Weitzel

    Very intersting about the compression rate. To my understanding row store should be used primarily for OLTP (let’s say ERP and CRM apps) whereas the column store is for BI apps.

    Thanks for the blog!

    (0) 
    1. Bill Ramos Post author

      Hi Arne,

          The determination of using Row versus Column tables is not well understood by the community yet and it will take all of us to pitch in to record our findings. the general rule that I’ve followed with other database technologies like SQL Server and Oracle is to use column tables for the fact tables in your star-schema and row tables for dimension tables. The problem with this approach is that there is a performance penalty going across engines according to Dr Michael Ruter’s May 2012 presentation “Performance Guidelines for Applications using the SAP HANA Database” available in the http://www.experiencesaphana.com/servlet/JiveServlet/downloadBody/2171-102-4-4396/SAPHANA_TechnicalDocumentsV2.zip resource zip file.

          You need to even be aware of which functions are supported within the same engine. One example for a column table query is to not use an implicit conversion. Here is the quoted example from Chaim Bendelac’s may 2012 presentation in the “1000 – Perfromance – Introduction.pdf” located in the above ZIP

      The query

      SELECT * FROM T WHERE date_string < CURRENT_DATE

      is slower than:

      SELECT * FROM T WHERE date_string < TO_CHAR(CURRENT_DATE, ‘YYYYMMDD’)

      … because implicit conversion of date to character is currently not yet handled by the column engine.

      Regards,

      Bill

      (0) 
      1. Arne Weitzel

        Hi Bill

        sorry if my comment comment came across a bit like I-know-it-all. That wasn’t my intention.

        I have experience with Sybase IQ, but so far I didn’t have the opportunity to work with Hana. So I really appreciate your detailed findings about performance and functions supported in the column store. Thanks for all the info!

        Arne

        (0) 
        1. Bill Ramos Post author

          Hi Arne,

              No problem. I certainly don’t have the answers either 🙂 Fact of the matter is – we need to share our results and design findings for the community to benefit. It’s why I try hard to make sure I have references or first hand data to back up my posts.

          Regards,

          Bill

          (0) 

Leave a Reply