I recently posted a workflow analysis done with SAP Lumira: Workflow analysis with SAP Lumira [Data Geek Challenge]

That analysis covered about 375,000 data rows.

So I decided to run the analysis on the complete dataset: about 21,000,000 (twenty-one million) rows.

The analysis was done on a Windows 7 PC with a Core i7-2600K (2nd gen), 8 GB RAM and an SSD.

W7HanaDev-2013-08-08-17-19-38_crop.png

The CSV file exported from HANA had a size of almost 3.5 GB!

W7HanaDev-2013-08-08-17-19-55_crop.png

Memory state after Windows start

W7HanaDev-2013-08-08-17-20-10.png

Memory state after Lumira load

W7HanaDev-2013-08-08-17-20-27.png

Memory state after analysis

W7HanaDev-2013-08-08-17-37-05.png

So, less than 2 GB of RAM was used to analyze a 3.5 GB CSV file. This suggests that a PC with 4 GB of RAM or more is enough to analyze datasets of this size.

Response times are acceptable, as you can see in the following screencast (unfortunately without sound).
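
As a rough back-of-envelope check, the figures quoted in this post can be turned into per-row numbers. The sketch below is only illustrative: it assumes decimal gigabytes, treats "less than 2 GB" as an upper bound of exactly 2 GB, and uses the rounded row count and file size from the text, so the results are approximations rather than measurements.

```python
# Back-of-envelope figures based on the approximate values quoted in the post.
CSV_SIZE_GB = 3.5        # "almost 3.5 GB" exported CSV file
ROW_COUNT = 21_000_000   # "about 21,000,000" rows
RAM_USED_GB = 2.0        # upper bound, the post says "less than 2 GB"

GB = 10**9  # assumption: decimal gigabytes


def bytes_per_row(size_gb: float, rows: int) -> float:
    """Average number of bytes per row for a given total size."""
    return size_gb * GB / rows


def ram_to_csv_ratio(ram_gb: float, csv_gb: float) -> float:
    """RAM used, expressed as a fraction of the CSV file size."""
    return ram_gb / csv_gb


csv_bytes_per_row = bytes_per_row(CSV_SIZE_GB, ROW_COUNT)
ram_bytes_per_row = bytes_per_row(RAM_USED_GB, ROW_COUNT)
ratio = ram_to_csv_ratio(RAM_USED_GB, CSV_SIZE_GB)

print(f"CSV bytes per row (approx.): {csv_bytes_per_row:.0f}")
print(f"RAM bytes per row (upper bound): {ram_bytes_per_row:.0f}")
print(f"RAM used as a share of CSV size (upper bound): {ratio:.0%}")
```

Under these assumptions, the memory used is a bit more than half the size of the CSV file on disk, which is consistent with the conclusion above.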

Best Regards,

K

P.S.: English is not my native language, and nobody is immune to mistakes and typos. If you find an error in the text, please let me know and I'll correct the post.

P.P.S.: If you have ideas on how to correct or improve this post, please don't hesitate to leave a comment.

3 Comments

  1. Henrique Pinto

    Why not use HANA Online in this case?

    I understand this was just a benchmark test; the 21 million rows would have been transferred anyway. It'd be interesting to know whether it would consume the same 2 GB of RAM in the HANA Online scenario.

    1. Konstantin Anikeev Post author

      Hi Henrique,

      thanks for your remark. I was expecting this kind of question, because it is reasonable to use HANA for this purpose if it is available.

      Please check my first post Workflow analysis with SAP Lumira [Data Geek Challenge].

      Unfortunately, I’m not able to transfer gigabytes of data via the cloud. The upload script for HANA was about 100 GB in my case. After data transformation, the CSV file was about 3.5 GB.

      The same analysis could be done directly in HANA, but we do not use it in our production environment. So I used a 30-day trial edition of HANA to locally convert the multidimensional data into a raw table and prepare it for analysis with SAP Lumira.

      Another point: sometimes it is enough to send anonymized data for analysis instead of providing access to live data.

      Best Regards

      Konstantin
