Skip to Content

Hello Folks,

Problem Description:

To load data into SAP HANA using a generic Stored Procedure and handling delta.I can see multiple threads to load the data into tables in HANA and to avoid usage of cursors. Hence writing my inputs on the solution of this problem using “Arrays”.

Note: This blog is intended to address the above mentioned problem. I will be updating this document with additional work arounds on “real-time” issues in coming up weeks.

Scenarios:

Delta Type = FULL:

You might want to do a “Full” update . Means you want to delete the data from the target table and reload the entire data from the source.

Delta Type = DELTA:

You might want to load “Delta” depending on the primary keys of the table i.e Update if the record exists else insert the new record.

Also have a look on this blog which gives a overview on how this procedure can be called from HDBSQL

SAP HANA: Replicating Data into SAP HANA using HDBSQL

Here is the code:


CREATE PROCEDURE DATA_LOAD_USING_ARRAY(IN IN_SOURCE_SCHEMA VARCHAR(10) ,
     IN IN_SOURCE_TABLE VARCHAR(100) ,
     IN IN_TARGET_SCHEMA VARCHAR(10) ,
     IN IN_TARGET_TABLE VARCHAR(100),
     IN IN_DELTA_TYPE VARCHAR(10))
LANGUAGE SQLSCRIPT SQL SECURITY INVOKER AS   i INT;
BEGIN
   DECLARE STRINGARRAY  VARCHAR(20) ARRAY;
   DECLARE TEXT VARCHAR(100000);
   DECLARE rowcount integer;
   DECLARE primary_keys VARCHAR(5000) := '' ;
   DECLARE column_list VARCHAR(30000) := '' ;
   DECLARE DELETE_QUERY VARCHAR(30000) := '' ;
   DECLARE UPSERT_QUERY VARCHAR(30000) := '' ;
   DECLARE INSERT_QUERY VARCHAR(30000) := '' ;
    /* GETTING PRIMARY KEY COLUMNS OF THE TARGET TABLE*/
   pk_tab = SELECT column_name FROM CONSTRAINTS
WHERE table_name = UPPER(:IN_TARGET_TABLE)
AND schema_name = UPPER(:IN_TARGET_SCHEMA) AND is_primary_key = 'TRUE' ;
   STRINGARRAY := ARRAY_AGG(:pk_tab.column_name);
   SELECT count(*) into rowcount from :pk_tab; -- To know the count of columns
   TEXT := :STRINGARRAY[1];
   -- To form the primary key list
   FOR i IN 1 ..:rowcount-1
   DO
   TEXT := :TEXT || ',' || :STRINGARRAY[:i+1];
   END FOR;
   primary_keys := :TEXT;
  -- To handle the column names with "Spaces" in between
   SELECT '"' ||  REPLACE (:primary_keys,',','","') || '"' INTO primary_keys FROM DUMMY;
   /* GETTING COLUMNS OF THE TARGET TABLE*/
    col_tab = SELECT column_name FROM TABLE_COLUMNS
  WHERE table_name = UPPER(:IN_TARGET_TABLE)
  AND schema_name = UPPER(:IN_TARGET_SCHEMA) ;
   STRINGARRAY := ARRAY_AGG(:col_tab.column_name);
   SELECT count(*) into rowcount from :col_tab;
   TEXT := :STRINGARRAY[1];
    -- To form the column-list
   FOR i IN 1 ..:rowcount-1
   DO
   TEXT := :TEXT || ',' || :STRINGARRAY[:i+1];
   END FOR;
   column_list := :TEXT;
 -- To handle the column names with "Spaces" in between
   SELECT '"' ||  REPLACE (:column_list,',','","') || '"' INTO column_list FROM DUMMY;
   IF IN_DELTA_TYPE = 'FULL'
   THEN
   --To delete the data from the target table
   DELETE_QUERY :=
  'DELETE FROM "'||IN_TARGET_SCHEMA||'"."'||IN_TARGET_TABLE||'"' ;
   -- To insert the new load into target table
   INSERT_QUERY :=
  'INSERT INTO "'||IN_TARGET_SCHEMA||'"."'||IN_TARGET_TABLE||'" ('||:column_list||')
   SELECT '||:column_list||' FROM "'||IN_SOURCE_SCHEMA||'"."'||IN_SOURCE_TABLE||'"';
   EXEC (:DELETE_QUERY);
   EXEC (:INSERT_QUERY);
   ELSEIF IN_DELTA_TYPE = 'DELTA'
   THEN
   UPSERT_QUERY :=
  'UPSERT '||IN_TARGET_SCHEMA||'.'||IN_TARGET_TABLE||' SELECT '||:column_list||'
  FROM '||IN_SOURCE_SCHEMA||'.'||IN_SOURCE_TABLE||'';
   EXEC (:UPSERT_QUERY);
   END IF;
END;







Sample Call Statement:


CALL DATA_LOAD_USING_ARRAY ('KRISHNA','EMPLOYEE','SYSTEM','EMPLOYEE','FULL');





Error handling and maintaining data consistency:

Please find the details in my other blog on how can you achieve consistency along with error handling:

SAP HANA: Workaround for using Commit & Rollback in exception handling block of a Stored Procedure

For Feedback on whether the procedure is success or failure and on how to control the “Commits” you can the relevant information in the above mentioned blog.

Hence i didn’t add those details into this document to make it more specific to the title. ( if required based on the comments will add that too )

As mentioned at the start i will be updating the above code whenever possible.


In the mean time i would love to have your comments and if there is anyother better way of doing it.

Yours,

Krishna Tangudu

To report this post you need to login first.

24 Comments

You must be Logged on to comment or reply to a post.

  1. Akshay Nayak

    Hi Krishna,

    Wonderful blog. But i have some questions here,

    Why did you use ARRAY_AGG and UNNEST function to get the count of columns in line 20 and 21? You could have directly used,

    SELECT count(*) into rowcount from :pk_tab right?

    Thanks,

    Akshay

    (0) 
      1. Akshay Nayak

        Hi Krishna,

        One more question to ask here. I suppose we can also implement this by using cursors instead of arrays. Which one do you think will be optimal ?

        Thanks,

        Akshay

        (0) 
        1. Krishna Tangudu Post author

          Hi Akshay,

          Hmmm that is one question which i have in my mind. The focus of this document initially was to avoid cursors. Which was achieved using Arrays as mentioned above and even table variables.

          Now which is the best option (w.r.t performance ) and what has to be followed is what even am looking out for (hoping that some one would comment here with their experience )

          Regards,

          Krishna Tangudu

          (0) 
  2. Raj Kumar Salla

    Krishna,

    Not got the opportunity to test your code. I would like to know how will the performance for table having bulk data. Any add-on on this will be really useful. Keep up the good work.

    Regards

    Raj

    (0) 
    1. Krishna Tangudu Post author

      Hi Raj,

      Can you tell me on what volumes are speaking of to test for?  So that i will test accordingly.

      Have tested it with records upto 70 million records for one table.

      Regards,

      Krishna Tangudu

      (0) 
  3. Hartosh Singh Bugra

    Hi Krishna ,

       I am getting error at line Number 22. The Error is Feature is not supported in HANA.

       While debugging I have seen that there is nothing in STRINGARRAY.

    Regards,

    Hartosh

    (0) 
      1. miyan aleem

        Hi  krishna,

        I am  having one problem, In my use  case I have to write 3 conditioned query in procedure based on session_user. whenever I write the query in procedure and call in xsodata file I am getting the error like Invalid procedure or parameter list.

        Thanks

        (0) 
  4. Hartosh Singh Bugra

    CREATE PROCEDURE “DSS”.sortMergeJoin_Arrays() LANGUAGE SQLSCRIPT AS

    BEGIN

          declare rstart bigint;

          declare rstop bigint;

          declare lstart bigint;

          declare lstop bigint;

          declare STARTS_ARR BIGINT ARRAY;

          declare ENDS_ARR BIGINT ARRAY;

          declare START_ARR BIGINT ARRAY;

          declare STOP_ARR BIGINT ARRAY;

          declare rlen bigint;

          declare llen bigint;

          declare li bigint;

          declare ri bigint;

         

        

          rightTable = SELECT “STARTS” FROM “DSS”.REPEATMASK2;

          leftTable  = SELECT to_bigint(“START”) as “START” FROM “DSS”.CLINVAR2;

         

          STARTS_ARR := ARRAY_AGG(:rightTable.STARTS);

          START_ARR := ARRAY_AGG(:leftTable.”START”);

          select count(“STARTS”) into rlen from :rightTable;

          select count(“START”) into llen from :leftTable;

             res = select ” as lstart , ” as lstop from dummy where 1<>1;

             li := 0;

             ri := 0;

    LOOP

    if(:ri > :rlen ) OR ( :li > :llen )  then

    break;

    end if;

    lstart := :START_ARR[:li];

    rstart := :STARTS_ARR[:ri];

    if (:rstart <= :lstart) and (:rstop >= :lstart)

    then

    res = (select * from :res)  union (select :lstart as lstart , :lstop as lstop from dummy);

    li := :li + 1;

    continue;

    end if;

    if (:rstart > :lstart)

    then

    li := :li + 1;

    continue;

    end if;

    if (:rstop < :lstart)

    then

    ri := :ri + 1;

    end if;

     

          

            

    end LOOP;

    select * from :res;

    END;

    (0) 
    1. Bhupender Singh Rathee

        Hi Hartosh,

      I worked with your problem, I think Mistake which you was doing is that you are giving index to 0 in below statement

      rstart := :STARTS_ARR[:ri];

      Where as index start from 1 to 2^31.


      if there are chances that table does not contain data its better to handle that situation in try and catch.


      try changing index it will work.

      Hope this will help you



      Thanks

      Bhupender

      (0) 
  5. Hartosh Singh Bugra

    When I execute this call statement from SQL Script Console inside HANA Studio i get error Arithmetic exception at line lstart := :START_ARR[:li];  I think this is because START_ARR is empty or can not be accessed like this. i have seen there is no value in START_ARR in debugging mode.

    (0) 
  6. Nagababu Tubati

    Hi Krishna,

       good effort.

       I have a doubt here.

       What is the use of  “primary_keys” variable in your logic ?

        You maintained this variable but used nowhere in the rest of the code

    Thanks and Regards

    Nagababu

    (0) 
  7. pradeep T

    Hi Krishna ,

    Iam facing a typical performance related issue with using a cursor in a stored procedure .

    Well the business logic is such that it warrants the use of a curosor .

    I will try to explain the best i can of the issue and hoping that you can help me with some pointers .

    Business requirement is to :

    1) Compare table A which has expenses

    2) Compare table B which has sales

    3) Compare expense and sales belonging to similair batch and then generated few expenses

    ———–

    The logic is implemented as follows :

    Cursor on expenses

         cursor on sales

            process the expense + sales combination to generate additional expense records

            then insert data into a result table

        close cursor sales

    close cursor  expenses .

    ———–

    The above process is taking lot of time in the envionment to process even a set of 2k records .

    Any pointer on how to make the code work faster .

    Avoiding usage of cursors seems to be unavoidable at this point .

    Any pointers will help ?? .

    (0) 

Leave a Reply