Based on revision 45 (SPS 5) of SAP HANA Studio and Database

One thing I found a bit annoying about HANA was that it was not possible to change table definitions later on as easily as I was used to from other DBMS.

The change I wanted to implement was to add a NOT NULL constraint to a table already containing data.

Up to Revision 41 (the last SPS 4 revision) this was what happened:

Revision 41 – you know, this old, outdated stuff… 😉

    drop table aaa;
    create column table aaa (cola nvarchar(30) not null ,
                             colb integer ,
                             flag varchar(1) default ' ');
    insert into aaa values ('A', NULL, NULL);
    insert into aaa values ('A', 10, NULL);
    select * from aaa;
    COLA    COLB    FLAG
    A       NULL    NULL
    A       10      NULL

We’ve got a table, we’ve got data in it and there are NULL values in our table.

Let’s do something about this!

    alter table "AAA" alter ("COLB" INTEGER NOT null);
    Could not execute 'alter table "AAA" alter ("COLB" INTEGER NOT null)' in 25 ms 571 µs Started: 2012-12-13 12:23:43.
    SAP DBTech JDBC: [7] (at 25):
    feature not supported: NULL value exists: COLB: line 1 col 26 (at pos 25) 

Fair enough, there are NULL values in there, so I have to get rid of them before enabling the new constraint.

Let’s do that:

    update aaa set colb=0 where colb is null;
    select * from aaa;
    COLA    COLB    FLAG
    A       0       NULL
    A       10      NULL

Now, let’s retry to enable the constraint:

    alter table "AAA" alter ("COLB" INTEGER NOT null);
    Could not execute 'alter table "AAA" alter ("COLB" INTEGER NOT null)' in 29 ms 68 µs Started: 2012-12-13 12:24:39.
    SAP DBTech JDBC: [7]: feature not supported: table must be empty to add NOT NULL column

“Feature not supported” Bummer!

Of course, at this point I could still create a new table with the constraint in place, copy over the data and then drop the original table.

But who on earth would want that?

Not me (so I opened a support message, asking for this feature to become supported…)!
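For the record, that pre-SPS 5 create-copy-swap dance looks roughly like this. SQLite happens to share the same restriction (a populated column cannot simply be altered to NOT NULL), so here is a minimal sketch using Python’s sqlite3 module — table and column names are taken from the example above, everything else is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE aaa (cola TEXT NOT NULL, colb INTEGER, "
            "flag TEXT DEFAULT ' ')")
cur.executemany("INSERT INTO aaa VALUES (?, ?, ?)",
                [("A", None, None), ("A", 10, None)])

# get rid of the NULLs first, just like in the HANA example
cur.execute("UPDATE aaa SET colb = 0 WHERE colb IS NULL")

# the classic workaround: new table with the constraint, copy, swap
cur.execute("CREATE TABLE aaa_new (cola TEXT NOT NULL, colb INTEGER NOT NULL, "
            "flag TEXT DEFAULT ' ')")
cur.execute("INSERT INTO aaa_new SELECT cola, colb, flag FROM aaa")
cur.execute("DROP TABLE aaa")
cur.execute("ALTER TABLE aaa_new RENAME TO aaa")
conn.commit()

rows = cur.execute("SELECT * FROM aaa ORDER BY colb").fetchall()
print(rows)  # [('A', 0, None), ('A', 10, None)]
```

All the usual pain applies: the swap invalidates dependent views and privileges, which is exactly why nobody wants to do this by hand.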

Now we’re on SPS 5 (revision 45), everything is shiny and new, right?

Let’s see what happened to my request…

    drop table aaa;
    create column table aaa (cola nvarchar(30) not null ,
                             colb integer ,
                             flag varchar(1) default ' ');
    insert into aaa values ('A', NULL, NULL);
    insert into aaa values ('A', 10, NULL);
    select * from aaa;
     COLA    COLB    FLAG
    A       NULL    NULL
    A       10      NULL
    alter table "AAA" alter ("COLB" INTEGER NOT null);
    Could not execute 'alter table "LARS"."AAA" alter ("COLB" INTEGER not null)' in 44 ms 528 µs Started: 2012-12-13 12:27:17.
    SAP DBTech JDBC: [7]: feature not supported: NULL value exists: COLB: line 1 col 33 (at pos 32)

Up to here no changes (good thing!) …

    update aaa set colb=0 where colb is null;
    select * from aaa;
    COLA    COLB    FLAG
    A       10      NULL
    A       0       NULL
    alter table "LARS"."AAA" alter ("COLB" INTEGER not null);
    Statement 'alter table "AAA" alter ("COLB" INTEGER NOT null)' successfully executed in 2.268 seconds
    Started: 2012-12-13 12:28:12 (server processing time: 2.245 seconds) - Rows Affected: 0

This is A-mazing, isn’t it?

Really nice to have, especially if you finally decide to clean up your data model and put all the constraints in place that are required to keep your data as clean as possible.

One more thing on NULLs…

One thing that comes into play when you actually DO allow NULLs in your database: how do you SORT the NULL values?

If your table contains NULLs in a column and you want to sort by this column, where would you want to see these NULLs?

At the end of the list? At the top?

Up to SPS4 there was just the default sort order available:

NULLs were dealt as if they were smaller than the smallest value of the column.

So ‘naturally’ the place of the NULLs would be the top of the list (again, default order being ASCending).

While this is usually OK-ish if you know about it, there are for sure as many reasons to put NULLs at the end of the list as there are to put them at the top.

Starting with SPS5 you have the choice (see the docs here):

    insert into aaa values ('C', 5, 'F');
    insert into aaa values ('C', 5, NULL);
    select * from aaa;
    COLA    COLB    FLAG
    A       10      NULL
    A       0       NULL
    B       10      NULL
    C       5       F  
    C       5       NULL

NULL first please:

    select cola, colb, flag from aaa
    order by flag NULLS FIRST;
    COLA    COLB    FLAG
    A       10      NULL
    A       0       NULL
    B       10      NULL
    C       5       NULL
    C       5       F   

The other way ’round:

    select cola, colb, flag from aaa
    order by flag NULLS LAST;
    COLA    COLB    FLAG
    C       5       F  
    A       10      NULL
    A       0       NULL
    B       10      NULL
    C       5       NULL  
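By the way, the same NULLS FIRST/LAST clause has found its way into other engines as well. Here is a small sketch in Python’s sqlite3 (SQLite ≥ 3.30 supports the clause) that also shows a portable emulation for engines without it — sort on an “is this NULL?” key first, then on the column itself:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE aaa (cola TEXT, colb INTEGER, flag TEXT)")
cur.executemany("INSERT INTO aaa VALUES (?, ?, ?)",
                [("A", 10, None), ("A", 0, None), ("B", 10, None),
                 ("C", 5, "F"), ("C", 5, None)])

# native syntax, same as in HANA SPS 5
native = [r[2] for r in cur.execute(
    "SELECT cola, colb, flag FROM aaa ORDER BY flag NULLS LAST")]

# portable emulation: (flag IS NULL) is 0 for values and 1 for NULLs,
# so sorting on it first pushes all NULL rows to the end
emulated = [r[2] for r in cur.execute(
    "SELECT cola, colb, flag FROM aaa ORDER BY (flag IS NULL), flag")]

print(native)  # ['F', None, None, None, None]
```

The emulation trick works on practically every engine, because the NULL-ness of a value is itself a perfectly sortable expression.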

Attention though when counting/summing/aggregating!

    select cola, count(flag) from aaa
    group by cola
    order by count(flag) NULLS LAST;
    COLA    COUNT(FLAG)
    A       0            <<< NULLs are not counted, so COUNT(FLAG) returns 0
    B       0         
    C       1         
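The reason for the zeros is that COUNT(column) simply skips NULLs, while COUNT(*) counts every row — standard SQL behavior, not a HANA specialty. A quick sketch in Python’s sqlite3 with the same toy data makes the difference visible:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE aaa (cola TEXT, colb INTEGER, flag TEXT)")
cur.executemany("INSERT INTO aaa VALUES (?, ?, ?)",
                [("A", 10, None), ("A", 0, None), ("B", 10, None),
                 ("C", 5, "F"), ("C", 5, None)])

# COUNT(flag) ignores NULLs entirely; COUNT(*) counts every row
rows = cur.execute("""
    SELECT cola, COUNT(flag), COUNT(*)
      FROM aaa
     GROUP BY cola
     ORDER BY cola""").fetchall()
print(rows)  # [('A', 0, 2), ('B', 0, 1), ('C', 1, 2)]
```

So if you actually want to count rows per group regardless of NULLs, COUNT(*) is the one to use.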

Once again, that’s all folks!

Cheers,

Lars


4 Comments


  1. Stefan Koehler

    Wooohoo – what a killer feature 😛 (sorry, I could not resist)

    The more interesting question is .. how is this implemented? In a table with only one data set it’s pretty fast of course, but does it store the information in the DDIC and update it when required, or does the DDL run through the whole table?

    I assume that the DBSL behaves the same way on HANA as on every other RDBMS with a default value and a NOT NULL constraint, right? .. so in reality it does not only add a NOT NULL constraint ..

    Regards

    Stefan

    P.S.: Lars .. you know how the big guys do it 😉

    1. Lars Breddemann Post author

      Hi Stefan,

      glad to see that you’re still peeking over the high red fence 😉 .

      To be honest with you: I didn’t have the chance yet to try out the DB utility in NetWeaver on HANA.

      But I would agree with your guess … (OK, this one goes to my TOCHECK list…).

      Concerning the runtime on larger datasets:

      “It depends” is again the answer.

      As long as we’re working on a column table AND this column table is completely merged, then the check for NULLs is just a value dictionary operation. Very quick, independent of the actual size of the table.

      Once there is data in the delta log, things become more complicated, but also there we’ve got a dictionary to check for NULLs, so nothing too critical there either.

      If I think about it, I’d guess that the largest effort when dealing with a huge delta log will go into figuring out, whether the rows that actually contain NULL values are still valid…

      Having written that, I did not check how this behaves on history tables.

      Based on what I currently know, it could be that the table contents at point in time A do not conform to the constraints introduced at time B (A&lt;B), which of course is completely OK. However, how this is handled by HANA is something I can’t say right now.

      (Do you know how recently added NOT NULL constraints affect optimizer assumptions when performing flashback queries in your favorite DBMS?)

      Finally: if the table is a rowstore table a full scan of the table is required to check for the NULL values when switching from NULLABLE to NOT NULL. No surprise here, but important to know if the table is rather large.
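      To illustrate the dictionary argument with a toy model (plain Python, not HANA code — all names are made up): in a dictionary-encoded column, each row stores only an integer id into a dictionary of distinct values, so the question “does this column contain a NULL anywhere?” only needs a look at the dictionary, never at the individual rows:

```python
# Toy model of a dictionary-encoded column (illustrative, not HANA code).
class DictEncodedColumn:
    def __init__(self, values):
        # dictionary of distinct values; NULL gets an entry only if present
        distinct = sorted({v for v in values if v is not None})
        if any(v is None for v in values):
            distinct = [None] + distinct
        self.dictionary = distinct
        # each row stores only a small integer id into the dictionary
        lookup = {v: i for i, v in enumerate(distinct)}
        self.value_ids = [lookup[v] for v in values]

    def contains_null(self):
        # cost depends on the dictionary size, not on the row count
        return None in self.dictionary

col = DictEncodedColumn(["A", None, "A", "B"] * 100_000)
print(col.contains_null())  # True
```

      This is of course only the fully merged case; as soon as a delta log is involved, additional structures have to be consulted, as described above.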

      cheers, Lars

      1. Stefan Koehler

        Hi Lars,

        > Do you know how recently added NOT NULL constraints affect optimizer assumptions when performing flashback queries in your favorite DBMS?

        Yes – in no way at all, because the flashback query aborts with “ORA-01466: unable to read data – table definition has changed”.

        create table CTEST (a number, b number);

        begin
          for i in 1 .. 100000 loop
            insert into CTEST values (i,i);
          end loop;
          commit;
        end;
        /

        exec DBMS_STATS.GATHER_TABLE_STATS('SYS', 'CTEST');

        select b from CTEST where a IS NULL;

        ----------------------------------------------------------------------------
        | Id  | Operation         | Name  | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
        ----------------------------------------------------------------------------
        |   0 | SELECT STATEMENT  |       |        |       |    59 (100)|          |
        |*  1 |  TABLE ACCESS FULL| CTEST |      1 |     10|    59   (2)| 00:00:01 |
        ----------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           1 - filter("A" IS NULL)

        select sysdate from dual;

        SYSDATE
        -------------------
        17.12.2012 12:04:44

        alter table CTEST modify A NOT NULL;

        select b from CTEST where a IS NULL;

        -----------------------------------------------------------------------------
        | Id  | Operation          | Name  | E-Rows |E-Bytes| Cost (%CPU)| E-Time   |
        -----------------------------------------------------------------------------
        |   0 | SELECT STATEMENT   |       |        |       |     1 (100)|          |
        |*  1 |  FILTER            |       |        |       |            |          |
        |   2 |   TABLE ACCESS FULL| CTEST |   100K |   976K|    59   (2)| 00:00:01 |
        -----------------------------------------------------------------------------

        Predicate Information (identified by operation id):
        ---------------------------------------------------
           1 - filter(NULL IS NOT NULL)

        select * from CTEST as of timestamp
          to_timestamp('17.12.2012 12:04:44', 'DD.MM.YYYY HH:MI:SS')
        where a IS NULL;

        ERROR at line 1:
        ORA-01466: unable to read data - table definition has changed

        That’s another aspect of my job … spotting the “enemy” from time to time 😉 .. but I return to the big red castle (in my Iron Man suit) now 😛

        SAP should get such a great marketing figure as well 😉

        Regards

        Stefan

        1. Lars Breddemann Post author

          Thanks Stefan for checking this.

          (I was too lazy to dig out my old Oracle instance to do it myself… ).

          Meanwhile I checked how HANA handles this: it just lets the data pass as it was created in the first place… OK, let’s not get into this discussion *g*

          cheers, Lars

