Technology Blogs by SAP
Learn how to extend and personalize SAP applications. Follow the SAP technology blog for insights into SAP BTP, ABAP, SAP Analytics Cloud, SAP HANA, and more.
markmumy
Advisor
Are you looking for a programmatic way to monitor the size of your data lake objects?  How about viewing the row counts of those objects as well?

As you will have seen from my previous blog posts about the SAP HANA Data Lake, it is based on SAP IQ (formerly Sybase IQ) technology.  Some of the monitoring that we would have done in years past in an SAP IQ landscape can be ported to work with the SAP HANA Data Lake.

The basic premise is simple: develop code that runs inside the data lake to capture the proper statistics, then expose that code via a stored procedure.  There are, or will be once the next evolution of the data lake is released (Q1 2021, per the roadmap), two ways to interface with the data lake.  Since Q1 2020, the data lake has been exposed through SAP HANA Cloud.  In the next generation of the data lake, a native SQL endpoint, or interface, will also be exposed.  This blog will cover both interfaces.

First, the code.  As mentioned, we must develop code which captures the proper statistics.  In this case, we want to capture the size of each table as well as the number of rows in the table.  This can be done via a SQL stored procedure that is created in the data lake.

If you will be exposing the data lake via SAP HANA Cloud, use this version of the code.  It wraps the statements in the proper syntax so that HANA can send them to the data lake for creation.
CALL SYSRDL#CG.REMOTE_EXECUTE('
drop procedure if exists sp_iqsizes;

create procedure sp_iqsizes(
    in_tbl varchar(255) default ''%'',   -- table name filter (LIKE pattern)
    in_own varchar(255) default ''%'',   -- owner name filter (LIKE pattern)
    in_size varchar(2) default ''kb''    -- unit: kb, mb, gb, or tb
)
begin
    declare local temporary table size_res (
        table_owner varchar(128)
        , table_name varchar(128)
        , size double
        , rowcount unsigned bigint
    ) in SYSTEM;
    declare sizeKB double;
    declare rc unsigned bigint;
    declare blksz unsigned int;
    declare size_factor double;
    declare size_name varchar(15);

    -- block_size/512/2 converts the IQ page size to KB per block
    select first block_size/512/2 into blksz from SYS.SYSIQINFO;

    if lower( in_size ) = ''tb'' then
        set size_factor = 1024*1024*1024;
    elseif lower( in_size ) = ''gb'' then
        set size_factor = 1024*1024;
    elseif lower( in_size ) = ''mb'' then
        set size_factor = 1024;
    elseif lower( in_size ) = ''kb'' then
        set size_factor = 1;
    else
        message ''in_size must be kb, mb, gb, or tb'' to client;
        return;   -- stop here rather than compute sizes with a NULL factor
    end if;
    set size_name = ''size in '' + upper( in_size );

    -- loop over all user-owned IQ base tables that match the filters
    FOR FORLOOP as FORCRSR CURSOR FOR
        select table_name, user_name table_owner
        from sys.systable, sys.sysuser
        where creator <> 3 and server_type = ''IQ''
        and creator = user_id
        and table_type <> ''PARTITION''
        and table_type <> ''MAT VIEW''
        and lower( suser_name( creator ) ) like lower( in_own )
        and lower( table_name ) like lower( in_tbl )
    do
        set sizeKB = 0;
        execute immediate ''select convert(double, NBlocks * blksz) / convert( double, size_factor ) into sizeKB from sp_iqtablesize(''''"''||table_owner||''"."''||table_name||''"'''');'';
        execute immediate ''select count(*) into rc from "''||table_owner||''"."''||table_name||''";'';
        execute immediate ''insert into size_res select "table_owner", "table_name", sizeKB, rc'';
    end for;

    message '''' to client;
    message ''***** HDLRE Server: '' + @@servername to client;
    message ''***** Owner filter: '' + in_own to client;
    message ''***** Object filter: '' + in_tbl to client;
    message ''***** Size Factor: '' + in_size to client;
    message '''' to client;
    execute immediate with result set on ''select table_owner, table_name, convert( varchar(60), convert( numeric(20,4), size) ) as '''''' + size_name + '''''', commas_int( rowcount ) as RowCount from size_res order by 1,2'';
end;

drop view if exists V_HDL_ALL_SIZES;
create view V_HDL_ALL_SIZES as select * from sp_iqsizes();
');

 

If you will be using the SQL endpoint for the HANA Data Lake, use this version of the code.  The only difference is that the REMOTE_EXECUTE wrapper, and the single-quote escaping it requires (each single quote doubled), are removed.
drop procedure if exists sp_iqsizes;

create procedure sp_iqsizes(
    in_tbl varchar(255) default '%',   -- table name filter (LIKE pattern)
    in_own varchar(255) default '%',   -- owner name filter (LIKE pattern)
    in_size varchar(2) default 'kb'    -- unit: kb, mb, gb, or tb
)
begin
    declare local temporary table size_res (
        table_owner varchar(128)
        , table_name varchar(128)
        , size double
        , rowcount unsigned bigint
    ) in SYSTEM;
    declare sizeKB double;
    declare rc unsigned bigint;
    declare blksz unsigned int;
    declare size_factor double;
    declare size_name varchar(15);

    -- block_size/512/2 converts the IQ page size to KB per block
    select first block_size/512/2 into blksz from SYS.SYSIQINFO;

    if lower( in_size ) = 'tb' then
        set size_factor = 1024*1024*1024;
    elseif lower( in_size ) = 'gb' then
        set size_factor = 1024*1024;
    elseif lower( in_size ) = 'mb' then
        set size_factor = 1024;
    elseif lower( in_size ) = 'kb' then
        set size_factor = 1;
    else
        message 'in_size must be kb, mb, gb, or tb' to client;
        return;   -- stop here rather than compute sizes with a NULL factor
    end if;
    set size_name = 'size in ' + upper( in_size );

    -- loop over all user-owned IQ base tables that match the filters
    FOR FORLOOP as FORCRSR CURSOR FOR
        select table_name, user_name table_owner
        from sys.systable, sys.sysuser
        where creator <> 3 and server_type = 'IQ'
        and creator = user_id
        and table_type <> 'PARTITION'
        and table_type <> 'MAT VIEW'
        and lower( suser_name( creator ) ) like lower( in_own )
        and lower( table_name ) like lower( in_tbl )
    do
        set sizeKB = 0;
        execute immediate 'select convert(double, NBlocks * blksz) / convert( double, size_factor ) into sizeKB from sp_iqtablesize(''"'||table_owner||'"."'||table_name||'"'');';
        execute immediate 'select count(*) into rc from "'||table_owner||'"."'||table_name||'";';
        execute immediate 'insert into size_res select "table_owner", "table_name", sizeKB, rc';
    end for;

    message '' to client;
    message '***** HDLRE Server: ' + @@servername to client;
    message '***** Owner filter: ' + in_own to client;
    message '***** Object filter: ' + in_tbl to client;
    message '***** Size Factor: ' + in_size to client;
    message '' to client;
    execute immediate with result set on 'select table_owner, table_name, convert( varchar(60), convert( numeric(20,4), size) ) as ''' + size_name + ''', commas_int( rowcount ) as RowCount from size_res order by 1,2';
end;

 

Regardless of which code path you used, the data lake now has a procedure called sp_iqsizes(); the HANA Cloud variant also creates a view called V_HDL_ALL_SIZES.  sp_iqsizes can be called without parameters, in which case it collects the size in KB (kilobytes) and the row count for every non-system table in the data lake.  Optionally, it accepts a table name filter, an owner filter, and a size unit (kb, mb, gb, or tb).

The view is needed to expose the code logic (the stored procedure) to a remote system, SAP HANA Cloud.
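To make the optional parameters concrete, here are a few illustrative calls for the SQL endpoint; the owner name MYUSER and the SALES% pattern are placeholders, not objects from my test system:

```sql
-- Defaults: every non-system table, sizes in KB
call sp_iqsizes();

-- All tables owned by one user (MYUSER is a placeholder), sizes in MB
call sp_iqsizes('%', 'MYUSER', 'mb');

-- Tables whose names start with SALES, any owner, sizes in GB
call sp_iqsizes('SALES%', '%', 'gb');
```

Note that the view wraps only the default (no-parameter) call, so through HANA Cloud you always get the KB figures; filtering can then be done on the HANA side.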

If you wish to expose the data lake procedure and view to SAP HANA Cloud, this code can be used to do just that:
drop table HDL_ALL_SIZES;

CREATE VIRTUAL TABLE "HDL_ALL_SIZES"
AT "SYSRDL#CG_SOURCE"."<NULL>"."SYSRDL#CG"."V_HDL_ALL_SIZES";

At this point, we have created the necessary data lake objects and mapped them through HANA Cloud for programmatic use.  To access the data from within HANA Cloud, you can run this command:
select * from HDL_ALL_SIZES;
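Because HDL_ALL_SIZES looks like an ordinary table to HANA, the result can be filtered and sorted like any other.  A small sketch, assuming the default column aliases produced by sp_iqsizes() (quote them exactly as your catalog reports them):

```sql
-- Largest data lake objects first; "size in KB" is the alias produced
-- by the default call and arrives as a string, so convert before sorting
select "table_owner", "table_name", "size in KB", "RowCount"
from HDL_ALL_SIZES
order by to_decimal("size in KB") desc;
```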

 

If you are using the SQL endpoint to the HANA Cloud data lake, you can run this command directly in the data lake:
call sp_iqsizes();

 

Regardless of which path and command you execute, the output is the same.  I ran this via the HANA Cloud Database Explorer and got the following output on the test system:

[Image: HANA Cloud DBExplorer output]
I hope this helps you monitor the space consumption of your data lake objects programmatically.  As an added benefit, if you are an existing SAP IQ user, the stored procedure will also work in an on-premise deployment of SAP IQ 16.