SAP HANA SPS 09 – SERIES DATA with My Exercises
CONTENTS:
SERIES DATA TABLE CREATION and GENERATE TIMESTAMP DATA
SERIES DATA TABLE vs REGULAR TABLE – STORAGE and COMPRESSION
EXPLORING SERIES DATA BUILT-IN FUNCTIONS
PREREQUISITES:
- Series Data Column Table “STOCKS_DATA_SERIES” has to be present in Schema “SERIES_DATA“.
[CSV Files (STOCKS_DATA_SERIES.csv) is attached in this Post,
Using Flat File Import Create tables “STOCKS_DATA_SERIES” in schema “SERIES_DATA” in your landscape.]
Exercise 1: Create & Compare the Series table with Column table
Explanation |
Examples / Screenshot |
Step 1:
Creating Regular column Table and Series Data Table. |
Create column Table “SERIES_DATA”.MyTab (key int, ts timestamp, value int); Create column Table “SERIES_DATA”.MyTabSeries (key int, ts timestamp, value int) Series (series key(key) period for series(ts,null) equidistant increment by interval 60 second); |
Step 2:
Inserting Data to Regular column Table and Series Data Table using “SERIES_GENERATE_TIMESTAMP” function. |
Insert into “SERIES_DATA”.“MYTAB” select 1, GENERATED_PERIOD_START, 7 from SERIES_GENERATE_TIMESTAMP (‘INTERVAL 60 SECOND’, ‘2010-01-01’, ‘2011-01-01’, null, null, null); Insert into “SERIES_DATA”.“MYTABSERIES” select 1, GENERATED_PERIOD_START, 7 from SERIES_GENERATE_TIMESTAMP (‘INTERVAL 60 SECOND’, ‘2010-01-01’, ‘2011-01-01’, null, null, null);
|
Step 3:
Execute Merge Delta Operations for both the tables.
|
merge delta of “SERIES_DATA”.“MYTAB”; update “SERIES_DATA”.“MYTAB” with parameters (‘optimize_compression’ = ‘force’);
merge delta of “SERIES_DATA”.“MYTABSERIES”; update “SERIES_DATA”.“MYTABSERIES” with parameters (‘optimize_compression’ = ‘force’); |
Step 4:
Verifying Series Table Storage and Comparing Memory size and Compression with Regular Column table.
|
Select Table_name, column_name, memory_size_in_total, sum(memory_size_in_total) over (partition by table_name) as tab_memory_size, uncompressed_size, sum(uncompressed_size) over (partition by table_name) as tab_uncompressed_size, compression_ratio_in_percentage as ratio, compression_type, “COUNT”, distinct_count from m_cs_columns where table_name in (‘MYTAB’, ‘MYTABSERIES’)
|
Verify both normal column table and series Table.
Normal Column table –> TS Column memory size –> 5 MB
Series Table –> TS Column memory size –> 10 KB
Now You can understand How efficiently Series Table stores the data for Time values.
Exercise 2: Series Table Built-in and Analytic Functions:
Explanation |
Examples / Screenshot |
Step 1:
Check the data Preview of Series Data Table “STOCKS_DATA_SERIES” |
STOCKS_DATA_SERIES (SERIES TABLE): The table (“SERIES_DATA”.”STOCKS_DATA_SERIES”) is having Stock Market data with values (TICKER_ID, TICKER_DESCRIPTION, DATE, OPEN, HIGH, LOW, CLOSE, VOLUME, ADJ_CLOSE and DAILY_PERCENT_CHANGE) Since Year 1959 to 2015 (all the business days) for Deutscher Aktien Index (DAX). Total no.of rows in the Table 13895. SQL QUERY TO CHECK THE DATA: |
Step 2:
Exploring Series Data Built-in Functions: SERIES_DISAGGREGATE |
SERIES_DISAGGREGATE (Built-in Function):
Transforming an equidistant time series with a coarser delta to one with a finer delta can be performed using the SERIES_DISAGGREGATE function.
SQL Query:
We have data on daily basis. We are going to disaggregate data to hourly basis from Daily.
select * from ( SELECT s.DATE, s.close * g.FRACTION_OF_SOURCE_PERIOD AS “By Hour Close” FROM “SERIES_DATA”.“STOCKS_DATA_SERIES” AS s LEFT JOIN SERIES_DISAGGREGATE_TIMESTAMP (‘INTERVAL 1 DAY’, ‘INTERVAL 1 HOUR’, ‘2015-01-19’, ‘2015-01-20’) g ON s.DATE = g.SOURCE_PERIOD_START ) where DATE = ‘2015-01-19’; |
Step 3:
Exploring Series Data Built-in Functions: SERIES_ROUND
|
SERIES_ROUND (Built-in Function): Horizontal aggregation transforms an equidistant series with a narrower interval to a new series with a coarser interval. Horizontal Aggregation functionality performed using the SERIES_ROUND function. SQL Query:
We have data on daily basis. We are going to Aggregate data to monthly basis from Daily.
Select rounded.TICKER_ID, Month, Avg(CLOSE) as Monthly_avg from ( select
SERIES_ROUND(DATE, ‘INTERVAL 1 MONTH’, ROUND_DOWN) As Month, CLOSE from “SERIES_DATA”.“STOCKS_DATA_SERIES” As t ) As rounded Group By rounded.TICKER_ID, Month |
Summary:
You have completed the exercise!
You are now able to:
1) Create the Time Series Table.
2) Understand the Storage of Series Data Table.
3) Usage of Series Data Built-in Functions.
Thanks Muthuram,
Would it be possible to share the table definition / create table statement for the STOCKS_DATA_SERIES?
I tried using the HANA Studio import and it guessed the datatypes incorrectly, as I had some errors "
Batch from record 9002 to 12001 failed: [314]: numeric overflow: numeric overflow: 227260272: type_code=3, index=8numeric overflow: 167846160: type_code=3,"
Batch from record 12002 to 13897 failed: For input string: "2432737024": For input string: "2432737024"
Thanks, Ian.
Hi Ian,
Please try this query to create the Table.
CREATE COLUMN TABLE "SERIES_DATA"."STOCKS_DATA_SERIES" ("TICKER_ID" BIGINT CS_FIXED, "TICKER_DESCRIPTION" VARCHAR(5), "DATE" DAYDATE CS_DAYDATE, "OPEN" DOUBLE CS_DOUBLE, "HIGH" DOUBLE CS_DOUBLE, "LOW" DOUBLE CS_DOUBLE, "CLOSE" DOUBLE CS_DOUBLE, "VOLUME" DOUBLE CS_DOUBLE, "ADJ_CLOSE" DOUBLE CS_DOUBLE, "DAILY_PERCENT_CHANGE" DOUBLE CS_DOUBLE) UNLOAD PRIORITY 5 AUTO MERGE
Thanks that now works perfectly.
Cheers, Ian.
Why do not we create a Series Table for data from CSV file, but work wit ha regular Table instead ? I got from Exercise 1 that we should aim for Series Table in this case..
Hi Muthuram,
Thanks for the very nice blog.
I am trying to use this function and getting the error message -
Could not execute 'SELECT *, MONTHNAME(TRADEDATE), YEAR(TRADEDATE), g.generated_period_start AS generated_day FROM ...' in 211 ms 357 µs .
SAP DBTech JDBC: [328] (at 148): invalid name of function or procedure: SERIES_DISAGGREGATE_TIMESTAMP: line 3 col 13 (at pos 148)
is there something i need to activate or install to be able to use the function?
Hi Kayode,
Can you please let me know the HANA SP version you are using?
Regards,
Muthuram
Hi Muthuram,
We are on sps7. I see that this might be a new feature on sps 09. is there another strategy to implement this on sps 07?
Regards
Kayode
Hi Kayode,
SERIES DATA is SP09 Feature and It will be supported only in SP09.
So It is not possible implement in SP07.
Regards,
Muthuram
Thank you. I will consider other options.
Hi Muthuram,
I have created a SERIES table with the following query:
Create column Table "SERIES_DATA"."MYTAB_SERIES"
(id int, ts timestamp, value int)
SERIES (SERIES KEY(id) EQUIDISTANT INCREMENT BY INTERVAL 60 SECOND period for series(ts) );
Now when I try to insert with the following query:
Insert into "SERIES_DATA"."MYTAB_SERIES" select 1, GENERATED_PERIOD_START, 7 from SERIES_GENERATE_TIMESTAMP ('INTERVAL 60 SECOND', '2010-01-01', '2011-01-01', null, null, null);
I get an error like this.
Could not execute 'Insert into "SERIES_DATA"."MYTAB_SERIES" select 1, GENERATED_PERIOD_START, 7 from ...'
SAP DBTech JDBC: [403]: internal error: Exception CompileServerNotFoundException: No details while compiling trigger SERIES_DATA._SYS_TRIGGER_SERIES_153229_#1_#, messages:
However in the normal table, the insert works. Anything I am missing here?
Regards,
Subhankar
Hi Subhankar,
I have executed the query now. It is working fine for me.
Can you please let me know which HANA version are you using?
It is SP09 only. Well just saw that for my HANA system, compileserver is not working as shown in HANA Studio. That could be the reason for this I guess.
sounds pretty much like the problem here.
Can you actually compile **anything** ??
Yes my other queries were running! Now I can't check anymore as I had the system configured in a remote VM which seems to have been terminated. 😯
Hi Lars, Hi Muthuram,
I tried now in a different system and it worked. So it was surely system problem. Though I don't know how it happened.
Anyways, thank you for your help. 🙂
Regards,
Subhankar
This doesn't reproduce on my rev. 91.
One piece of the SERIES feature is that it places a AFTER INSERT trigger onto the table and this trigger simply checks for the inserted data to actually comply to the EQUIDISTANT constraint.
Apparently this trigger cannot be compiled in your case.
Can you execute
select SERIES_ROUND(current_timestamp, 'INTERVAL 60 second') from dummy
?
- Lars
...
period for series(ts,null)
---> what does null stand for here?
I don't see "period for series()" in the SPS09 documentation.