Introduction
Smoothing algorithms are basically used in time series data either to produce smoothed data for presenting the trend of the data or to forecast it for what if analysis. Time series data are sequential observations from the history of data with respect to a series of date, time or time stamps. Moving average analysis is also the same kind of time series analysis where the past observations are equally weighted. But, for certain analysis (price movement or share market stock movement), the recent past data has the most weight. Exponential smoothing assigns exponentially decreasing weights over time.
Pre requisites:
 No missing/null data
 Only numeric data can be smoothed
Algorithm
Let S_{t }be the smoothed value for the t’th time period and X_{n }(1, 2, 3…n) be the time series; mathematically,
S1 = x0 (you cannot get the smoothed value for the first entry in the time series).
St = αx_{t−1} + (1−a) S_{t−1} where α is the smoothing factor (in % Mathematically 0 < α < 1). When Alpha tends to 0, the weight given to the history value reduced and tends to 0.
Let us consider an example of price trend of a particular material over a series of date range.
DAY 
PRICE 
2JUNE2014 
100 
3JUNE2014 
95 
4JUNE2014 
110 
5JUNE2014 
110 
6JUNE2014 
98 
7JUNE2014 
Holiday 
8JUNE2014 
Holiday 
9JUNE2014 
105 
10JUNE2014 
118 
There is no data available for 7^{th} June and 8^{th} June. But, as per the algorithm null or missing series are not allowed. In this case, if the previous data is not available corresponding smoothed value will be taken as X_{t }for those entries.
Now, let us convert the input table into time series and manually apply the smoothing algorithm, with smoothing factor as 50 %( 0.5). Consider the first DAY as the base date.
DAY 
Time 
PRICE 
Smoothed Value ( S_{t}) 
2JUNE2014 
0 
100 

3JUNE2014 
1 
95 
100 
4JUNE2014 
2 
110 

5JUNE2014 
3 
110 

6JUNE2014 
4 
98 

7JUNE2014 
5 
Holiday 

8JUNE2014 
6 
Holiday 

9JUNE2014 
7 
105 

10JUNE2014 
8 
118 

Calculation
 S(0) will be null
 S(1) will be , 0.5 * 100(which is X_{t1}) + 0.5 * 100 = 100
 S(2) will be , 0.5 * 95 + 0.5 * 100 = 97.5
 S(3) will be , 0.5 * 110 + 0.5 * 97.5 = 103.75
 S(4) will be ,05.*110 + 0.5 * 103.75 = 106.875
 S(5) will be , 0.5 * 98 + 0.5 * 106.875 = 102.4375
 S(6) will be , 0.5 * 102.4375(previous time series values in not available, hence the smoothed value is considered ) + 0.5 * 102.4375 = 102.4375
 The same process continues for the upcoming entries and can be forecast up to n number of time series entries.
Graph simulated with smoothed data in Microsoft Excel
PAL Implementation
(Source code from SAP HANA PAL Document is re used to generate the below code snippet – Page 193 ).
CREATE SCHEMA PAL_TRY;
SET SCHEMA PAL_TRY;
dropping existing procedures if any
CALL SYSTEM.AFL_WRAPPER_ERASER('SINGLESMOOTH_TEST_PROC');
Creating procedures
CALL SYSTEM.AFL_WRAPPER_GENERATOR('SINGLESMOOTH_TEST_PROC','AFLPAL','SINGLESMOOTH',PAL_SINGLESMOOTH_PDATA_TBL);
CREATE LOCAL TEMPORARY COLUMN TABLE #PAL_CONTROL_TBL ("NAME" VARCHAR(100),
"INTARGS" INT, "DOUBLEARGS" DOUBLE, "STRINGARGS" VARCHAR(100));
RAW_DATA_COL : column where the data is available
INSERT INTO #PAL_CONTROL_TBL VALUES ('RAW_DATA_COL',1,NULL,NULL);
Alpha value
INSERT INTO #PAL_CONTROL_TBL VALUES ('ALPHA', NULL,0.5,NULL);
Forecast_num : Forecast next 100 values(Includes future values )
INSERT INTO #PAL_CONTROL_TBL VALUES ('FORECAST_NUM',100, NULL,NULL);
STARTTIME : ID starts from 0
INSERT INTO #PAL_CONTROL_TBL VALUES ('STARTTIME',0, NULL,NULL);
CREATE COLUMN TABLE PAL_SINGLESMOOTH_DATA_TBL LIKE PAL_SINGLESMOOTH_DATA_T ;
Loading the test Data
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (0,100.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (1,95.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (2,110.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (3,110.5);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (4,98.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (7,105.0);
INSERT INTO PAL_SINGLESMOOTH_DATA_TBL VALUES (8,118.0);
CREATE COLUMN TABLE PAL_SINGLESMOOTH_RESULT_TBL LIKE PAL_SINGLESMOOTH_RESULT_T;
Executing the procedure
CALL _SYS_AFL.SINGLESMOOTH_TEST_PROC(PAL_SINGLESMOOTH_DATA_TBL,"#PAL_CONTROL_TBL", PAL_SINGLESMOOTH_RESULT_TBL) WITH OVERVIEW;
SELECT * FROM PAL_SINGLESMOOTH_RESULT_TBL;
The output table PAL_SINGLESMOOTH_RESULT_TBL will contain the smoothed data which can be used for the analysis or presentation purposes.
Regards
Sreehari V Pillai
Thank you, Sreehari!
smoothing algorithms require (TIME_ID, VALUE) table structure for the source data. Do you have an idea how to generate this integer TIME_ID which starts from 0, if your source table only has DATE and VALUE, like in your example?
I talk about the case when we want to apply PAL function to an existing table/view.
Thank you one more time,
Kostia Kharchenko
Hello Kostia, I am not sure if this will be helpful in your scenario – however, what I have done in some cases in the past is convert the DATE to a integer value. I do this by creating a View on top of my existing table, and then subsequently using the view as the input into the PAL function.
Here is an example,
DT_ADDED is my LONGDATE field.
SELECT TO_INT(TO_CHAR(UTCTOLOCAL(DT_ADDED, ‘” + strTimeZone + “‘), ‘HH24’)) ID
Hi Paul, nice to see you here!
Thank you, I might go on with your solution.
Thought about this before, but couldn’t figure out the conversion function.
UPD: unfortunately, it doesn’t fit in my scenario, because i convert dates YYYYMMDD and they are not continuous, ex: 20141231 + 1 will be 20141232
still thank you for the advice!
Hi Kostial,
For your scenario, you can consider the least value of the date column and make it as the base date. For eg, let Jan 1 ,2014 be your base date so that integer value corresponding to Jan 1,2014 will be 0. For other dates comes after this, find the number of days between the base date and the current value. For eg, feb 1,2014. Here, your integer value will be 31. Like that you can follow.
@Paul : Thanks for your view on this 🙂
Sreehari.
Hi,
is there a possibility to find out with SAP Predictive Analysis which smoothing method (1st, 2nd or 3rd) is suitable for the existing time series data set?
How can we optimize the method in order to alpha?
Whats about the figure goodness of fit – which statement does this figure deliver and how it has been calculated?
Thanks for answers
Hi Can,
That’s a nice question by the way. Generally you are asking two questions. one, how to choose which exponential smoothing method is to be chosen for a time series . Second, how to choose values for alpha for a better smoothing.
How to choose right values for alpha : As I said, for lesser values of Alpha , the weightage you are giving to your historical data is reduced. For example , market price forecast. you are not bothered about what was the value of a commodity past one year, but just past months or just yesterday. So, here the alpha value would tend to zero.
How to choose the model : You have to apply forecasting on your time series and find out the deviation from original value. That is , suppose you have a time series with time values 0 to 100. Consider 0 to 50 values and forecast till 100. Find the deviation from original value for all the three models. Which model is giving the closer forecast to your time series, you can choose it . May be I will be writing a blog on Choosing the right model for PAL.
Sreehari V Pillai
Hi Sreehari,
first of all great thanks for answer. So if i have understood it correctly deciding the suitable smoothing method is based on “learning by trying”. Are there no statistical figures like autocorrelationcoefficient which gives out the attributes of the time series data like trends, or saisonable characteristics?
My next questions are: How does the figure “Goodness of fit” is been calculated? Thanks in advance
If I understood the smoothing algorithms correctly, these algorithms basically simulate the trend and seasonal characteristics . Have a look into the linear regression – Linear regression – Wikipedia, the free encyclopedia (I m sure it will take months to understand these – most times I failed 🙂 ).
For the goodness of fit calculation, there are standard algorithms available ( like KS algorithm ) . I don’t think HANA PAL provides those algorithms . Let me check for the alternative features if any.
Sreehari