Custom R Component – Forecast Daily Time Series
This component forecasts a daily time series using ARIMA.
Should the historic data contain multiple entries for the same day, then the values are summarised by day. Missing days are added with a measure of 0 to ensure the time series is complete.
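The preparation step described above can be sketched in a few lines. This is only an illustration in Python, not the component's own R code: the function name `complete_daily_series` is hypothetical, and it assumes that "summarised" means the values of one day are summed.

```python
from collections import defaultdict
from datetime import date, timedelta


def complete_daily_series(rows):
    """Sum measures per day, then add every missing day with a measure of 0.

    rows: iterable of (date, measure) pairs; several rows may share a date.
    Returns one (date, measure) pair per calendar day, in ascending order.
    Assumption: duplicates are aggregated by summing.
    """
    totals = defaultdict(float)
    for day, measure in rows:
        totals[day] += measure
    if not totals:
        return []

    first, last = min(totals), max(totals)
    n_days = (last - first).days + 1
    completed = []
    for offset in range(n_days):
        day = first + timedelta(days=offset)
        # Days without any entry get a measure of 0.
        completed.append((day, totals.get(day, 0.0)))
    return completed


history = [
    (date(2015, 8, 3), 10),
    (date(2015, 8, 3), 5),   # second entry for the same day
    (date(2015, 8, 6), 7),   # 4 and 5 August are missing
]
for day, measure in complete_daily_series(history):
    print(day.isoformat(), measure)
```

Filling gaps with 0 keeps the series evenly spaced, which is what a daily seasonal model needs, but it also means that genuinely missing observations are treated as real zeros.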
Please note that “Automated Analytics” contains a strong time-series forecast, which can also take additional predictor variables into account (e.g. bank holidays or weather data). The component described in this article only considers the date and the measure.
Please note that this component is not an official release by SAP and that it is provided as-is without any guarantee or support. Please test the component to ensure it works for your purposes.
- One column must hold the date stamp as a String value.
- The data must include at least 2 seasonal cycles. This means, for instance, that if the data contains a yearly pattern/seasonality, you must have at least 2 years of history.
- R libraries zoo, forecast and gplots must be installed.
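The two-cycle requirement in the list above can be checked before running the forecast. The following is a minimal sketch in Python. The helper `has_enough_history` is hypothetical and not part of the component. It assumes one observation per day and rounds up a fractional frequency such as 365.25.

```python
import math


def has_enough_history(n_days, seasonal_frequency, min_cycles=2):
    """Return True if n_days of daily data cover at least min_cycles seasonal cycles."""
    required_days = math.ceil(min_cycles * seasonal_frequency)
    return n_days >= required_days


# Two weeks of data are just enough for a weekly pattern ...
print(has_enough_history(14, 7))
# ... but two calendar years fall slightly short of two cycles of 365.25 days.
print(has_enough_history(730, 365.25))
```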
Please let me know should you encounter any limitations.
These parameters can be set by the user.
|Measure to Forecast||The historic measure to forecast.|
|Date Column||The column that stores the date in String format, e.g. 03.11.2015.|
Specifies the string column’s date format in R notation. For instance: %d.%m.%Y
See the documentation of the as.Date() function for the syntax.
|Seasonality Frequency||The seasonal pattern in the data, e.g. 7 for weekly or 365.25 for yearly.|
|Days to Forecast||The number of days to forecast.|
|Confidence Level||Confidence level for upper and lower prediction intervals, e.g. 0.95.|
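To check a date format string before configuring the component, it can help to try it on a sample value. The directives used in the example format above (%d, %m, %Y) have the same meaning in Python's `strptime` as in R's as.Date(), so the sketch below uses Python for a quick test. The helper `parse_date` is hypothetical. Other directives may differ between the two languages.

```python
from datetime import datetime


def parse_date(text, fmt="%d.%m.%Y"):
    """Parse a date string with a strptime-style format; raises ValueError on mismatch."""
    return datetime.strptime(text, fmt).date()


print(parse_date("03.11.2015"))              # day.month.year, as in the parameter table
print(parse_date("2015-11-03", "%Y-%m-%d"))  # the same day in ISO notation

try:
    parse_date("2015-11-03")                 # wrong format for this string
except ValueError as error:
    print("Format does not match:", error)
```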
|DateString||The date the measure refers to.|
|Type||Indicates whether the current data point is part of the historic data or whether it was forecasted. Possible values: ‘Actuals’, ‘Forecast’|
|Measure||The measure that is forecasted.|
|PILower||Lower limit of prediction interval.|
|PIUpper||Upper limit of prediction interval.|
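As a rough illustration of how a confidence level relates to the PILower and PIUpper columns, the sketch below derives an interval from a point forecast. It assumes normally distributed forecast errors and a known standard error. The function `prediction_interval` and its inputs are hypothetical, and the component's R code may compute its limits differently.

```python
from statistics import NormalDist


def prediction_interval(point_forecast, std_error, confidence_level=0.95):
    """Return (lower, upper) limits, assuming normally distributed forecast errors."""
    # Two-sided interval: split the remaining probability equally between both tails.
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    return point_forecast - z * std_error, point_forecast + z * std_error


lower, upper = prediction_interval(100.0, 10.0, confidence_level=0.95)
print(round(lower, 1), round(upper, 1))
```

A higher confidence level widens the interval, and a level of 0.95 corresponds to roughly 1.96 standard errors on either side of the forecast.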
How to Implement
The component can be downloaded as a .spar file from GitHub. Then deploy it as described here. You just need to import it through the option “Import/Model Component”, which you will find by clicking on the plus sign at the bottom of the list of available algorithms.
You can try out the component with the dataset WikipediaPageViewsChocolate.csv. The file lists the daily page views of the Wikipedia page for Chocolate for the period from 3 August 2015 to 10 October 2015. This data was downloaded from this traffic statistics site.
It turns out the data has a strong weekly pattern. Most page views occur on Wednesdays or Thursdays, whilst the numbers are usually lowest on Saturdays. Maybe the chocolate gets people through the working week…
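A weekly pattern like this can be spotted by averaging the measure per weekday. Here is a minimal sketch in Python. The function `mean_by_weekday` is hypothetical, and the sample values are made up purely for illustration; they are not taken from the Wikipedia dataset.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_by_weekday(series):
    """Average the measure per weekday for (date, measure) pairs."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for day, measure in series:
        name = WEEKDAYS[day.weekday()]  # weekday() returns 0 for Monday
        sums[name] += measure
        counts[name] += 1
    return {name: sums[name] / counts[name] for name in WEEKDAYS if name in counts}


# Made-up sample values for two Mondays and one Tuesday.
sample = [
    (date(2015, 8, 3), 10),
    (date(2015, 8, 10), 20),
    (date(2015, 8, 4), 6),
]
for name, average in mean_by_weekday(sample).items():
    print(name, average)
```

Clear differences between the weekday averages suggest a Seasonality Frequency of 7.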
Load the data into “Expert Mode” and add the “Forecast Daily Time Series” component to the analysis. Configure it as follows:
Run the component and click the “Custom Chart” icon. You will see:
- a plot of the historic data and the forecast for the next 7 days, together with the prediction interval
- details of the chosen ARIMA model
Click on the “Data Grid” and scroll down to see the exact forecasts.
Finally, you could add another component to the analytical flow to further process or output the data.