Custom R Component: Bulk Forecasting by Month 2.0
Update July 2016: This article relates only to the Expert mode in SAP Predictive Analytics.
Please see these articles on forecasting in the Automated mode, which allows, for instance, additional predictor variables.
This component adds the capability to SAP Predictive Analysis to automatically forecast many different monthly time series in one go.
Imagine your company sells 500 products and you need to forecast the sales quantity of each product for the next 6 months. Looking at each product manually and forecasting it individually would take too much effort. This component automates that process. It looks at each product individually and, based on the product’s history, finds the best forecasting model and configuration for that product. Once the forecast for one product is done, the component moves on to the next product and finds the model and configuration that describes it best, and so on. The aim is to automate monthly forecasting as much as possible.
The major functionality of this component:
- The user can choose by which column to forecast (country, product, etc.).
- The input data does not have to be aggregated; the component aggregates and sorts the data automatically.
- Missing data (i.e. no sales of a product in one month) is added to the dataset with a measure value of zero. Therefore the component works on transactional data.
- The user can decide how many months to forecast.
- The user can decide which model type to use (AUTO.ARIMA, ETS or an average of multiple models).
- The component can also do a 12-month hold-out evaluation to find the best model type, which is then used to forecast.
- The forecast models can be restricted to non-negative forecasts (to avoid negative revenue forecasts, for instance).
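The aggregation and zero-filling described in the list can be sketched in base R. This is a minimal illustration, not the component's actual code. The data frame `sales`, its column names and the helper `fill_series` are hypothetical, and filling each series only between its own first and last month is an assumption.

```r
# Hypothetical transactional input: several rows per month, and no rows at all
# for product B in February 2016.
sales <- data.frame(
  Product  = c("A", "A", "A", "B", "B"),
  Year     = c(2016, 2016, 2016, 2016, 2016),
  Month    = c(1, 1, 2, 1, 3),
  Quantity = c(5, 3, 4, 7, 2),
  stringsAsFactors = FALSE
)

# 1. Aggregate to one row per series and month.
monthly <- aggregate(Quantity ~ Product + Year + Month, data = sales, FUN = sum)

# 2. Per series, build the complete run of months and fill the gaps with zero.
fill_series <- function(d) {
  idx     <- d$Year * 12 + (d$Month - 1)        # running month index
  all_idx <- seq(min(idx), max(idx))            # every month from first to last
  grid <- data.frame(
    Product = d$Product[1],
    Year    = all_idx %/% 12,
    Month   = all_idx %% 12 + 1,
    stringsAsFactors = FALSE
  )
  out <- merge(grid, d, by = c("Product", "Year", "Month"), all.x = TRUE)
  out$Quantity[is.na(out$Quantity)] <- 0        # a missing month means zero
  out[order(out$Year, out$Month), ]             # sort chronologically
}

complete <- do.call(rbind, lapply(split(monthly, monthly$Product), fill_series))
rownames(complete) <- NULL
print(complete)
```

After this step every series has exactly one row per month, which is the shape a monthly `ts` object expects.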
Forecasting sales quantities is only an example. This component works with any kind of monthly data that needs to be forecasted, for example:
- Customer Numbers
- Number of Product Returns
- Traffic statistics
- Average monthly weather data
- and many more
This component builds heavily on the R forecast package. Please see the documentation of this package for further information on the forecasting concepts.
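For readers new to the forecast package, here is a minimal sketch of the building blocks it offers. It uses the `AirPassengers` series that ships with R rather than the component's input data, and it is not taken from the component's code. The 6-month horizon and the 90% level are only example values.

```r
library(forecast)  # assumes the forecast package is installed

# AirPassengers ships with R: monthly totals from January 1949 to December 1960.
y <- AirPassengers

fit_arima <- auto.arima(y)   # automatic ARIMA order selection
fit_ets   <- ets(y)          # automatic exponential smoothing model selection

# Forecast 6 months ahead with a 90% prediction interval.
fc <- forecast(fit_ets, h = 6, level = 90)

print(fc$method)  # text description of the selected model
print(fc$mean)    # point forecasts
print(fc$lower)   # lower limits of the prediction interval
print(fc$upper)   # upper limits of the prediction interval
```

The same `forecast()` call works on `fit_arima`, so the two model families can be compared on equal terms.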
If you are new to creating Custom R Components in SAP Predictive Analysis, you can have a look at this overview to get you started. Please note that this code is not supported by SAP. When using this function please carry out your own testing.
This component calculates and compares many different models, hence execution time might be long. Please start with a small dataset that holds only a few time series. I have attached some small sample data on road accidents in Switzerland. You can forecast by either accident location or accident severity.
The component has two prerequisites:
- The historic dataset has numerical columns for year and month.
- The R library forecast is installed.
These parameters can be set by the user:
| Parameter | Description |
|---|---|
| Measure to Forecast | Name of column that holds the measure that is to be forecasted. |
| Time Series by | Name of column that identifies individual time series that are to be forecasted (e.g. forecast by product, country, …). |
| Year | Name of column that holds the year number. |
| Month | Name of column that holds the month number. |
| Forecasting Concept | Concept that is used for the forecast. Possible values: ‘AUTO.ARIMA’, ‘ETS’, ‘Composite’, ‘12 Months Hold-Out’ |
| Months to Forecast | Number of months that are to be forecasted. |
| Confidence Level | Confidence level for upper and lower prediction intervals, e.g. 0.9. |
| Chart Type | Chart type that will display the time series. Possible values: ‘Forecast’, ‘Decomposition’ |
| Positive Values Only | Restrict forecast to models that produce only non-negative values. Possible values: ‘True’, ‘False’ |
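To make the ‘12 Months Hold-Out’ concept more concrete, here is one way such an evaluation could look with the forecast package. It is a sketch under assumptions, not the component's implementation: the error metric (RMSE), the definition of ‘Composite’ as a plain average of the AUTO.ARIMA and ETS point forecasts, and the name `point_forecast` are all choices made for this illustration.

```r
library(forecast)  # assumes the forecast package is installed

y     <- AirPassengers                    # monthly data, Jan 1949 to Dec 1960
train <- window(y, end   = c(1959, 12))   # everything except the last year
test  <- window(y, start = c(1960, 1))    # the 12 held-out months

# One function per forecasting concept, each returning h point forecasts.
point_forecast <- list(
  "AUTO.ARIMA" = function(x, h) forecast(auto.arima(x), h = h)$mean,
  "ETS"        = function(x, h) forecast(ets(x), h = h)$mean
)
# Assumption: the composite is the simple average of the two models above.
point_forecast[["Composite"]] <- function(x, h) {
  (point_forecast[["AUTO.ARIMA"]](x, h) + point_forecast[["ETS"]](x, h)) / 2
}

# Root mean squared error of each concept on the held-out year.
rmse <- sapply(point_forecast, function(f) {
  sqrt(mean((as.numeric(test) - as.numeric(f(train, 12)))^2))
})
print(rmse)

best  <- names(which.min(rmse))           # concept with the lowest error
final <- point_forecast[[best]](y, 6)     # refit on the full history, 6 months ahead
print(best)
print(final)
```

Refitting the winning concept on the full history before forecasting means the last 12 months are not wasted on evaluation alone.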
The component returns these columns:
| Column | Description |
|---|---|
| Year | The year of the data point. |
| Month | The month of the data point. |
| YearMonthString | The year and month concatenated into a single string. |
| ForecastBy | Name of the forecasted time series, e.g. Switzerland, France, USA if the ‘Time Series by’ parameter was set to Country. |
| Measure | The measure that is forecasted. |
| Type | Indicates whether the current data point is part of the historic data or whether it was forecasted. Possible values: ‘Actuals’, ‘Forecast’ |
| PILower | Lower limit of prediction interval. |
| PIUpper | Upper limit of prediction interval. |
| Model | Describes the forecast model and configuration. |
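As an illustration of how the date columns of the forecast rows could be derived, the sketch below computes the calendar months that follow the last month of history. The helper `next_months` is hypothetical, and the `YYYY-MM` layout of `YearMonthString` is an assumption, since the exact format the component uses is not specified here.

```r
# Hypothetical helper: the h calendar months following a given year and month.
next_months <- function(last_year, last_month, h) {
  idx   <- last_year * 12 + (last_month - 1) + seq_len(h)  # running month index
  year  <- as.integer(idx %/% 12)
  month <- as.integer(idx %% 12 + 1)
  data.frame(
    Year  = year,
    Month = month,
    # Zero-padded month so that the string sorts chronologically (assumed format).
    YearMonthString = sprintf("%d-%02d", year, month),
    Type  = "Forecast",
    stringsAsFactors = FALSE
  )
}

# History ends in November 2016, forecast 3 months ahead.
future <- next_months(2016, 11, 3)
print(future)
```

The example crosses a year boundary on purpose: the rows cover December 2016, January 2017 and February 2017.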
As an example, let’s forecast the passenger numbers of a transportation company. These are the parameter settings for a forecast by geographic region:
Run the component and you will see the results as a custom chart. The component can forecast as many time series as you like, but the chart displays at most three of them. The result of the component includes all time series, of course. Notice how differently each time series is forecasted. The header in the chart shows the name of the time series, e.g. ‘Middle East’. The sub-header shows the model type and its configuration, e.g. ‘ETS(A,N,A)’, followed by a counter.
The same forecast with chart type ‘Decomposition’ shows the seasonality, trend and remainder of the original time series. This chart needs more space, so it is displayed only for the last time series. If you want to see the chart for a certain time series, you can add a filter component to your analysis to reduce the dataset to the relevant time series.
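One common way to split a monthly series into seasonal, trend and remainder parts in R is `stl()` from the stats package. Whether the component uses `stl()` internally is not stated in this article, so treat the following as an illustration of the idea only. It uses the `AirPassengers` series that ships with R.

```r
y   <- AirPassengers                    # monthly series shipped with R
dec <- stl(y, s.window = "periodic")    # seasonal-trend decomposition using loess

# The components are stored as columns: seasonal, trend, remainder.
print(head(dec$time.series))

# Four stacked panels: the data plus the three components.
plot(dec)
```

The three components add up to the original series, which is why the remainder panel is a quick check for months the seasonal pattern and trend do not explain.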
Use a Write Component to save the results for further processing.
You can also try out this component to forecast the number of road accidents happening in Switzerland. Just use the file SwissRoadAccidents.csv.
How to Implement
The component can be downloaded as a .spar file from GitHub. Then deploy it as described here. You just need to import it through the option “Import/Model Component”, which you will find by clicking the plus sign at the bottom of the list of available algorithms.