Andreas Forster

Custom R Component: Bulk Forecasting by Month 2.0

Update July 2016: This article relates only to Expert mode in SAP Predictive Analytics.

For forecasting in Automated mode, which for instance allows additional predictor variables, please see the articles on that topic.


This component adds the capability to SAP Predictive Analysis to automatically forecast many different monthly time series in one go.


Background

Imagine your company sells 500 products and you need to forecast the sales quantity of each product for the next 6 months. Looking at each product manually and forecasting it individually would take too much effort. This component automates the process. It looks at each product in turn and, based on that product's history, finds the forecasting model and configuration that fits it best. Once the forecast for one product is done, the component moves on to the next product and repeats the search, and so on. The aim is to automate monthly forecasting as far as possible.

The component's major functionality:

  • The user can choose by which column to forecast (country, product, etc.).
  • The input data does not have to be aggregated; the component aggregates and sorts the data automatically.
  • Missing data (i.e. no sales of a product in one month) is added to the dataset with a measure value of zero. The component therefore works on transactional data.
  • The user can decide how many months to forecast.
  • The user can decide which model type to use (AUTO.ARIMA, ETS or an average of multiple models).
  • The component can also do a 12-month hold-out evaluation to find the best model type, which is then used to forecast.
  • The forecast models can be restricted to non-negative forecasts (to avoid negative revenue forecasts, for instance).
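
The aggregation and zero-filling described in the list above can be sketched in base R. This is a minimal illustration, not the component's actual code: the data frame `sales`, its column names, and the choice to fill every series across the overall first-to-last month range are assumptions made for the example.

```r
# Hypothetical transactional data: several rows per month for product A,
# and no rows at all for product B in February 2013.
sales <- data.frame(
  Product  = c("A", "A", "A", "B", "B"),
  Year     = c(2013, 2013, 2013, 2013, 2013),
  Month    = c(1, 1, 2, 1, 3),
  Quantity = c(5, 7, 4, 10, 2),
  stringsAsFactors = FALSE
)

# 1. Aggregate to one row per series and month.
monthly <- aggregate(Quantity ~ Product + Year + Month, data = sales, FUN = sum)

# 2. Build the full grid of series x months between the first and last
#    observed month (assumption: one common range for all series).
idx     <- monthly$Year * 12 + (monthly$Month - 1)  # running month index
all_idx <- seq(min(idx), max(idx))
grid <- expand.grid(Product = unique(monthly$Product), idx = all_idx,
                    stringsAsFactors = FALSE)
grid$Year  <- grid$idx %/% 12
grid$Month <- grid$idx %% 12 + 1
grid$idx   <- NULL

# 3. Left-join the aggregated data onto the grid; months without data
#    come back as NA and are set to zero.
filled <- merge(grid, monthly, by = c("Product", "Year", "Month"), all.x = TRUE)
filled$Quantity[is.na(filled$Quantity)] <- 0

# 4. Sort by series and time so each series can be turned into a ts object.
filled <- filled[order(filled$Product, filled$Year, filled$Month), ]
rownames(filled) <- NULL
print(filled)
```

In this sketch product B receives a zero row for February, and product A receives one for March because the grid spans the overall date range. Whether trailing months should be zero-filled per series is a design choice that depends on the data.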

Forecasting sales quantities is only an example. This component works with any kind of monthly data that needs to be forecasted:

  • Revenue
  • Customer Numbers
  • Number of Product Returns
  • Traffic statistics
  • Average monthly weather data
  • and many, many more….

This component builds heavily on the forecast package. Please see the documentation of this package for further information on the forecasting concepts.

If you are new to creating Custom R Components in SAP Predictive Analysis, you can have a look at this overview to get you started. Please note that this code is not supported by SAP. When using this function please carry out your own testing.

This component calculates and compares many different models, hence execution time might be long. Please start with a small dataset that holds only a few time series. I have attached some small sample data on road accidents in Switzerland. You can forecast by either accident location or accident severity.

Prerequisites

The historic dataset has numerical columns for year and month.

The R-library forecast is installed.

Usage

These parameters can be set by the user:

| Parameter | Description |
| --- | --- |
| Measure to Forecast | Name of the column that holds the measure to be forecasted. |
| Time Series by | Name of the column that identifies the individual time series to be forecasted (i.e. forecast by product, country, …). |
| Year | Name of the column that holds the year number. |
| Month | Name of the column that holds the month number. |
| Forecasting Concept | Concept that is used for the forecast. Possible values: ‘AUTO.ARIMA’, ‘ETS’, ‘Composite’, ‘12 Months Hold-Out’ |
| Months to Forecast | Number of months to be forecasted. |
| Confidence Level | Confidence level for the upper and lower prediction intervals, e.g. 0.9. |
| Chart Type | Chart type that will display the time series. Possible values: ‘Forecast’, ‘Decomposition’ |
| Positive Values Only | Restrict the forecast to models that produce only non-negative values. Possible values: ‘True’, ‘False’ |
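
To illustrate the ‘12 Months Hold-Out’ concept together with the ‘Months to Forecast’ and ‘Confidence Level’ parameters, here is a minimal sketch in base R. The component's own selection logic and error measure are not shown in this article, so several things here are assumptions: base R's `HoltWinters()` and a fixed-order seasonal `arima()` serve as stand-ins for ETS and AUTO.ARIMA so that the sketch runs without extra packages, mean absolute error is used as the selection criterion, and the data is the `AirPassengers` series that ships with R.

```r
y <- AirPassengers          # monthly example series shipped with R
n <- length(y)
holdout <- 12               # months held back for evaluation

# Split: everything except the last 12 months is used for fitting.
train <- ts(y[1:(n - holdout)], start = start(y), frequency = 12)
test  <- as.numeric(y[(n - holdout + 1):n])

# Candidate 1: Holt-Winters exponential smoothing (stand-in for ETS).
fit_hw <- HoltWinters(train)
fc_hw  <- as.numeric(predict(fit_hw, n.ahead = holdout))

# Candidate 2: seasonal ARIMA with a fixed order (stand-in for AUTO.ARIMA).
arima_spec <- list(order = c(0, 1, 1), period = 12)
fit_ar <- arima(train, order = c(0, 1, 1), seasonal = arima_spec)
fc_ar  <- as.numeric(predict(fit_ar, n.ahead = holdout)$pred)

# Compare both candidates on the held-out year (assumed metric: MAE).
mae  <- c(HoltWinters = mean(abs(fc_hw - test)),
          ARIMA       = mean(abs(fc_ar - test)))
best <- names(which.min(mae))

# Refit the winner on the full history and forecast with intervals.
months_ahead <- 6           # corresponds to 'Months to Forecast'
level        <- 0.9         # corresponds to 'Confidence Level'
if (best == "HoltWinters") {
  p <- predict(HoltWinters(y), n.ahead = months_ahead,
               prediction.interval = TRUE, level = level)
  result <- data.frame(Measure = as.numeric(p[, "fit"]),
                       PILower = as.numeric(p[, "lwr"]),
                       PIUpper = as.numeric(p[, "upr"]))
} else {
  fit <- arima(y, order = c(0, 1, 1), seasonal = arima_spec)
  p   <- predict(fit, n.ahead = months_ahead)
  z   <- qnorm(1 - (1 - level) / 2)   # normal quantile for the interval
  result <- data.frame(Measure = as.numeric(p$pred),
                       PILower = as.numeric(p$pred - z * p$se),
                       PIUpper = as.numeric(p$pred + z * p$se))
}
result$Model <- best
print(mae)
print(result)
```

In a bulk setting the same steps would run once per time series, so each series can end up with a different winning model.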

Output Columns

| Column | Description |
| --- | --- |
| Year | The year of the data point. |
| Month | The month of the data point. |
| YearMonthString | The year and month concatenated into a single string. |
| ForecastBy | Name of the forecasted time series, e.g. Switzerland, France, USA if the ‘Time Series by’ parameter was set to Country. |
| Measure | The measure that is forecasted. |
| Type | Indicates whether the data point is part of the historic data or was forecasted. Possible values: ‘Actuals’, ‘Forecast’ |
| PILower | Lower limit of the prediction interval. |
| PIUpper | Upper limit of the prediction interval. |
| Model | Describes the forecast model and configuration. |

Example

As an example, let’s forecast the passenger numbers of a transportation company. These are the parameter settings for a forecast by geographic region:

[Screenshot: parameter settings (forecastconfig_361603.jpg)]

Run the component and you will see the results as a custom chart. The component can forecast as many time series as you like, but at most three are displayed in the chart. The component's result of course includes all time series. Notice how differently each time series is forecasted. The header in the chart shows the name of the time series, e.g. ‘Middle East’. The sub-header shows the model type and its configuration, e.g. ‘ETS(A,N,A)’, followed by a counter.

forecast.JPG

With chart type ‘Decomposition’, the same forecast shows the seasonality, trend and remainder of the original time series. This chart needs more space, so it is displayed only for the last time series. If you want to see the chart for a particular time series, add a filter component to your analysis to reduce the dataset to that time series.
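
A split into seasonality, trend and remainder can be reproduced outside the component with base R's `stl()`. Whether the component uses `stl()` internally is not stated in this article, so treat this as an illustrative sketch on the `AirPassengers` series that ships with R.

```r
y <- AirPassengers                       # monthly example series shipped with R

# Seasonal-trend decomposition using loess with a fixed seasonal pattern.
parts <- stl(y, s.window = "periodic")

# The result holds one column per component.
components <- parts$time.series
print(colnames(components))              # seasonal, trend, remainder

# The three components add up to the original series.
recombined <- rowSums(components)

# Four stacked panels: data, seasonal, trend, remainder.
plot(parts)
```

The stacked four-panel layout also shows why such a chart needs more room than a plain forecast plot.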

decomp.JPG

Use a Write Component to save the results for further processing.

save.JPG

You can also try out this component to forecast the number of road accidents happening in Switzerland. Just use the file SwissRoadAccidents.csv.

How to Implement

The component can be downloaded as a .spar file from GitHub. Then deploy it as described here. You just need to import it through the option “Import/Model Component”, which you will find by clicking the plus sign at the bottom of the list of available algorithms.


      20 Comments
      Former Member

      Hi Andreas,

      Thanks for sharing the component. I tried using it as described in the doc, and I could see the component in the component list after opening PA. I tried to use it in one of my analyses, but PA just hung. After closing it with Task Manager I tried to reopen it, but it hangs again and does not open the analysis. It also does not open other analyses where I used Custom R Components that I created myself. It only opens the models where I have used the standard PA algorithms.

      Has something gone wrong after putting that component in the directory mentioned?

      Let me know if anything can be done.

      Andreas Forster
      Blog Post Author

      Hello Bimal, Most likely the component was still calculating. It calculates and compares many different models. If your dataset has many time series, it might appear as if SAP PA was hanging, whilst it is actually still calculating.

      I have just attached some small sample data on Swiss road accidents. Please send me a private message in case this is not working for you and I will investigate.

      Greetings

      Andreas

      Former Member

      Hi Andreas

      I've installed your package, but the forecasting one will not run as it requires version 3.0.2 of R. I've tried to get an earlier version of forecast that will run with R 3.0.1, but no luck. Can you advise please?

      Cheers

      Jon

      Andreas Forster
      Blog Post Author

      Hello Jon,

      The code was created with the forecast package version 4.8. You can download this older Windows package here:

      http://cran.r-project.org/bin/windows/contrib/2.15/forecast_4.8.zip

      I have got this package working just fine on one computer with SAP PA 1.15 and R 3.0.1. On another machine, however, R 3.0.1 refuses to work with this version. So far, I have to admit, I am not sure why the second machine isn't happy.

      Andreas

      Former Member

      Hi Andreas

      I tried this version but experienced the same issue as you did with the second machine, so I'm kind of stuck. Maybe PA should upgrade to R 3.0.2.

      Thanks for your help

      Andreas Forster
      Blog Post Author

      Hello Jon,
      I have got a solution now. These steps made it work on three different machines. Just make sure first of all to uninstall the forecast package that you have at the moment.

      Step 1: Download the forecast package 4.9 as listed on the website of the package's developer. See bullet point 5 on
      http://robjhyndman.com/hyndsight/old-r-packages/

      Step 2: Copy the downloaded forecast_4.9.zip into C:\

      Step 3: Install the package with this command in R:
      install.packages("C:\\forecast_4.9.zip")

      Step 4: Install the forecast package's dependencies. If you are not using a proxy, you might not need the first command
      setInternet2(use = TRUE)
      install.packages("fracdiff")
      install.packages("tseries")
      install.packages("RcppArmadillo")

      Step 5: Test whether the forecast package is installed correctly. This statement should now give a warning, which is OK.
      library(forecast)

      You can then test the Custom R Component in SAP PA.

      Let me know please if this makes it work for you as well.
      Cheers
      Andreas

      Former Member

      Hi Andreas,

      I tried the steps mentioned by you, but I get the following error now.

      "Error: package ‘forecast’ was built before R 3.0.0: please re-install it"

      I had previously got the forecast package working by manually installing the tar.gz file.

      But now that is also not working.

      Any idea how to resolve this?

      Thanks,

      Bimal

      Former Member

      Hi Andreas,

      Thanks for your help. The forecast package installed with just a warning, as you pointed out. However, I now have the problem that using any R algorithm crashes PA without warning. I've checked the logs, but nothing really stands out as a possible cause. I have uninstalled R and then PA, cleaned the registry, and deleted all content pertaining to R and PA that I know of. I reinstalled PA and then R and tried again, without success.

      It's a shame, as I'd like to demonstrate this to a customer using their data and your bulk forecasting algorithm.

      Any suggestions?

      Thanks

      Jon

      Andreas Forster
      Blog Post Author

      Hello Jon,

      I have never seen the kind of behaviour you describe. Maybe you have to execute a remove-package command first. That made it work for Bimal, but then he had a different issue.

      remove.packages("forecast")

      Can you use a test machine to

      - install a fresh SAP PA

      - install / configure R 3.0.1 from within SAP PA

      - Then add the forecast package as described above

      install.packages("C:\\forecast_4.9.zip")

      setInternet2(use = TRUE)

      install.packages("fracdiff")

      install.packages("tseries")

      install.packages("RcppArmadillo")

      library(forecast)

      Former Member

      Hi Andreas,

      Your solution worked for me. I uninstalled the older ones and removed all old files from the lib folder.

      I can use the package now.

      Former Member

      Hi Andreas

      I thought I'd replied but it doesn't appear here.  Yes it worked fine on a VM using your Swiss accident data, only graphs though no data grid.  Is there something that needs to be enabled for the production of the data grid?

      Also I'm not making any progress with R crashing PA, there has to be something in the registry that ensures the corrupt code continues to be used.  Don't really want to reload the OS to get around this.

      Cheers

      Jon

      Former Member

      Hi Andreas

      I have got the R algorithms working now. I had to delete everything in the registry pertaining to R and all folders on my local drive with R content, and then reboot my PC. I got PA to install R, then configured it, and then ran the R algorithms and your custom solution successfully. I still don't get a data grid though.

      Thanks

      Jon

      Andreas Forster
      Blog Post Author

      Hello Jon,

      I am glad you got it working now!

      The data grid is on the Results tab. There, on the right-hand side, click the "Bulk Forecasting" component. Here is a screenshot of what it should look like.

      Let me know in case this shows up differently for you?

      Greetings

      Andreas

      datagrid.JPG

      Former Member

      Hi Andreas,

      Thanks very much for sharing the component. I followed all the steps and started running the algorithm, but it ended with this error:

      Bulk Forecasting by month 2.0: An error occured whule executing commands in the R environment.

      Details:

      Cause : error from RL " error in estmodel(y, errortype[i], trendtype[j], seasontype[k], damped[i], :

      Function cannot be evaluated at initial parameters"

      Do you by any chance know more about this error?

      Thanks in advance for your help.

      Shrirama

      Former Member

      By the way, I was able to run the Swiss road accidents analysis successfully. What do you think could be wrong with my data? I have three years of sales per material, by year and month.

      Andreas Forster
      Blog Post Author

      Hi  Shrirama,

      Please check that the columns used in the model's configuration have the same data type as the columns from the Swiss data, ie: measure, time series by, year and month.

      You can see this for instance on the prepare tab. The measure column should NOT have "ABC" next to it. It should say "123" to indicate the column is seen as numeric.

      Greetings

      Andreas
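
The type check described in the reply above can also be done directly in R before the data reaches the component. This is a minimal sketch under assumptions: the data frame `sales` and its column names are hypothetical, and the measure is deliberately stored as text to mimic a column shown as "ABC" on the Prepare tab.

```r
# Hypothetical input where the measure column was read as text.
sales <- data.frame(
  Material = c("M1", "M1", "M2"),
  Year     = c(2014, 2014, 2014),
  Month    = c(1, 2, 1),
  Quantity = c("120", "95", "40"),
  stringsAsFactors = FALSE
)

# Columns that have to be numeric: year, month and the measure.
required_numeric <- c("Year", "Month", "Quantity")
is_num <- sapply(sales[required_numeric], is.numeric)
print(is_num)

# Convert the offending columns. Values that are not plain numbers
# would become NA here and should be inspected before forecasting.
bad <- names(is_num)[!is_num]
for (col in bad) {
  sales[[col]] <- as.numeric(sales[[col]])
}
print(sapply(sales[required_numeric], class))
```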

      Former Member

      Hi Andreas

      Thanks for this amazing component; it gives us so many algorithm options to choose from. It's working fine in SAP PA 2.3. I just wanted to check one thing with you: if I want to check the model accuracy and error margin after applying it to my data, how would I do that? The Model Statistics component doesn't work with this. Any idea? In the output below I can't really make out what the error margin is...


      Capture.PNG



      Regards

      Ranajay

      Andreas Forster
      Blog Post Author

      Hi Ranajay,

      Thank you for the feedback. Nice to hear you like the component!

      It does work with PA 2.3. To compare the forecast accuracy would require something separate, maybe another Custom R element.

      Saying this, the "Bulk Forecasting" component was created before SAP acquired KXEN. This is now Automated Analytics within SAP PA and comes with its own dedicated time series forecasting. It's extremely powerful, for instance because it allows for additional predictor variables. The help file comes with a nice example on forecasting a single cash flow time series

      http://help.sap.com/businessobject/product_guides/pa23/en/pa23_ts_user_en.pdf (just search for CashFlows.txt)

      I hope to publish a small tutorial soon that explains how to use that Automated engine to forecast multiple time series at once.

      Many Greetings

      Andreas

      Former Member

      Hi Andreas

      I tried the Automated Analytics time series too. Actually, the part of your component I like best is the hierarchy-based aggregation and the sorting done in the code itself, after which the predictive algorithm is applied. I don't think that is possible in the Automated Analytics version. Manual intervention for data preparation is very much required in Automated, unlike with this custom component.

      That's why I wanted to use this component. But error margin and accuracy are very much needed before implementing it. Any tips on how to do that in the Custom R Component?

      Regards

      Ranajay

      Andreas Forster
      Blog Post Author

      Hello Ranajay,

      In case you have only one time series but multiple values per date, you could use this aggregation component to help prepare the data for the time series forecasting in Automated  Mode. Just aggregate by date, probably with the sum of your measure.

      http://scn.sap.com/docs/DOC-43445

      For Automated Mode the data has to be sorted in ascending order (latest information at the bottom). I am not sure whether the above component sorts the data, but that could be an easy manual step (or another Custom R Component ;-). In case you are interested in creating your own Custom R extension, here is an overview page to get you started:

      http://scn.sap.com/docs/DOC-62119

      Many Greetings

      Andreas