Forecasting Time Series in COVID-19 Days: Handling Minor Impact Scenarios – Part II
In this intro blog, we explained how the spread of COVID-19 affects the ability of businesses across the world to plan and predict their future. That blog described three impact scenarios: minor, lasting, and major.
This blog series covers concrete examples of how these scenarios can be handled today in SAP Analytics Cloud. In today’s episode, I’ll cover the minor impact scenario, as described in the intro blog, for Time Series Forecasting in BI.
The spread of COVID-19 has affected many businesses and industries, as is evident in their most recent business reports and underlying data. For some very fortunate businesses, the impact on sales and other KPIs will be minor: it is limited to a short, well-defined period, after which sales return to their pre-pandemic pattern. The assumption is that business goes back to “as usual” within a few months.
In these situations, when using time series forecasting methods to predict future sales or other KPIs, one potential approach to dealing with this temporary impact is to filter out the months or quarters of the impacted period. The filtering happens before the time series model is trained.
There are multiple ways to remove the corresponding period from the underlying data foundation using SAP Analytics Cloud. In this blog, I will walk through one of them: filtering out the impacted period directly in the chart.
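Outside of SAP Analytics Cloud, the same idea can be sketched in a few lines of Python. This is only an illustration of "filter first, then train": the function names `month_range` and `exclude_period` are hypothetical, the values are synthetic placeholders rather than real data, and the excluded window is the September 2001 to December 2002 period used in the case study later in this post.

```python
from datetime import date


def month_range(start, end):
    """Yield the first day of every month from start to end, inclusive."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield date(year, month, 1)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def exclude_period(series, start, end):
    """Return a copy of the series without the months in [start, end]."""
    return {m: v for m, v in series.items() if not (start <= m <= end)}


# Synthetic placeholder values (1.0 everywhere), NOT the real LAX figures.
history = {m: 1.0 for m in month_range(date(1993, 1, 1), date(2003, 12, 1))}

# Impacted period assumed for the illustration: September 2001 to December 2002.
filtered = exclude_period(history, date(2001, 9, 1), date(2002, 12, 1))

print(len(history), len(filtered))  # 132 116
```

The 16 impacted months are dropped before any model sees the data, which mirrors what the in-chart filter does further below.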
Case Study using Historical Data
Our first example uses historical data on the evolution of flights at Los Angeles International Airport (LAX) in the United States during a time of major disruption: the September 11 attacks in 2001. The historical dataset covers January 1993 to April 2008, but we will only use data up to December 2003 for this demo. The original data source is the US Department of Transportation, more precisely the Bureau of Transportation Statistics. Please note that the base data is expressed in million flights per year.
To begin, we will create a baseline chart so that we can compare an automated forecast based on unfiltered data with an automated forecast based on data where the impacted period has been filtered out.
What is Automated Forecasting?
Before going into an example, let’s briefly review some background on automated forecasting. As described in this blog, our business users can really benefit from automated forecasting due to the automatic selection of the best time series model for their data. Instead of having to try out different models, including taking care of trend and seasonality, the algorithm does the work for the business users and automatically selects the best model.
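To make the idea of automatic model selection more concrete, here is a minimal sketch in Python. It is not the algorithm SAP Analytics Cloud uses. The three candidate models, the holdout of 12 months, and the MAPE error measure are all assumptions chosen for illustration, and the series is synthetic.

```python
def naive(train, horizon, season):
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(train, horizon, season):
    """Repeat the value observed one full season earlier."""
    last_season = train[-season:]
    return [last_season[h % season] for h in range(horizon)]


def linear_trend(train, horizon, season):
    """Extrapolate a least-squares straight line fitted to the history."""
    n = len(train)
    x_mean = (n - 1) / 2
    y_mean = sum(train) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(train))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    return [y_mean + slope * (n + h - x_mean) for h in range(horizon)]


def mape(actual, predicted):
    """Mean absolute percentage error (assumes no zero actuals)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical candidate set; a real tool would consider more models.
CANDIDATES = {
    "naive": naive,
    "seasonal_naive": seasonal_naive,
    "linear_trend": linear_trend,
}


def select_best_model(series, holdout=12, season=12):
    """Score every candidate on the last `holdout` points and return the winner."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {
        name: mape(test, model(train, holdout, season))
        for name, model in CANDIDATES.items()
    }
    return min(scores, key=scores.get), scores


# Synthetic monthly series: the same 12-month pattern repeated for four years.
pattern = [10, 9, 11, 12, 13, 15, 17, 18, 14, 12, 11, 13]
series = pattern * 4

best, scores = select_best_model(series)
print(best)  # seasonal_naive
```

On this purely seasonal series the seasonal candidate wins because it reproduces the held-out year exactly. The point is only that the user never has to pick the model by hand.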
Creating a Baseline chart with an Automated Forecast using Unfiltered Data
First, we will create a new story with a time series chart to visualize this data (Figure 1).
We will use the raw data as is and forecast the entire years of 2004 and 2005. To add Automatic Forecasting, click on the in-chart menu and select “Add”. Click on “Forecast” and select “Automatic Forecast”. We will ask for enough forecast periods to cover 2004 and 2005. SAP Analytics Cloud uses December 2003 as the last data point prior to the automated forecast results.
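As a quick check on the number of forecast periods, assuming the chart is at monthly granularity, the horizon from the last data point (December 2003) to the end of 2005 can be computed as follows. The helper name `periods_between` is hypothetical.

```python
def periods_between(last_year, last_month, end_year, end_month):
    """Count the monthly periods after (last_year, last_month)
    up to and including (end_year, end_month)."""
    return (end_year - last_year) * 12 + (end_month - last_month)


# Last actual data point: December 2003; forecast through December 2005.
horizon = periods_between(2003, 12, 2005, 12)
print(horizon)  # 24
```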
As you can see in Figure 2, the forecasted values go until December 2005.
Here, the impacted period was not removed prior to creating the automated forecast. For this historical case study, the impacted period runs from September 2001 to December 2002. We can see that the quality of the forecast has suffered: the prediction follows a linear trend and no longer accounts for the seasonality found in the pre-crisis data.
In the next part, I will show how to deal with such a situation by filtering out the months or quarters within the impacted period from the underlying data foundation.
Creating an Automated Forecast using Filtered Data
Let’s start again with the time series chart including the unfiltered data in Figure 1. We will now utilize in-chart filtering to remove the impacted period from being considered as input to the time series forecasting algorithm.
To do this, we will navigate over to the Designer panel and select “Add Filters” in the Filter section. Then, select “Date (Member)”.
In the pop-up that appears, select the values to exclude (here, the months from September 2001 to December 2002) and toggle on the “Exclude selected members” option. Then, click on the “OK” button.
As you can see in Figure 3 below, the time series chart shows data until December 2003. In addition, the values from September 2001 to December 2002 are no longer present in the chart because they have been filtered out, as indicated by the straight line across that period.
Now, we are ready to add Automatic Forecasting. Click on the in-chart menu and select “Add”. Click on “Forecast” and select “Automatic Forecast”. We will again ask for forecasts to account for the entire years of 2004 and 2005.
As you can see in Figure 4, the forecasted values again go until December 2005.
By filtering out the months or quarters within the impacted period, that data is removed from the underlying data foundation and is not part of the new automated forecast. The forecast now reflects the seasonality and trend present in the period before the event. Let’s now compare the two scenarios.
Comparison of Figure 1 and Figure 3: Historical data with and without impacted period
Comparison of Figure 2 and Figure 4: Forecast with and without impacted period
The forecast in Figure 2, which was created before filtering out the impacted period, is linear. Its quality has suffered because the prediction follows a linear trend and no longer accounts for the seasonality found in the pre-crisis data. Once the impacted period has been filtered out, the forecast in Figure 4 reflects that seasonality again.
Thanks to my colleagues Antoine Chabert, Irene Chung and Yann le Biannic for their collaboration on this blog series.