Custom R Component – Add Missing Dates
This component takes an incomplete daily time series and adds any missing dates. The measure can be approximated for the dates that have been added. Ensuring that such time series are complete can be important for charting or forecasting purposes.
Please note that this component is not an official release by SAP and that it is provided as-is without any guarantee or support. Please test the component to ensure it works for your purposes.
– R library zoo must be installed.
– The first column must hold the date stamp as a string value.
– All remaining columns must be numerical values that can be approximated.
Please let me know should you encounter any limitations.
These parameters can be set by the user.
Specifies the string column’s date format in R notation. For instance: %d.%m.%Y
See the documentation of the as.Date() function for the syntax.
Approximate Missing Values
True or False
Specifies whether the measure of the added dates will be approximated or left empty.
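To make the date format parameter more tangible, here is a minimal sketch of how a pattern such as %d.%m.%Y maps a string to a date. It is written in Python purely for illustration and is not the component's R code. Python's strptime happens to use the same codes as R's as.Date() for day, month and four-digit year. The two date stamps are made-up sample values.

```python
from datetime import datetime

# Hypothetical date stamps, written as day.month.year
raw_stamps = ["07.11.2014", "10.11.2014"]

# %d = day of month, %m = month as number, %Y = four-digit year
date_format = "%d.%m.%Y"

parsed = [datetime.strptime(stamp, date_format).date() for stamp in raw_stamps]

for stamp, day in zip(raw_stamps, parsed):
    print(stamp, "->", day.isoformat())
```

If the format does not match the strings in your first column, the parsing fails, so it is worth checking one or two values by hand before running the component.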
This component adds no output columns.
How to Implement
The component can be downloaded as a .spar file from GitHub. Then deploy it as described here. You just need to import it through the option “Import/Model Component”, which you will find by clicking the plus sign at the bottom of the list of available algorithms.
You can try out the component with your own data or with the file ZurichTemperaturesWithMissingDates.csv. The dataset lists the maximum temperatures measured in Zurich, Switzerland, on a number of days. The file contains only two columns: the date and the temperature. However, some dates are missing in this dataset and we want to complete the time series by adding those missing dates.
Just start SAP Predictive Analysis and load the file ZurichTemperaturesWithMissingDates.csv. You will see the date and the maximum temperature measured on that day. You can also see that, for instance, the 8th and 9th of November 2014 are missing.
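If you prefer not to hunt for gaps by eye, they can also be listed programmatically. The following is a small Python sketch, independent of the component, that compares the dates present with the full calendar range. The helper name find_missing_dates is hypothetical, and the four surrounding dates are assumed for the example. Only the gap on the 8th and 9th of November 2014 is taken from the dataset description.

```python
from datetime import date, timedelta

def find_missing_dates(present):
    """Return, in order, every calendar day between the first and last date that is absent."""
    present_set = set(present)
    start, end = min(present_set), max(present_set)
    all_days = {start + timedelta(days=i) for i in range((end - start).days + 1)}
    return sorted(all_days - present_set)

# Assumed neighbouring dates around the gap described above
observed = [
    date(2014, 11, 6),
    date(2014, 11, 7),
    date(2014, 11, 10),
    date(2014, 11, 11),
]

gaps = find_missing_dates(observed)
print([day.isoformat() for day in gaps])  # ['2014-11-08', '2014-11-09']
```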
In order to fill in the missing rows, go to the “Predict” tab and add the “Add Missing Dates” component to your workflow and configure it. Specify the date format and whether the measures of the added dates are to be approximated. For now keep the default and do not approximate.
Run the component and switch to the result view when prompted. You will see that the time series is now complete. The missing dates have been added with an empty measure. You could now analyse the data further or write the results to a file or database.
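The effect of this step can be sketched in a few lines of Python. This is only an illustration of the idea and not the component's R implementation: every day between the first and the last date gets a row, and days without a measurement keep an empty measure (None). The function name add_missing_dates and the temperature values are made up for the example.

```python
from datetime import date, timedelta

def add_missing_dates(series):
    """Take a dict of date -> value and return one (date, value) row per calendar day.

    Days that are not in the input get None as their value.
    """
    start, end = min(series), max(series)
    rows = []
    for offset in range((end - start).days + 1):
        day = start + timedelta(days=offset)
        rows.append((day, series.get(day)))
    return rows

# Hypothetical temperatures; the 8th and 9th of November are missing
incomplete = {
    date(2014, 11, 6): 9.0,
    date(2014, 11, 7): 12.0,
    date(2014, 11, 10): 6.0,
}

completed = add_missing_dates(incomplete)
for day, value in completed:
    print(day.isoformat(), value)
```

The input has three rows and the result has five, with the two added rows carrying an empty measure.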
However, we want to fill in the missing measures. In the component’s configuration, change the value for “Approximate Missing Values” to True. Execute the component again. You will see that the missing measures have now been filled with values approximated from the surrounding values. The time series is complete.
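To give an idea of what such an approximation can look like, here is a minimal Python sketch of linear interpolation between the nearest known neighbours. Please treat the method as an assumption: the component's own code is not shown here, and linear interpolation is simply what the zoo function na.approx does by default. The function name interpolate_gaps and the temperatures are hypothetical.

```python
def interpolate_gaps(values):
    """Linearly interpolate None entries that lie between two known values.

    Leading and trailing None entries are left untouched, since they have
    no neighbour on one side.
    """
    result = list(values)
    known = [i for i, value in enumerate(result) if value is not None]
    for left, right in zip(known, known[1:]):
        # Constant change per day between the two known measurements
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result

# Hypothetical daily temperatures with a two-day gap
temperatures = [9.0, 12.0, None, None, 6.0]
filled = interpolate_gaps(temperatures)
print(filled)  # [9.0, 12.0, 10.0, 8.0, 6.0]
```

With these sample values the two added days fall evenly between 12.0 and 6.0, which is the behaviour you would expect from a straight line drawn between the surrounding measurements.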