From R to Custom PA Component Part 6
This is part 6 of the blog series; the previous blogs can be found here.
In this blog I’m going to focus on how we create our own component for Predictive Analytics. The component we are creating will be a time series component. Predictive Analytics already has time series components; we will, however, create our own so that we can learn and also compare.
Before we start creating our own component, we will create a predictive model on the same data as our custom component. We can then compare our custom component with the standard components.
We will create a simple component as shown below. The source data is a dow_jones_index.csv file that is attached to the blog. This data came from http://archive.ics.uci.edu/ml/datasets.html
I have edited the data slightly. The data holds information on specific stocks: their open price, etc.
The file has many stock items. We will analyse a specific stock, so make sure to filter and only look at the stock ‘PFE’.
Ensure the R-Single Exponential Smoothing component has the below settings. Note we are using custom periods. This is because the file does not have data for the whole year, and the data is in weeks. So for this example we are going to analyse 30 weeks: we have data for 25 weeks, so we will predict 5 weeks.
Once we run the model we should have the below diagram and supporting data.
Create R Code for Custom Component
We will now create our custom component and hopefully get similar results. To start creating the component it is best to open R. In order to test with our data we need two parts: a function, and a part that reads the data and calls the function. The second part is just for us to test; once we are ready to move into Predictive Analytics we will only copy the function.
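To make the two parts concrete, here is a minimal sketch of such a test harness. It is not the original code from this blog: the function name ts_component, the column names and the three data rows are all made up for illustration, and a small temporary file stands in for dow_jones_index.csv.

```r
# Hypothetical test harness; names, columns and values are illustrative only.

# Part 1: the function. For now it simply hands the data back.
ts_component <- function(mydata) {
  return(list(out = mydata))
}

# Part 2: read the data and call the function (test code only).
# A tiny made-up file stands in for dow_jones_index.csv.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("stock,date,open",
             "PFE,1/7/2011,18.2",
             "AA,1/7/2011,16.1",
             "PFE,1/14/2011,18.4"),
           csv_path)

stocks <- read.csv(csv_path, stringsAsFactors = FALSE)
result <- ts_component(stocks)
print(result$out)
```

Only part 1 would later be copied into Predictive Analytics; part 2 stays behind in R.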
We will now create a variable that is a data frame holding the values from the file, except that we filter by stock == PFE. Also note the return statement now returns the filtered data.
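A filter like this could look roughly as follows. This is a sketch under assumptions: the variable name datafilter, the column names stock and open, and the sample values are mine and may not match the actual file.

```r
# Hypothetical sketch of the filter step; column names and values are assumed.
ts_component <- function(mydata) {
  # Keep only the rows for the PFE stock.
  datafilter <- mydata[mydata$stock == "PFE", ]
  return(list(out = datafilter))
}

# Made-up sample data standing in for the CSV file.
stocks <- data.frame(stock = c("PFE", "AA", "PFE", "AA"),
                     open  = c(18.2, 16.1, 18.4, 16.3),
                     stringsAsFactors = FALSE)

result <- ts_component(stocks)
print(result$out)
```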
We have now added two more lines. In the first line we calculate the smoothed mean and store it in a variable named datafilter.mean; we do this by making use of the HoltWinters function, which is part of R's stats package. In the second line we predict for 5 periods.
You will start seeing values in our R output that correspond to our already created model. For example, below you can see the alpha value and confidence level. You will also see the periods to predict is 5, as in the model.
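As a rough sketch of those two lines: HoltWinters with beta and gamma switched off performs single exponential smoothing, and predict with n.ahead = 5 forecasts 5 periods. The 25 weekly prices and the alpha of 0.3 below are assumptions for illustration; use the alpha from your own model settings. The name datafilter.predict is also my own.

```r
# Made-up weekly open prices standing in for the 25 weeks of PFE data.
set.seed(42)
open_prices <- 18 + cumsum(rnorm(25, sd = 0.3))

# Single exponential smoothing: no trend (beta) and no seasonality (gamma).
# alpha = 0.3 is an assumed value, not taken from the original model.
datafilter.mean <- HoltWinters(ts(open_prices), alpha = 0.3,
                               beta = FALSE, gamma = FALSE)

# Forecast the next 5 periods (weeks).
datafilter.predict <- predict(datafilter.mean, n.ahead = 5)
print(datafilter.predict)
```

With single exponential smoothing the forecast is a flat line, so all 5 predicted values are the same.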
We then show the values on a graph. In R it is a plot.
We should already start to see similarities between our component and the Predictive Analytics model. If we look below, the graphs don’t look similar at first. But they are: the graphs look different because in R our y axis only goes up to 21 in increments of 0.5, whereas the Predictive Analytics model's y axis goes up to 25.
So by defining the y axis we can now see that the graphs match.
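One way to fix the y axis is the ylim argument of plot. The sketch below uses made-up prices and an assumed alpha of 0.3. The upper limit of 25 comes from the Predictive Analytics chart, while the lower limit of 0 is my assumption.

```r
# Made-up weekly open prices standing in for the 25 weeks of PFE data.
set.seed(42)
open_prices <- 18 + cumsum(rnorm(25, sd = 0.3))

# alpha = 0.3 is an assumed value.
datafilter.mean <- HoltWinters(ts(open_prices), alpha = 0.3,
                               beta = FALSE, gamma = FALSE)
datafilter.predict <- predict(datafilter.mean, n.ahead = 5)

# Fix the y axis so the scale is comparable with the other chart.
# The lower bound of 0 is an assumption; 25 is the upper bound mentioned above.
y_limits <- c(0, 25)
plot(datafilter.mean, datafilter.predict, ylim = y_limits)
```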
We can now also add two more lines of code that will also indicate an optimistic value and pessimistic value. In other words the highest value one can expect, then the lowest value one can expect.
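One possible way to get such values in R is to ask predict for a prediction interval and read off the upper and lower bounds. This may differ from the lines used in this blog; the data, the alpha of 0.3, the level of 0.95 and the variable names optimistic and pessimistic are all assumptions.

```r
# Made-up weekly open prices standing in for the 25 weeks of PFE data.
set.seed(42)
open_prices <- 18 + cumsum(rnorm(25, sd = 0.3))

# alpha = 0.3 is an assumed value.
datafilter.mean <- HoltWinters(ts(open_prices), alpha = 0.3,
                               beta = FALSE, gamma = FALSE)

# Request a prediction interval; level = 0.95 is an assumed confidence level.
datafilter.predict <- predict(datafilter.mean, n.ahead = 5,
                              prediction.interval = TRUE, level = 0.95)

# Upper bound = optimistic value, lower bound = pessimistic value.
optimistic  <- datafilter.predict[, "upr"]
pessimistic <- datafilter.predict[, "lwr"]
print(cbind(optimistic, pessimistic))
```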
I have now added several lines of code to make new data sets with fewer columns, and to combine the forecast (predicted) values with the original actuals. We need to return a single list to show in Predictive Analytics. I also added date values to the predicted values.
The R code will now return this new formatted list.
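Putting the pieces together, the whole function could be sketched as below. This is my own reconstruction under assumptions, not the original code: the column names (date, actual, forecast, optimistic, pessimistic), the alpha of 0.3, the sample data and the list element name out are all illustrative.

```r
# Hypothetical end-to-end sketch; names, alpha and data are assumptions.
ts_component <- function(mydata) {
  # Filter to PFE and keep only the columns we need.
  datafilter <- mydata[mydata$stock == "PFE", c("date", "open")]

  datafilter.mean <- HoltWinters(ts(datafilter$open), alpha = 0.3,
                                 beta = FALSE, gamma = FALSE)
  datafilter.predict <- predict(datafilter.mean, n.ahead = 5,
                                prediction.interval = TRUE)

  # Actuals: forecast columns are empty.
  actuals <- data.frame(date = datafilter$date, actual = datafilter$open,
                        forecast = NA_real_, optimistic = NA_real_,
                        pessimistic = NA_real_)

  # Forecast rows: weekly dates continuing after the last actual date.
  future <- data.frame(
    date = seq(max(datafilter$date) + 7, by = "week", length.out = 5),
    actual = NA_real_,
    forecast = as.numeric(datafilter.predict[, "fit"]),
    optimistic = as.numeric(datafilter.predict[, "upr"]),
    pessimistic = as.numeric(datafilter.predict[, "lwr"]))

  output <- rbind(actuals, future)
  rownames(output) <- NULL
  return(list(out = output))
}

# Made-up data: 25 weeks for two stocks.
set.seed(42)
weeks <- seq(as.Date("2011-01-07"), by = "week", length.out = 25)
stocks <- rbind(
  data.frame(stock = "PFE", date = weeks,
             open = 18 + cumsum(rnorm(25, sd = 0.3))),
  data.frame(stock = "AA", date = weeks,
             open = 16 + cumsum(rnorm(25, sd = 0.3))))

result <- ts_component(stocks)
print(tail(result$out, 7))
```

The result has 30 rows: 25 weeks of actuals followed by 5 weeks of forecast.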
Create Custom Predictive Component
We are now ready to take our component and place it into Predictive Analytics. Go to the below location to create the component.
The below will be displayed. Enter the component name. Click next.
You then place the R code in the Script Editor. Only place the code for the function. Do not place the part that reads the file and calls the function, as we will do this in Predictive Analytics.
We then list the columns and the data types for each column that we expect the function to return. We can click finish.
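Before filling in this list, it can help to check in R which columns and R types the function actually returns. The sketch below does this for a small made-up output data frame; the column names are assumptions, and how R types map to the data types offered in the dialog is not covered here.

```r
# Made-up example of a returned data frame; column names are assumptions.
output <- data.frame(date = as.Date(c("2011-06-24", "2011-07-01")),
                     actual = c(20.1, NA),
                     forecast = c(NA, 20.3))

# List each column with its (first) R class.
column_types <- sapply(output, function(col) class(col)[1])
print(column_types)
```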
Under our algorithms, at the end you can see your custom one. We can now model with it. I have not used the filter as we have the filter in the component.
We can now run the model and see the results in Predictive Analytics.
Below is the data we returned.
Hope the above helped. Part 7 will focus on making the above more dynamic and will be posted soon.
Find more info on twitter @louisdegouveia