 # From R to Custom PA Component Part 3

Intro

This is part 3 of a series of blogs.

Please refer to part 1 and part 2, which can be found here:

From R to Custom PA Component Part 2

In this blog I will focus on syntax that is intermediate for a beginner; an experienced R developer will still consider it basic. You will, however, still need this knowledge to create a Predictive Analytics component.

I will cover the following

• Vectors
• Matrix
• Data Frames
• R Scripts
• Functions
• Loops
• Graphs

Then we will review the differences between RGui and RStudio.

Vectors

In R, a vector is basically what we would call an array in other programming languages. A vector's values can be numbers, characters or any other type, but they should all be the same type.
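The vector operations walked through in this section can be sketched roughly as follows. The variable names X and Y come from the text; the actual values, the element names and the fruit vector are illustrative assumptions, and appending with c is only one of several ways to do it.

```r
# Create a vector with c() and store it in a variable (values are illustrative).
X <- c(10, 20, 30, 40, 50)
print(X)

# Assign a range of values: Y holds 5, 6, 7, 8, 9.
Y <- 5:9

# Access the value at position 3 (R counts positions from 1).
third <- X[3]

# Append a value by combining the existing vector with a new value.
X <- c(X, 60)

# Change the second value.
X[2] <- 25

# Assign a name to each value, then access a value by its name.
names(X) <- c("a", "b", "c", "d", "e", "f")
by_name <- X["c"]

# The same steps work with text (hypothetical example vector).
fruit <- c("apple", "banana", "cherry")
fruit[2] <- "blueberry"
```

Note that R counts positions from 1, not from 0 as many other languages do.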

Start by opening RGui again and type straight into the console. To create a vector we use the c function, which is short for combine. It combines the values to make a vector. You can call c on its own to create a vector, but then the result is not stored in a variable, and in reality we want to store everything in a variable. So I store a list of values in X, then display the content of X.

You can also assign a range of values. For example, variable Y can hold the values from 5 to 9. We can access a single value in the vector, such as the value at position 3. We can append values, and we can change one of the values, for example the second one. We can assign names to each value in the vector, and you can then access a value by the name you assigned to it.

All the vector steps described above can be applied to other data types, such as text.

Matrix

In R, a matrix is basically what we would call a two-dimensional array in other programming languages.
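As a minimal sketch of the matrix steps in this section, the snippet below uses the variable name mat from the text. The helper names row2 and col3 are assumptions added for illustration.

```r
# Create a 3 x 3 matrix with every cell defaulted to 1.
mat <- matrix(1, nrow = 3, ncol = 3)
print(mat)

# Change row 1, column 3 to 5; the index order is [row, column].
mat[1, 3] <- 5

# Access a whole row or a whole column by leaving the other index empty.
row2 <- mat[2, ]
col3 <- mat[, 3]
print(row2)
print(col3)
```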

Here is how to create a matrix. mat is a variable. The function matrix creates the matrix; I tell it to create 3 rows and 3 columns and to default every value to 1. You can see the result by displaying the variable mat.

We can change the value of a specific item in the matrix. The syntax is similar to a vector, except that we must give both the row and the column, in that order. For example, we can change row 1, column 3 to have the value 5.

We can also access all the values in a row or in a column, for example by first displaying row 2 and then displaying column 3.

Data Frames

A data frame is similar to a matrix, except that a data frame can hold a different type of data in each column, where a matrix can't. A data frame is also easier to work with; however, a matrix is more efficient when it comes to performance.
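A small sketch of building a data frame from two vectors might look like this. Only the name employee.data comes from the text; the vector names, the column names and the sample values are made up for illustration.

```r
# Two vectors of different types (sample values are hypothetical).
employee.names <- c("Anna", "Ben", "Carla")
employee.salaries <- c(52000, 48000, 61000)

# Combine them into a data frame; each vector becomes a column.
employee.data <- data.frame(name = employee.names, salary = employee.salaries)
print(employee.data)

# Access it like a matrix: [row, column].
second.row <- employee.data[2, ]
salaries <- employee.data[, 2]
```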

In this example I create two vectors, one with employee names and another with salaries. I then combine them into a data frame. employee.data is a variable. You can then access the data frame the same way as you would access a matrix.

R Scripts

Up until now we have entered everything in the console. We have done this to learn the syntax and understand how the console works. But in reality you would not work directly in the console. You would create an R script and enter everything in there.
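A script is simply a text file of R statements. As a minimal sketch, a script that multiplies two variables and outputs the result could contain the following (using print for the output is an assumption):

```r
# Contents of a small R script: multiply two variables and output the result.
X <- 10
Y <- 2
Z <- X * Y
print(Z)
```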

To create a script, go to File->New script. You can now add R code to the script. In this example I created variable X with value 10, created variable Y with value 2, created variable Z with X*Y, and then output Z.

To execute the lines, highlight them, then right click and select “Run line or selection”. You can also select one line at a time and execute it. R will then execute the commands in the console.

Functions

Being able to write functions is important; you will need this to create a custom PA component.

The basic syntax for a function is

myfunction <- function(arg1, arg2, …) {

function body

}
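To make the template concrete, here is a rough sketch of two functions: one that only prints its result, and one that returns it. The function names add.and.print and add.values are hypothetical, chosen only for this illustration.

```r
# Basic version: receives two values, adds them in Z and prints Z.
add.and.print <- function(x, y) {
  Z <- x + y
  print(Z)
}
add.and.print(2, 3)

# More typical version: returns the value so the caller can store it.
add.values <- function(x, y) {
  Z <- x + y
  return(Z)
}
result <- add.values(2, 3)
print(result)
```

Returning the value, rather than only printing it, lets the calling code keep working with the result.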

As an example, take a function that receives two values, adds them together in variable Z and prints the Z variable out. This is a very basic function. Please note that from now on I will always create a script, place the R code in the script, and then execute from the script.

That example is very basic; we would not normally write a function like that. We would usually write a function so that it returns a value, and we will need to return a value when creating a component for Predictive Analytics. So be sure to understand how a function returns a value.

Loops

There are different types of loops in R. I will cover two of them.
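The two loops covered here are the for loop and the while loop. A minimal sketch of both, using the variable names X, Z and i from the walkthrough, could look like this. The exact body of the while loop is an assumption.

```r
# for loop: the number of iterations is fixed before the loop starts.
X <- 1:10
Z <- NULL
for (i in 1:10) {
  Z <- X[i] + 1   # take the value at position i, add 1, replace Z
  print(Z)
}

# while loop: repeats for as long as the condition i <= 10 is true.
i <- 1
while (i <= 10) {
  print(i)
  i <- i + 1      # without this increment the loop would never end
}
```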

The first is the for loop. In this example I assign the X variable a range of values from 1 to 10. The Z variable is created and set to NULL. Then in the for loop we say that the variable i will start from 1 and go up to 10, so the loop will repeat the code between the curly brackets 10 times. In each iteration I get the value in X at position i. So if the loop is on its third cycle, the variable i will hold 3, and we are saying get the value in variable X at position 3. In this case the value at position 3 is 3, but it could have been a different value. We then add 1 to the value and assign it to Z. Each time I replace Z, then print the value.

The second is the while loop. The biggest difference is that with a for loop the number of iterations is determined before the loop starts, whereas a while loop keeps going for as long as its condition holds, and the values in the condition change as the loop runs. In the example, I increment the i variable inside the while loop, and the loop stops once i <= 10 is no longer true.

You will need to be able to read files to work with sets of data that you have in text files.
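A self-contained sketch of reading a file is shown here. It assumes a comma-separated file and uses read.csv; the file name employees.csv, its contents and the temporary directory are all made up so the example can run anywhere, and your own file may need read.table or different arguments.

```r
# Hypothetical setup: write a small sample file into a temporary directory.
data.dir <- tempdir()
writeLines(c("name,salary", "Anna,52000", "Ben,48000"),
           file.path(data.dir, "employees.csv"))

# Set the working directory to where the file is, remembering the old one.
old.dir <- setwd(data.dir)

# Read the file; header = TRUE marks the first row as the column names.
employees <- read.csv("employees.csv", header = TRUE)
print(employees)

# Restore the previous working directory.
setwd(old.dir)
```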

In this example I use the setwd function to set the directory where the file is. I then read in the data by specifying the file name, and when reading in the data I indicate that the first row is the header. Then I display the data. Reading the data in from the file creates a data frame, which can hold the different data types in the file.

Graphs

There are several options and libraries for graphs.

I am going to stick to the basic ones.
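The basic plotting functions used in this section are barplot, plot and plot.ts. A short sketch, using the vector name graphvalues from the text with made-up values and labels:

```r
# Values to plot (sample data), with a name for each value.
graphvalues <- c(12, 7, 15, 9)
names(graphvalues) <- c("Q1", "Q2", "Q3", "Q4")

# Bar plot: the names label the bars along the x axis.
barplot(graphvalues)

# Same data drawn as points against their position in the vector.
plot(graphvalues)

# Same data drawn as a time series line.
plot.ts(graphvalues)
```

When run from a script outside an interactive session, R usually writes the plots to a file rather than opening a window.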

Here is an example of a bar graph, called a bar plot in R. I create a vector called graphvalues, then assign names to the vector as we have done previously. These names are used on the x axis of the bar plot. Then I call the function barplot and pass the vector as a parameter. When you execute this, the bar plot is displayed.

To plot the values as points instead, use the same code as above and just change barplot(graphvalues) to plot(graphvalues).

To plot them as a time series line, just change plot(graphvalues) to plot.ts(graphvalues).

RStudio Differences

Here are the differences when doing everything above in RStudio. The script takes up the top left window and the console moves to the bottom. Your plots/graphs can be shown on the bottom right.

When working with data frames we can see another difference: displaying the data in the data frame is easier. You can see the data in an easy-to-scroll window that shows it in an Excel-like grid.

Hope you find this useful.

Part 4 can be found here: From R to Custom PA Component Part 4