“Can I run today?” Or a rapid simple-perceptron implementation in HANA.
Let’s talk a little bit about artificial neural networks (ANNs). In the world of ANNs, the simple-perceptron (SP) is the most basic unit of information processing. It is built to respond to input data much as a biological neuron does: its response is the weighted sum of the inputs, passed through an activation function.
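As a rough illustration of that response, here is a minimal Python sketch. The post does not say which activation function the HANA procedures use, so the step function below (output 1 when the weighted sum is non-negative, 0 otherwise) is an assumption, and the names step and perceptron_output are hypothetical.

```python
def step(z):
    """Assumed activation: Heaviside step, 1 if z >= 0, else 0."""
    return 1 if z >= 0 else 0


def perceptron_output(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return step(weighted_sum)
```

For example, perceptron_output([1.0, -1.0], 0.0, [2.0, 1.0]) computes 2.0 - 1.0 + 0.0 = 1.0 and therefore responds with class 1.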
The main task of an SP is to classify: it distinguishes between two kinds of objects. After learning from a training set, the SP can perform a binary classification, i.e., given a new object, it determines the class to which that object belongs. The training set is formed from some elements of each class. Learning consists of an algorithm that tunes a set of parameters called weights. The tuning is driven by the learning error: the bigger the error, the bigger the weight correction. This is an iterative process, and it finishes when the error falls below a given tolerance.
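A single correction step of this kind can be sketched in Python as follows. This is the classic perceptron rule, not necessarily the exact rule implemented in the HANA procedure; the function name update, the step activation, and the rate parameter are assumptions made for illustration.

```python
def update(weights, bias, inputs, target, rate):
    """One error-driven correction (classic perceptron rule, assumed here).

    Returns the corrected weights, the corrected bias, and the error.
    """
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    predicted = 1 if weighted_sum >= 0 else 0  # assumed step activation
    error = target - predicted                 # -1, 0, or 1
    # The correction is proportional to the error; no error, no change.
    new_weights = [w + rate * error * x for w, x in zip(weights, inputs)]
    new_bias = bias + rate * error
    return new_weights, new_bias, error
```

When the object is already classified correctly the error is 0 and the weights stay as they are; otherwise every weight moves in the direction that reduces the error on that object.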
There is a convergence criterion. When the training set is linearly separable, the learning algorithm converges and the learning error reaches zero. What does this mean? To visualize it, let’s pretend that we are working in only two dimensions, so the training set consists of points in the Cartesian plane. The training set is then linearly separable exactly when we can draw a line such that all the points of one class lie on one side of it and all the points of the other class lie on the other side.
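To make the two-dimensional picture concrete, the following Python sketch checks whether a given line a*x + b*y + c = 0 separates two sets of points. The function name separates and the sample points are hypothetical; the check only tests one candidate line, it does not search for one.

```python
def separates(a, b, c, class_a, class_b):
    """True if the line a*x + b*y + c = 0 leaves every point of class_a
    strictly on one side and every point of class_b strictly on the other."""
    def side(point):
        x, y = point
        return a * x + b * y + c

    sides_a = [side(p) for p in class_a]
    sides_b = [side(p) for p in class_b]
    a_pos_b_neg = all(s > 0 for s in sides_a) and all(s < 0 for s in sides_b)
    a_neg_b_pos = all(s < 0 for s in sides_a) and all(s > 0 for s in sides_b)
    return a_pos_b_neg or a_neg_b_pos
```

With the points (0, 0) and (1, 0) in one class and (3, 3) and (4, 2) in the other, the line x + y - 3 = 0 separates them. For the XOR-style layout (0, 0), (1, 1) against (0, 1), (1, 0), the line x + y - 1 = 0 does not, and in fact no line does.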
So, where are we going with all this? We want to train an SP to decide whether we can leave home to exercise, given some environmental conditions. The training set will be built from historical data. In our particular case, we take the historical data from the Secretaría de Medio Ambiente del DF (Ministry of Environment of Mexico City).
The classification is based on the IMECA index (Índice Metropolitano de la Calidad del Aire, Mexico City’s air-quality index). Each element of the training set carries the following information:
[zone, imeca(O3), imeca(SO2), imeca(NO2), imeca(CO), imeca(PM10)]
The data used for training covers the whole of 2012. The associated (and processed) CSV is located here. We need to declare three tables: TRAIN for the training set, W for storing the weights, and PARAM for the final version of the weights. The only two tables that you need to initialize are W and PARAM.
CREATE COLUMN TABLE "CARLOS"."W" ("ITER" INTEGER NULL, "W1" DECIMAL(10,4) NULL, "W2" DECIMAL(10,4) NULL, "W3" DECIMAL(10,4) NULL, "W4" DECIMAL(10,4) NULL, "W5" DECIMAL(10,4) NULL, "W6" DECIMAL(10,4) NULL, "B" DECIMAL(10,4) NULL);
INSERT INTO "CARLOS"."W" VALUES (1, 0, 0, 0, 0, 0, 0, 0);
CREATE COLUMN TABLE "CARLOS"."PARAM" ("W1" DECIMAL(10,4) NULL, "W2" DECIMAL(10,4) NULL, "W3" DECIMAL(10,4) NULL, "W4" DECIMAL(10,4) NULL, "W5" DECIMAL(10,4) NULL, "W6" DECIMAL(10,4) NULL, "B" DECIMAL(10,4) NULL);
INSERT INTO "CARLOS"."PARAM" VALUES (0, 0, 0, 0, 0, 0, 0);
You can get the learning algorithm from here. You can then call the training procedure with, for example, these parameters:
where “1000” is the maximum number of iterations, in case there is no convergence; “0.01” is the error tolerance; and “0.2” sets the learning rate.
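To show how those three parameters interact, here is a minimal Python sketch of a training loop. It is not the SQLScript procedure from the post: the function names predict and train, the step activation, and the use of the fraction of misclassified samples as the learning error are all assumptions.

```python
def predict(weights, bias, inputs):
    """Assumed step activation over the weighted sum."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if weighted_sum >= 0 else 0


def train(samples, max_iter, tolerance, rate):
    """Train on (inputs, target) pairs; stop early once the error is small.

    Returns the weights, the bias, and the number of iterations used.
    """
    weights = [0.0] * len(samples[0][0])  # start from zero, like W and PARAM
    bias = 0.0
    iteration = 0
    for iteration in range(1, max_iter + 1):
        mistakes = 0
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                weights = [w + rate * error * x
                           for w, x in zip(weights, inputs)]
                bias += rate * error
        # Assumed error measure: fraction of misclassified samples.
        if mistakes / len(samples) < tolerance:
            break
    return weights, bias, iteration


# Toy, linearly separable data (logical AND), not the IMECA training set.
samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias, used = train(samples, 1000, 0.01, 0.2)
```

The maximum number of iterations only matters when the data is not linearly separable; on separable data such as the toy set above, the loop exits as soon as a full pass produces an error below the tolerance.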
The execution might take some time, depending on the parameters you choose. Once the procedure has finished, we can predict; that is, we can ask the perceptron whether, given some conditions, we can go out to exercise. We do this by executing another SQLScript procedure, which takes the PARAM values and evaluates the activation function.
CALL "_SYS_BIC"."perceptron.simplePerceptron.scripts/testIMECA"(z, x1, x2, x3, x4, x5, x6, ?);
If the answer is 1, we can go out.