Regression Problem using Apple's Machine Learning Framework
Introduction
In this blog post, we will solve a regression problem using Create ML, Apple's machine learning framework. Like every field, machine learning has several levels of abstraction, and the right one depends on the work you need to do. For seasoned iOS developers, Apple provides task-based approaches for adding machine learning to their apps. In my view, the task-based level is the highest of these abstraction levels: you do not need to be a machine learning expert to perform routine machine learning tasks in an iOS app.
Apple provides two state-of-the-art toolkits: Create ML and Turi Create. Using transfer learning, we can build an accurate custom model with a small dataset, because the base models for both toolkits have been trained on very large datasets. With Create ML, everything happens in Xcode using Swift. Turi Create uses Python, so we train the model in an environment such as a Jupyter notebook and then export it to the Core ML format for use within the app. Note also that Create ML requires macOS, whereas Turi Create is cross-platform.
Create ML is essentially a subset of Turi Create. In this tutorial we restrict ourselves to Create ML, and everything is done in Swift.
Problem Statement
In this blog post, we want to predict the compressive strength of concrete, which is a good indicator of its quality. Compressive strength is measured in megapascals (MPa). As a side note, cement and concrete should not be confused: cement is a powder, whereas concrete is a mixture of cement, coarse aggregates and water that is liquid-like when fresh. This makes a nice use case, since concrete is a vital material in civil engineering and one of the most widely used construction materials.
Pre-requisites
You will need a Mac running macOS with Xcode installed.
Data Source
We will use the Concrete Compressive Strength dataset from the UC Irvine (UCI) Machine Learning Repository.
The dataset has 1030 rows and 9 columns. The last column “csMPa” is the target variable which we want to predict.
Dataset
Getting Started
First, download the data, which is a CSV file, from the UCI repository mentioned above and save it in a folder.
Open Xcode and go to File -> New -> Playground
Create Playground file
Select Blank under macOS, then click Next
Select Blank and MacOS
Give the playground file a name
Give name to Playground file
Import these two frameworks
import CreateML
import Foundation
Read the CSV file you downloaded, using the appropriate path.
Create ML has a dedicated data structure called MLDataTable for representing tabular data. If you are familiar with Python's pandas, you can think of MLDataTable as similar to a DataFrame, although it is not as flexible as pandas.
let dataFile = URL(fileURLWithPath: "/YourPath/Dataset/Concrete_Data_Yeh 2.csv")
let data = try MLDataTable(contentsOf: dataFile)
print(data.size)
Run the code at this point and the console should show the number of rows and columns in the dataset:
(rows: 1030, columns: 9)
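Before going further, it can help to look at what was actually loaded. The following is a minimal sketch that reuses the placeholder path from above: it prints the column names and a textual summary of the table, then builds a tiny table by hand to show how MLDataTable resembles a pandas DataFrame created from a dict. The names `columns` and `sample`, and the numbers in them, are made up for illustration and are not taken from the dataset.

```swift
import CreateML
import Foundation

// Placeholder path from the tutorial; point it at your own copy of the CSV.
let dataFile = URL(fileURLWithPath: "/YourPath/Dataset/Concrete_Data_Yeh 2.csv")
let data = try MLDataTable(contentsOf: dataFile)

// Column names as read from the CSV header row.
print(data.columnNames)

// Printing the table itself should show a textual summary of its contents.
print(data)

// A table can also be built by hand from named columns, much like a
// pandas DataFrame from a dict. These values are made up for illustration.
let columns: [String: MLDataValueConvertible] = [
    "cement": [300.0, 450.0],
    "csMPa": [30.0, 55.0]
]
let sample = try MLDataTable(dictionary: columns)
print(sample.size)
```

If you paste this into the same playground as the tutorial code, drop the first two declarations, since `dataFile` and `data` already exist there.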
Split the data into training and test sets:
let (trainData, testData) = data.randomSplit(by: 0.8, seed: 0)
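To make the idea of a seeded 80/20 split concrete, here is a small sketch in plain Swift. It is not how Create ML implements randomSplit, and Create ML's split may not produce exactly these counts; the generator `SplitMix64` and the helper `splitIndices` are hypothetical names introduced only for this illustration. The point is that a fixed seed gives the same partition on every run.

```swift
import Foundation

// Small deterministic generator (SplitMix64) so the shuffle is reproducible.
struct SplitMix64: RandomNumberGenerator {
    private var state: UInt64
    init(seed: UInt64) { state = seed }
    mutating func next() -> UInt64 {
        state &+= 0x9E3779B97F4A7C15
        var z = state
        z = (z ^ (z >> 30)) &* 0xBF58476D1CE4E5B9
        z = (z ^ (z >> 27)) &* 0x94D049BB133111EB
        return z ^ (z >> 31)
    }
}

// Hypothetical helper: shuffle the row indices with a seeded generator and
// cut the shuffled list at the requested training fraction.
func splitIndices(count: Int, trainFraction: Double, seed: UInt64) -> (train: [Int], test: [Int]) {
    var generator = SplitMix64(seed: seed)
    let shuffled = Array(0..<count).shuffled(using: &generator)
    let cut = Int((Double(count) * trainFraction).rounded())
    return (Array(shuffled[..<cut]), Array(shuffled[cut...]))
}

// 1030 rows, as in the concrete dataset.
let split = splitIndices(count: 1030, trainFraction: 0.8, seed: 0)
print(split.train.count, split.test.count)
```

Calling `splitIndices` again with the same seed returns the same two index lists, which is what makes results comparable between runs.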
Let's train the model now. When we create an MLRegressor, it inspects our data and automatically chooses a specific regressor. The supported regressors are linear, decision tree, boosted tree and random forest.
let model = try MLRegressor(trainingData: trainData, targetColumn: "csMPa")
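If you would rather pick the regressor yourself than let MLRegressor decide, Create ML also exposes the individual regressor types. The sketch below is optional and assumes the same placeholder path and split as above; it trains a boosted tree and a linear regressor and prints their validation errors so the two can be compared. The variable names `boosted` and `linear` are only illustrative.

```swift
import CreateML
import Foundation

// Same placeholder path and split as in the tutorial.
let dataFile = URL(fileURLWithPath: "/YourPath/Dataset/Concrete_Data_Yeh 2.csv")
let data = try MLDataTable(contentsOf: dataFile)
let (trainData, testData) = data.randomSplit(by: 0.8, seed: 0)

// Train two specific regressors instead of letting MLRegressor choose.
let boosted = try MLBoostedTreeRegressor(trainingData: trainData, targetColumn: "csMPa")
let linear = try MLLinearRegressor(trainingData: trainData, targetColumn: "csMPa")

// Compare them on the validation data Create ML held out during training.
print("Boosted tree validation RMSE:", boosted.validationMetrics.rootMeanSquaredError)
print("Linear validation RMSE:", linear.validationMetrics.rootMeanSquaredError)
```

The rest of this tutorial continues with the model created through MLRegressor.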
Run the code so far and you should see the following output in the console:
Console output
You can observe that as the number of iterations increases, the root-mean-square error and the max error decrease.
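For readers unfamiliar with these two figures, the sketch below shows how they are commonly defined: the root-mean-square error is the square root of the mean squared difference between actual and predicted values, and the max error is taken here to be the largest absolute difference. This is an illustration in plain Swift with made-up numbers, on the assumption that Create ML reports the same quantities; the function names are hypothetical.

```swift
import Foundation

/// Root-mean-square error between actual and predicted values.
/// Both arrays must be non-empty and of equal length.
func rootMeanSquaredError(actual: [Double], predicted: [Double]) -> Double {
    precondition(!actual.isEmpty && actual.count == predicted.count)
    let squaredErrors = zip(actual, predicted).map { ($0 - $1) * ($0 - $1) }
    return (squaredErrors.reduce(0, +) / Double(actual.count)).squareRoot()
}

/// Largest absolute difference between an actual and a predicted value.
func maximumError(actual: [Double], predicted: [Double]) -> Double {
    precondition(!actual.isEmpty && actual.count == predicted.count)
    return zip(actual, predicted).map { abs($0 - $1) }.max() ?? 0
}

// Made-up strengths in MPa, for illustration only.
let actual = [40.0, 50.0, 60.0, 70.0]
let predicted = [43.0, 46.0, 60.0, 70.0]
print(rootMeanSquaredError(actual: actual, predicted: predicted)) // prints 2.5
print(maximumError(actual: actual, predicted: predicted))         // prints 4.0
```

Because the errors are squared before averaging, a few large misses raise the root-mean-square error more than many small ones.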
You can also print the metrics as follows:
print("Training Metrics\n", model.trainingMetrics)
print("Validation Metrics\n", model.validationMetrics)
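The split earlier produced a testData table that the training step never sees. As a minimal sketch, assuming the same placeholder path and column name as above, the held-out rows can be used to check how the model performs on unseen data; `testMetrics` is just an illustrative variable name.

```swift
import CreateML
import Foundation

// Same placeholder path, split and training call as in the tutorial.
let dataFile = URL(fileURLWithPath: "/YourPath/Dataset/Concrete_Data_Yeh 2.csv")
let data = try MLDataTable(contentsOf: dataFile)
let (trainData, testData) = data.randomSplit(by: 0.8, seed: 0)
let model = try MLRegressor(trainingData: trainData, targetColumn: "csMPa")

// Evaluate on rows the model did not see during training.
let testMetrics = model.evaluation(on: testData)
print("Test RMSE:", testMetrics.rootMeanSquaredError)
print("Test max error:", testMetrics.maximumError)
```

A test error much larger than the training error would suggest the model is overfitting the training rows.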
Now let's save the trained model to a local directory of your choice. Make sure you provide an appropriate path.
let savedModel = MLModelMetadata(author: "Priyanshu Srivastava", shortDescription: "Model to predict compressive strength", license: "Personal Use", version: "1.0")
try model.write(to: URL(fileURLWithPath: "/YourPath/CompressiveStrength.mlmodel"), metadata: savedModel)
You will now see the model file in the folder you specified above
Model saved to local directory
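Before moving the file into an app, you may want to confirm which inputs the saved model expects. The sketch below assumes the placeholder path used above; it compiles the .mlmodel file, loads it with Core ML and prints the input features and the predicted feature name. Newer SDKs also offer an asynchronous variant of the compile call, so the synchronous one used here may produce a deprecation warning.

```swift
import CoreML
import Foundation

// Placeholder path from the tutorial; use the location you saved the model to.
let modelURL = URL(fileURLWithPath: "/YourPath/CompressiveStrength.mlmodel")

// A .mlmodel file has to be compiled before Core ML can load it.
let compiledURL = try MLModel.compileModel(at: modelURL)
let loadedModel = try MLModel(contentsOf: compiledURL)

// List the input features together with their descriptions.
for (name, feature) in loadedModel.modelDescription.inputDescriptionsByName {
    print("input:", name, "-", feature)
}

// Name of the feature the regressor predicts, if the model declares one.
print("predicts:", loadedModel.modelDescription.predictedFeatureName ?? "unknown")
```

The names printed here are the ones that appear as argument labels in the Swift class Xcode generates for the model.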
The complete code so far in the playground file should look as follows
import CreateML
import Foundation
let dataFile = URL(fileURLWithPath: "/YourPath/Dataset/Concrete_Data_Yeh 2.csv")
let data = try MLDataTable(contentsOf: dataFile)
print(data.size)
let (trainData, testData) = data.randomSplit(by: 0.8, seed: 0)
let model = try MLRegressor(trainingData: trainData, targetColumn: "csMPa")
print("Training Metrics\n", model.trainingMetrics)
print("Validation Metrics\n", model.validationMetrics)
let savedModel = MLModelMetadata(author: "Priyanshu Srivastava", shortDescription: "Model to predict compressive strength", license: "Personal Use", version: "1.0")
try model.write(to: URL(fileURLWithPath: "/YourPath/CompressiveStrength.mlmodel"), metadata: savedModel)
Now let's create an Xcode project and add the trained model to it.
Click File -> New -> Project -> Single View App -> Next, then give the project a name
Now add your trained ML model to the newly created Xcode project.
To do so, click File -> Add Files
Add .mlmodel file
While adding the model, make sure the options match the screenshot below.
Copy .mlmodel
Now click on the .mlmodel file and then click on the icon shown.
.mlmodel
This takes you to the following page, where you can see the auto-generated Swift file used to interact with the trained model
Auto generated Swift file for .mlmodel
Now go to ViewController.swift and replace the existing code with the following. CompressiveStrength is the name I gave to my ML model; yours will differ depending on the name you chose.
import UIKit
import CoreML

class ViewController: UIViewController {
    override func viewDidLoad() {
        super.viewDidLoad()
        // CompressiveStrength is the class Xcode generates from the .mlmodel file.
        let mlModel = CompressiveStrength()
        do {
            // Inputs: one concrete mix and its age in days.
            let prediction = try mlModel.prediction(cement: 540, slag: 0, flyash: 0, water: 162, superplasticizer: 2.5, coarseaggregate: 1040, fineaggregate: 676, age: 28)
            print("Compressive strength is: \(prediction.csMPa)")
        } catch {
            // Report the failure instead of crashing on a force unwrap.
            print("Prediction failed: \(error)")
        }
    }
}
When you run the code, the console shows the predicted compressive strength:
Compressive strength is: 67.05448651313782
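Recent Xcode versions also generate a throwing initializer that takes an MLModelConfiguration. The sketch below assumes your generated class offers init(configuration:) and uses the same input labels as the view controller code; `predictStrength` is a hypothetical helper and not part of the generated file. It only compiles inside a project that contains the .mlmodel file.

```swift
import CoreML

// Hypothetical helper, not part of the generated code. `CompressiveStrength`
// is the class Xcode generates from the .mlmodel file added to the project.
func predictStrength() -> Double? {
    do {
        // Assumes the generated class offers init(configuration:).
        let model = try CompressiveStrength(configuration: MLModelConfiguration())
        let output = try model.prediction(cement: 540, slag: 0, flyash: 0, water: 162,
                                          superplasticizer: 2.5, coarseaggregate: 1040,
                                          fineaggregate: 676, age: 28)
        return output.csMPa
    } catch {
        // Loading or prediction failed; report it and return nil.
        print("Prediction failed: \(error)")
        return nil
    }
}
```

Returning an optional keeps the calling code in control of what to show the user when the model cannot be loaded.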
Conclusion
If we compare this with the ground truth, we will observe that the prediction is quite close. We trained the model on a very limited amount of data, which is one important area for improvement. A common saying in the machine learning world is that the more quality training data you have, the better your predictions will be.
The big advantage here is that the trained model resides within your app, so inference requires no Internet connection. You can go ahead and build a UI around this.