Technology Blogs by Members
fabianl
Product and Topic Expert

Introduction


This post continues the story from the last blog, where we got started with how to get access to SAP Leonardo ML Foundation and which steps are required to be allowed to call the APIs.

 

SAP Leonardo ML Foundation Architecture:



 

In the following lines we focus on checking and executing the retraining of the "image" classifier with our own data.

In this blog I want to focus on the doing and not on ML in general. Furthermore, you cannot use the "retraining" functionality with a trial version!

 

Important: Currently only the "Image Classifier Service" can be used for the retraining.

 

Let's start...

 



In general, the retraining consists of the following four steps:

  1. Uploading the data for the training

  2. Executing the retraining job

  3. Deploy the model

  4. Execute the image classifier API


 

In detail we want

 

Please also check the SAP Help documentation.

 

Data, data, data


The first thing we need is, of course, some data as the source for training our new model.

Since spring is hopefully not far away, we simply use some nice flower data ;o)

Another (the real) reason is that TensorFlow provides an archive for that and we want to start simple.

Another good resource for other pictures is of course ImageNet or the Fatkun Batch Download Image plugin for Chrome.



As mentioned before, we start by downloading the flower archive from here to our local device.



A part of this data will be used later for our own "flower" model with the SAP Leonardo ML Image Classification service.

Get started and check the API


A good starting point is simply to enter the "retraining url" in a browser and have a look at the Swagger UI to get a first idea of which options we have:



In general we have three main parts for the retraining:

  • jobs

  • deployments

  • models


 

Data preparation


Before we can execute one of the APIs, we need to prepare our data and upload it to AWS.

To start simple, I've decided to reduce the amount of data that comes with the archive provided by TensorFlow. I think three categories of flowers will do.

For this create the following data structure:
+-- flowers
    +-- training
        +-- roses
        +-- sunflowers
        +-- tulips
    +-- test
        +-- roses
        +-- sunflowers
        +-- tulips
    +-- validation
        +-- roses
        +-- sunflowers
        +-- tulips

As documented, we need the three folders "training", "test" and "validation".

Furthermore, we split our source data 80-10-10 (~80% training, ~10% test and ~10% validation).
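
The split described above can be sketched in a few lines of Python. This is only a minimal sketch and not part of the original setup: the function names, the fixed random seed and the assumption that the extracted archive contains one sub-folder of .jpg files per flower category are my own choices for illustration.

```python
import random
import shutil
from pathlib import Path

# Folder names taken from the structure shown above.
SPLITS = ("training", "test", "validation")


def split_files(files, train_share=0.8, test_share=0.1, seed=42):
    """Shuffle the files and cut them into ~80/10/10 parts."""
    files = sorted(files)                # stable starting order
    random.Random(seed).shuffle(files)   # reproducible shuffle (seed is arbitrary)
    n_train = int(len(files) * train_share)
    n_test = int(len(files) * test_share)
    return {
        "training": files[:n_train],
        "test": files[n_train:n_train + n_test],
        "validation": files[n_train + n_test:],  # whatever is left over
    }


def build_dataset(source_dir, target_dir, labels=("roses", "sunflowers", "tulips")):
    """Copy <source_dir>/<label>/*.jpg into <target_dir>/<split>/<label>/."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    for label in labels:
        parts = split_files((source_dir / label).glob("*.jpg"))
        for split, files in parts.items():
            destination = target_dir / split / label
            destination.mkdir(parents=True, exist_ok=True)
            for file in files:
                shutil.copy2(file, destination / file.name)
```

A call such as build_dataset("flower_photos", "flowers") would then create the "flowers" folder with the three sub-folders; both paths are hypothetical and need to be adjusted to the local setup.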


Access the AWS object store


To get access to the object storage, which runs on Amazon Web Services (AWS), we can use "minio" to operate directly on the S3 object store.

You can get the minio client here: link

Additionally, we can access the data via a UI.

For this, and also for the CLI access, we first need to initialize our file system (this needs to be done only once) by executing the following API call:




















HTTP Method   GET
URL           <JOB_SUBMISSION_API_URL>
PATH          /v1/storage/endpoint
HEADER        Authorization (OAuth2 Access Token)


As a response we now get something like this:
{
  "access_key": "<access key>",
  "endpoint": "<endpoint>.files.eu-central-1.aws.ml.hana.ondemand.com",
  "message": "The endpoint is ready to use.",
  "secret_key": "<secret key>",
  "status": "Ready"
}
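
For those who prefer a script over a REST client, the call above could look roughly like this in Python, using only the standard library. Treat it as a sketch: the function names are made up for this example, and sending the OAuth2 access token as a "Bearer" token is an assumption on my side that is not stated above.

```python
import json
import urllib.request


def build_storage_endpoint_request(job_submission_api_url, access_token):
    """Prepare the GET request that initializes the object store."""
    url = job_submission_api_url.rstrip("/") + "/v1/storage/endpoint"
    # Assumption: the OAuth2 access token is passed as a Bearer token.
    headers = {"Authorization": "Bearer " + access_token}
    return urllib.request.Request(url, headers=headers, method="GET")


def fetch_storage_endpoint(job_submission_api_url, access_token):
    """Send the request and return the parsed JSON response as a dict."""
    request = build_storage_endpoint_request(job_submission_api_url, access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The returned dictionary would then contain the "access_key", "secret_key" and "endpoint" values needed for the next steps.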


The Minio UI

To get access to the S3 store via the minio UI, enter the URL and log on with the "access key" and the "secret key":



Afterwards we are able to see our bucket (data) with some data:


The CLI access

For the access via the CLI, we start again with the authentication:
>mc.exe config host add saps3 https://<your endpoint on aws s3>.files.eu-central-1.aws.ml.hana.ondemand.com <access key> <secret key>
Added `saps3` successfully.

Afterwards we can use the "mc" command to e.g. list our data (buckets), addressed via the "saps3" alias we just created:
mc.exe ls saps3/<bucket>/<directory>



Update: Using "cyberduck"

In addition to the previous tools, you can also use "cyberduck" to connect to your AWS S3 filesystem.

Create a new AWS S3 connection by entering the required data:



As a result you can access the data here:


Upload our data


Now it's time to upload our "custom" data which we want to use for our "retraining".

The easiest way is to copy our files by executing the cp command:
mc.exe cp -r E:\0_SAPCP\8_ML\1_SAP_ML\0_Development\1_first_try\flowers saps3\data
...bc557236c7_n.jpg: 146.19 MB / 146.19 MB [================================================] 100.00% 484.60 KB/s 5m8s

Afterwards we can see our uploaded data in our AWS S3 bucket:



 

In case something goes wrong, you can also use the following command to delete your data / bucket:
mc.exe rm --recursive --dangerous --force saps3/data

A complete overview of all commands can be displayed with the "--help" parameter:
mc.exe --help

 

Time for the retraining....execute the job


Now that our data is in place, we can execute our training by calling the corresponding API:



Details:




















HTTP Method   POST
URL           <RETRAIN_API_URL>
PATH          /v1/jobs
HEADER        Authorization (OAuth2 Access Token)


And the following Body:
{
  "mode": "image",
  "options": {
    "dataset": "flowers",
    "modelName": "flowers-demo"
  }
}

As a response we now get the "job id":
{
"id": "flowers-2018-02-15t0851z"
}
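
The job submission above can also be scripted. The following is a minimal sketch with the Python standard library; the function names are hypothetical, and both the "Bearer" prefix and the "Content-Type: application/json" header are assumptions that are not confirmed by the request details above.

```python
import json
import urllib.request


def build_retraining_job_request(retrain_api_url, access_token, dataset, model_name):
    """Prepare the POST request that submits a retraining job."""
    body = {
        "mode": "image",
        "options": {"dataset": dataset, "modelName": model_name},
    }
    headers = {
        "Authorization": "Bearer " + access_token,  # assumption: Bearer token
        "Content-Type": "application/json",         # assumption: JSON body header
    }
    return urllib.request.Request(
        retrain_api_url.rstrip("/") + "/v1/jobs",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def submit_retraining_job(retrain_api_url, access_token, dataset, model_name):
    """Send the request and return the "id" field of the JSON response."""
    request = build_retraining_job_request(
        retrain_api_url, access_token, dataset, model_name
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["id"]
```

Keeping the request construction separate from the actual network call makes it easy to inspect the URL and body before anything is sent.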

 

By executing the corresponding GET method we can retrieve the details and the status of all "jobs":



or only our new job:



We get something like this response:
{
  "processedTime": "2018-02-15T08:54:31.541131",
  "status": {
    "startTime": null,
    "submissionTime": null,
    "id": "flowers-2018-02-15t0851z",
    "finishTime": null,
    "status": "Pending/Scheduled"
  }
}

{
  "processedTime": "2018-02-15T08:57:34.844304",
  "status": {
    "startTime": "2018-02-15T08:57:33Z",
    "submissionTime": "2018-02-15T08:55:36Z",
    "id": "flowers-2018-02-15t0851z",
    "finishTime": null,
    "status": "Running"
  }
}

And finally, as you can see, it took a while:
{
  "processedTime": "2018-02-15T09:03:15.181445",
  "status": {
    "submissionTime": "2018-02-15T08:55:36Z",
    "id": "flowers-2018-02-15t0851z",
    "startTime": "2018-02-15T08:57:33Z",
    "finishTime": "2018-02-15T09:02:32Z",
    "status": "Succeeded"
  }
}
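
Instead of refreshing the status by hand, the waiting can be put into a small polling loop. The sketch below is only an illustration: "wait_for_job" and the "fetch_status" callback are hypothetical names, and the callback is expected to perform the GET call for a single job and return the parsed JSON in the shape shown in the responses above. The loop only relies on the two in-progress status values seen in those responses and treats everything else as final.

```python
import time

# Status values from the responses above that mean "not finished yet".
IN_PROGRESS = {"Pending/Scheduled", "Running"}


def wait_for_job(fetch_status, job_id, interval_seconds=30, max_polls=60,
                 sleep=time.sleep):
    """Poll until the job leaves the in-progress states and return its status."""
    for _ in range(max_polls):
        response = fetch_status(job_id)        # caller-supplied GET for one job
        status = response["status"]["status"]  # nested as in the JSON above
        if status not in IN_PROGRESS:
            return status
        sleep(interval_seconds)
    raise TimeoutError(f"Job {job_id} still in progress after {max_polls} polls")
```

With the timestamps above, the job needed roughly seven minutes from submission to "Succeeded", so a polling interval of 30 seconds seems like a reasonable starting point.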

 

Let's check the logs


Before we start with the final deployment, we take a short look at our AWS S3 filesystem.

And there we can now see some additional folders:
>mc.exe ls saps3/data/
[2018-02-15 10:04:34 CET] 0B flowers-2018-02-15t0851z\
[2018-02-15 10:04:34 CET] 0B flowers\
[2018-02-15 10:04:34 CET] 0B jobs\

Now we display the content of our "job id" folder:
mc.exe ls -r saps3/data/flowers-2018-02-15t0851z
[2018-02-15 10:02:31 CET] 12KiB retraining.log

Furthermore, if we take a deeper look at the log file, we get the information about the retraining:
mc.exe cat saps3/data/flowers-2018-02-15t0851z\retraining.log

Scanning dataset flowers ...
Dataset used: flowers
Dataset has labels: ['roses', 'sunflowers', 'tulips']
2228 images are used for training
180 images are used for validation
200 images are used for test
********** Summary for epoch: 0 **********
2018-02-15 09:00:08: Step 0: Train accuracy = 87.5%%
2018-02-15 09:00:08: Step 0: Cross entropy = 0.451392
2018-02-15 09:00:09: Step 0: Validation accuracy = 86.1%% (N=180)
2018-02-15 09:00:09: Step 0: Validation cross entropy = 0.437444
Saving intermediate result.
********** Summary for epoch: 1 **********
2018-02-15 09:00:13: Step 1: Train accuracy = 93.8%%
2018-02-15 09:00:13: Step 1: Cross entropy = 0.291782
2018-02-15 09:00:13: Step 1: Validation accuracy = 92.2%% (N=180)
2018-02-15 09:00:13: Step 1: Validation cross entropy = 0.320360
Saving intermediate result.
.....
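
If you want to follow the accuracy over the epochs without scrolling through the whole file, the log lines above can be parsed with a small script. This is a sketch under the assumption that the validation lines always look like the ones in the excerpt; the function names are hypothetical.

```python
import re

# Matches lines such as:
# 2018-02-15 09:00:09: Step 0: Validation accuracy = 86.1%% (N=180)
VALIDATION_LINE = re.compile(r"Step (\d+): Validation accuracy = ([\d.]+)%")


def validation_accuracy_by_step(log_text):
    """Return {step: validation accuracy in percent} parsed from the log text."""
    return {
        int(step): float(value)
        for step, value in VALIDATION_LINE.findall(log_text)
    }


def best_step(accuracies):
    """Return the step with the highest validation accuracy (first one on ties)."""
    return max(accuracies, key=accuracies.get)
```

The log text itself can be taken from the output of the "mc.exe cat" command shown above, for example by redirecting it into a local file first.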


At the end of this file we get the "Summary" of our training:
##########################################
########### Retraining Summary ###########
##########################################
Job id: flowers-2018-02-15t0851z
Training batch size : 64
Learning rate : 0.001000
Total retraining epochs : 100
Retraining is stopped after 10 consecutive epochs which show no improvement in accurracy.
Epoch with best accuracy : 27
Best validation accuracy : 1.000000
Final test accuracy is : 0.985000
The exported model will predict top 3 classifications
Retraining started at: 2018-02-15 08:57:34
Retraining ended at: 2018-02-15 09:01:59
Restoring parameters from /home/model/interval-model-27
No assets to save.
No assets to write.
SavedModel written to: /home/model/tfs/saved_model.pb
TF Serving model saved.
Retraining lasted: 0:04:25.357850
Model is uploaded to repository with name flowers-demo and version 3.

 

A short explanation of the "Epoch" and "Batch Size" terminology can be found here: link
Epoch: One epoch is when the entire dataset is passed forward and backward through the neural network exactly once.

Batch Size: The total number of training examples present in a single batch.
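
To connect these two terms with the numbers from our log (2228 training images, batch size 64), here is a small calculation. It assumes that every training image is used once per epoch and that the last, smaller batch counts as its own iteration; how the service batches internally is not documented here, so take it as an illustration only.

```python
import math


def iterations_per_epoch(num_examples, batch_size):
    """Number of batches needed to pass all examples through the network once."""
    return math.ceil(num_examples / batch_size)


# 34 full batches of 64 images plus one final batch with the remaining 52 images.
print(iterations_per_epoch(2228, 64))  # 35
```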

 

In the next blog we will continue the retraining by deploying the model and finally testing and executing our "new" model by adapting the standard "Image Classifier" API.

 

cheers,

fabian

 

Helpful Links






SAP Leonardo ML Foundation: https://help.sap.com/viewer/product/SAP_LEONARDO_MACHINE_LEARNING_FOUNDATION/1.0/en-US

Tensorflow flowers dataset: http://download.tensorflow.org/example_images/flower_photos.tgz

Minio Client: https://docs.minio.io/docs/minio-client-quickstart-guide

Tensorflow: https://www.tensorflow.org

Image net: http://image-net.org

Fatkun Batch Download Image: https://chrome.google.com/webstore/detail/fatkun-batch-download-ima/nnjjahlikiabnchcpehcpkdeckfgnohf...

Epoch vs Batch Size vs Iterations: https://towardsdatascience.com/epoch-vs-iterations-vs-batch-size-4dfb9c7ce9c9

 

 