SAP Leonardo ML Foundation – Bring Your Own Model (BYOM)
To continue, here is the entire blog series about SAP Leonardo ML Foundation:
1 | Getting started with SAP Leonardo ML Foundation on SAP Cloud Platform Cloud Foundry |
2 | SAP Leonardo ML Foundation – Retraining part1 |
3 | SAP Leonardo ML Foundation – Retraining part2 |
4 | SAP Leonardo ML Foundation – Bring your own model (this blog) |
In this blog we will now cover one of the parts most interesting to me:
how to bring your own model up and running with SAP Leonardo ML Foundation.
In detail we will execute the following:
- Add a TensorFlow model to SAP Leonardo ML Foundation
- Deploy the model
- Create the Python app and run it
- a) locally,
- b) on Cloud Foundry
Prerequisites
As documented here, we need:
- The Cloud Foundry CLI (link)
- Important: The Postman “native” app (not the chrome plugin)
- Python, version 2 >=2.7.9 or Python 3 >=3.4
Additionally, I am using Docker to run my “local” app in a container.
Download the TensorFlow Model Archive
The first thing we need is a TensorFlow model. As described in the documentation, TensorFlow 1.3 is currently supported (EU10).
Furthermore, the model must be exported in the SavedModel format.
As in the official documentation, we use the Google Inception model here.
A suitable, working version can be downloaded here: https://s3-ap-southeast-1.amazonaws.com/i305731/teched/inception.zip
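If you want to export a model of your own instead, a minimal sketch using the TensorFlow 1.x SavedModel builder could look like this (the tiny graph is only a stand-in for a real model):
import tensorflow as tf

# Trivial stand-in graph; replace with your own model
x = tf.placeholder(tf.float32, shape=[None, 3], name='images')
y = tf.identity(x, name='classes')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    builder = tf.saved_model.builder.SavedModelBuilder('./export/1')
    # Register a default serving signature so TF Serving can find the model's entry point
    signature = tf.saved_model.signature_def_utils.predict_signature_def(
        inputs={'images': x}, outputs={'classes': y})
    builder.add_meta_graph_and_variables(
        sess, [tf.saved_model.tag_constants.SERVING],
        signature_def_map={
            tf.saved_model.signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY: signature})
    builder.save()
The exported signatures can be checked with TensorFlow’s saved_model_cli:
> saved_model_cli show --dir ./export/1 --all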
Add the model to the repository
Execute a new request with the Postman native application.
HTTP Method | POST |
URL | <MODEL_REPO_URL> |
Path | api/v1/models/sap_community_example/versions |
HEADER | Authorization (OAuth2 Access Token) |
Body (form-data):
file | inception.zip |
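For reference, the same upload as a curl call (assuming the OAuth2 access token is stored in $TOKEN and the repository URL in $MODEL_REPO_URL):
> curl -X POST "$MODEL_REPO_URL/api/v1/models/sap_community_example/versions" -H "Authorization: Bearer $TOKEN" -F "file=@inception.zip"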
Response:
{
  "id": "<id>",
  "uploadDate": "16 Mar 2018 08:36:36 UTC",
  "modelStatus": "INACTIVE",
  "fileName": "inception.zip",
  "metaData": null,
  "trainingInfo": null,
  "version": "5",
  "checkSum": null,
  "namespace": "<namespace>",
  "modelName": "sap_community_example"
}
Deploy your model
Next we deploy the model.
HTTP Method | POST |
URL | <MODEL_REPO_URL> |
Path | api/v1/deployments |
HEADER | Authorization (OAuth2 Access Token) |
Body (JSON):
{"modelName": "sap_community_example"}
Check the deployment
Afterwards (after a few minutes), we can check the deployment and retrieve the required information.
HTTP Method | GET |
URL | <MODEL_REPO_URL> |
Path | api/v1/deployments |
HEADER | Authorization (OAuth2 Access Token) |
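Or via curl:
> curl -H "Authorization: Bearer $TOKEN" "$MODEL_REPO_URL/api/v1/deployments"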
Response (JSON):
{
  "content": [
    {
      "id": "da776432-e468-4ce0-822d-d21dcc0ab559",
      "namespace": "8dd19b31-7dc5-4620-be78-8d5ebbfa5d61",
      "modelName": "sap_community_example",
      "modelVersion": "2",
      "placeholderName": "sap_community_example",
      "resourcePlan": "model-container-cpu-tfs",
      "deploymentStatus": {
        "state": "SUCCEEDED",
        "description": "Service [tfs-c74e4047-ef93-4b93-9b0e-2aa5a923d23e] is ready."
      },
      "modelContainer": {
        "host": "<host>.eu-central-1.aws.ml.hana.ondemand.com",
        "port": "80"
      }
    }
  ],
  "caCrt": "<certificate>",
  "last": true,
  "totalElements": 3,
  "totalPages": 1,
  "sort": null,
  "first": true,
  "numberOfElements": 3,
  "size": 10,
  "number": 0
}
From the response, we now need the following data:
- host
- port
- caCrt
Create the application and execute the API
A complete guide on how to create the application is already documented in the SAP Help.
Additionally, you can find it in my Git repository.
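To give an idea of what such an app does, here is a minimal sketch. It is based on the old gRPC beta API of the tensorflow-serving-api package (the stub signature changed in later releases), and reading the OAuth2 token from an environment variable is a simplification:
import os
from flask import Flask, request
from grpc.beta import implementations
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2

app = Flask(__name__)

# Connection details of the deployed model, taken from the deployment response
MODEL_NAME = os.environ.get('MODEL_NAME', 'sap_community_example')
HOST = os.environ['MODEL_SERVER_HOST']
PORT = int(os.environ.get('MODEL_SERVER_PORT', '80'))
ROOT_CERT = os.environ['ROOT_CERT']        # the caCrt from the deployment response
ACCESS_TOKEN = os.environ['ACCESS_TOKEN']  # OAuth2 token; a simplification for this sketch

def metadata_transformer(metadata):
    # Attach the OAuth2 token to every gRPC call
    return tuple(metadata) + (('authorization', 'Bearer ' + ACCESS_TOKEN),)

@app.route('/', methods=['POST'])
def predict():
    image = request.files['file'].read()
    # Secure channel to the TF Serving container using the caCrt certificate
    creds = implementations.ssl_channel_credentials(root_certificates=ROOT_CERT.encode())
    channel = implementations.secure_channel(HOST, PORT, creds)
    stub = prediction_service_pb2.beta_create_PredictionService_stub(
        channel, metadata_transformer=metadata_transformer)
    req = predict_pb2.PredictRequest()
    req.model_spec.name = MODEL_NAME
    req.model_spec.signature_name = 'predict_images'  # signature name used by the Inception example
    req.inputs['images'].CopyFrom(tf.contrib.util.make_tensor_proto(image, shape=[1]))
    return str(stub.Predict(req, 30.0))  # 30 second timeout

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)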
After we have built our app, we can run it by entering:
> docker run -p 5000:5000 sap_ml
* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
To verify that the API works, we can now create a new Postman request:
HTTP Method | POST |
URL | http://localhost:5000 |
Body (form-data):
file | <e.g. an image of a cat or dog> |
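Or, as a quick check via curl (cat.jpg is just a placeholder file name):
> curl -F "file=@cat.jpg" http://localhost:5000/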
Response:
It looks like it works…..
Let’s now provide a “real” API endpoint on SCP Cloud Foundry.
This API can then be used by other applications, for example.
Push your application to SAP CP Cloud Foundry
First of all, we need to create the following structure (git link):
In contrast to the local API before, we now have a new file:
Procfile:
web: python app.py
And the deployment descriptor, where we need to define the application details:
manifest.yml:
applications:
- name: <your app name>
  host: <your app host>
  memory: 512M
  timeout: 30
  buildpack: python_buildpack
  env:
    MODEL_NAME: <model name>
    MODEL_SERVER_HOST: <deployed model host>
    MODEL_SERVER_PORT: 80
    ROOT_CERT: '<caCert>'
Afterwards, we can push this app to CF:
> cf push -f manifest.yml
Looks fine after a minute or two:
And if we now recheck the app in the SCP CF cockpit, we see a successfully started application:
The last step for today is to execute this API, and hopefully we get the same result:
Perfect…..it works ;o)
Conclusion
I think and hope you can see that the SAP Leonardo ML Foundation BYOM functionality can be really easy to use.
For me it was sometimes hard, because the documentation was not perfect in the beginning.
In the meantime it has been improved (step by step), and hopefully this will continue, so that there is a central point of information for this kind of new fancy stuff ;o)
Comments and questions are welcome.
cheers,
fabian
Thanks for the detailed steps.
I think a small change is required when adding the model to the repository:
PATH is
/api/v2/models/{modelName}/versions
To be more precise, you can open the MODEL_REPO_URL, which takes you to a Swagger UI providing the Model Versions Controller, where you can directly create a model version by providing the zip content (comprising the .pb file) and an access token.
Hi,
My POST URL: /api/v2/models/zwzmodel/versions
And the response is:
<html><body><h1>Service Unavailable</h1>No server is available to handle request for this tenant , or the application is temporarily down for maintenance. Excuse us for the inconvenience.</body></html>
Is the API always available?
Hi,
When I execute cf push -f manifest.yml,
I am getting the following error:
FAILED
Server error, status code: 400, error code: 310006, message: You have exceeded the total routes for your organization's quota.
I have a GLOBAL account, and when I check my org details I see that the quota variable is assigned the value SUBSCRITION_QUOTA.
Can you help me overcome this error during cf push?
Hi,
The process worked for me until step 14 of the BYOM guide.
When deploying, I am getting this error. Can you please advise on the process to onboard the tenant?
Thanks
Sudarshan
Hello,
firstly I would like to thank you for your great work in this blog.
I did the described steps myself and some errors occurred. I am able to upload my model (and also to update it). But when I try to deploy my model, an error occurs:
My POST request: https://<myProductiveEnvironment>.cfapps.eu10.hana.ondemand.com/api/v2/deployments/
Body:
{"modelName": "myModel"}
Does anyone have a solution to my problem?
Hi René,
I just started following this blog post and stumbled over the same issue as you. Unfortunately, I see no answers to this one yet.
By chance, did you find any solution so far?
Thanks & best regards
Matthias
I am facing the same issue and getting the message below in return.
Can you please provide any input on this issue?
Hi all,
I’ve just tried out the v2 version of BYOM; you can find a short how-to here.
Furthermore, I will update this blog ASAP.
best,
fabian
Hi Fabian, thanks for all the posts, they have helped me a lot. I have a problem with the OCR service.
According to Leonardo’s documentation, it is not possible to retrain the OCR service. Can you recommend an (OCR) model to upload? Leonardo’s current service is not very effective.
Thanks!
Hi! For me everything works perfectly, but my Python app does not:
TypeError: beta_create_PredictionService_stub() got an unexpected keyword argument 'metadata_transformer'
Does someone have the same error? What could be the cause?
When I remove the metadata_transformer from the signature, the call is OK, but it fails because of authorization.
Hi Sandro,
I also faced this issue; it’s because they have changed the signature of the beta_create_PredictionService_stub() function.
In the latest release of the tensorflow-serving-api package, it takes only one argument, i.e. channel.
I’m trying to figure out how to create a secure, authorized channel with the latest version.
For the time being, you can install an older version:
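For example (pinning the package version; 1.6.0 is only taken from the follow-up comment below, so treat the exact version as an assumption):
> pip install tensorflow-serving-api==1.6.0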
Hi!
It would be great to figure this out! I changed my API version to 1.6.0 and am getting this error:
AbortionError: AbortionError(code=StatusCode.FAILED_PRECONDITION, details="Default serving signature key not found.")
Did you have the same error?
Best regards
Sandro
I've done something like this:
Hi Fabian Lehmann,
I am getting the below error while deploying my model:
The Postman request I am using is below:
What change do I need to make to get the correct output?
Does anyone have a solution to my problem?
Hi,
my deployment status is showing FAILED when I follow your steps.
I am also getting the same error.
Can you help me with how I can use the retrained model in the SAP Web UI?
Hi
thanks for this blog post, it helped me a lot!
I developed my own simple model, which works locally without any issues. I deployed it successfully to SAP Leonardo, and it seems I can connect to it from my local Python app with the updated prediction stub signature (TF Serving 1.12).
My problem now is that I always get an error about the binary versions from the server:
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with:
status = StatusCode.INTERNAL
details = "invalid header field value "NodeDef mentions attr 'T' not in Op<name=Where; signature=input:bool -> index:int64>; NodeDef: linear/linear_model/x/to_sparse_input/indices = Where[T=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](linear/linear_model/x/to_sparse_input/NotEqual). (Check whether your GraphDef-interpreting binary is up to date with your GraphDef-generating binary.).\n\t [[Node: linear/linear_model/x/to_sparse_input/indices = Where[T=DT_BOOL, _output_shapes=[[?,2]], _device=\"/job:localhost/replica:0/task:0/cpu:0\"](linear/linear_model/x/to_sparse_input/NotEqual)]]""
All the information I found suggests that it is a TensorFlow versioning issue?
I am using TensorFlow 1.8 (I also tested 1.7 and the newest) with Python 3.6, as noted in the SAP Help. Do you have any idea what is really happening?
Thanks in advance!
Dorothee
I could not run it at localhost. Postman returns 500 :/
Hello,
I am also getting an error when uploading to CF. I guess it’s related to the tf_serving version. Is it possible to handle version management?
Hi, is bring-your-own-model restricted to image classifiers, or are all kinds of ML models supported for consumption?