
In my previous blog post about image retraining, a question came up about the data structure for text retraining. I answered it there, but I think it also makes sense to write a blog post about it. So I decided to write a small series with at least one more blog post.

For the text retraining I will use Twitter Sentiment Analysis data, which classifies sentences as positive or negative. SAP Leonardo trains the machine learning model, which can then be deployed and used by the text classification service. Here is an example:

Most of the procedure is the same as for the image retraining, so I will refer to that post and describe only the differences in more detail.

What do you need?

Step 1 – SAP Leonardo Machine Learning instance

Take a look at Step 1 of the image retraining post to create an SCP trial account and a service key:

{
  "clientid": "sb-42mn3-3z7p-96r-3c79-0x1pm0l!p216|klgco-lag-vitas!d66",
  …
  "clientsecret": "5ghsdYM/Z567N5LoQ7nrXBkZ0BV=",
  "serviceurls": {
    …
    "TEXT_LINEAR_RETRAIN_API_URL": "https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com/api/v2/text/retraining",
    …
    "TEXT_CLASSIFIER_URL": "https://mlftrial-text-classifier.cfapps.eu10.hana.ondemand.com/api/v2/text/classification",
    …
  },
  "url": "https://p2000894545trial.authentication.eu10.hana.ondemand.com"
}

Note: Instead of IMAGE_RETRAIN_API_URL and IMAGE_CLASSIFICATION_URL, the relevant entries here are TEXT_LINEAR_RETRAIN_API_URL and TEXT_CLASSIFIER_URL.

Don’t mess it up like I did. 😖
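If you prefer a script over Postman for fetching the Bearer token, here is a minimal sketch. It assumes the token endpoint is the service key's "url" plus /oauth/token and that the client-credentials grant is used, which is the usual setup for SCP service keys but is not spelled out in this post. The function name build_token_request is made up for this example, and the request is only built, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_token_request(service_key: dict) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request."""
    # Assumption: the token endpoint lives at <url>/oauth/token.
    token_url = service_key["url"].rstrip("/") + "/oauth/token"

    # clientid and clientsecret go into an HTTP Basic Authorization header.
    credentials = f'{service_key["clientid"]}:{service_key["clientsecret"]}'
    basic = base64.b64encode(credentials.encode("utf-8")).decode("ascii")

    # Assumption: the grant type is sent as a form-encoded body.
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode("ascii")

    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. With a standard OAuth 2.0 response, the token should then be in the "access_token" field of the JSON body.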

Step 2 – Storage

Same procedure as Step 2 of the image retraining.

If you already have a storage, just run the POST again to get the endpoint, accessKey and secretKey.

Step 3 – Training data

The following steps are necessary to create the training data:

The script creates a sentiment_100.zip file with the following structure (see also SAP Help – Uploading Data):

sentiment
├── test
│ ├── negative
│ └── positive
├── training
│ ├── negative
│ └── positive
└── validation
  ├── negative
  └── positive
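To make the folder layout more concrete, here is a rough sketch that writes labelled sentences into a zip with exactly this structure. It is not the script used for this post. The one-.txt-file-per-sentence format and the numbered file names are assumptions of mine, so please check SAP Help – Uploading Data for the expected file format. The function name write_dataset_zip is also made up.

```python
import zipfile
from pathlib import Path

SPLITS = ("training", "validation", "test")
LABELS = ("negative", "positive")


def write_dataset_zip(samples: dict, zip_path: Path, root: str = "sentiment") -> list:
    """Write samples[split][label] (lists of sentences) into a zip archive.

    Returns the archive member names in the order they were written.
    """
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for split in SPLITS:
            for label in LABELS:
                sentences = samples.get(split, {}).get(label, [])
                for index, sentence in enumerate(sentences):
                    # Assumption: one UTF-8 text file per sentence.
                    name = f"{root}/{split}/{label}/{index:06d}.txt"
                    archive.writestr(name, sentence)
                    names.append(name)
    return names
```

Called with a dictionary such as {"training": {"positive": [...], "negative": [...]}, ...} and Path("sentiment_100.zip"), it creates the sentiment/<split>/<label>/ folders implicitly through the member names.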

Step 4 – Upload

Take a look at Step 4 of the image retraining and run this to upload the sentiment data set:

mc cp sentiment_100.zip saps3/data/sentiment

Step 5 – Training

After uploading the training data, start the training with Postman.

Training

URL: TEXT_LINEAR_RETRAIN_API_URL/jobs
Doc: https://api.sap.com/api/text_linear_retrain_api/resource
POST
https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com/api/v2/text/retraining/jobs

Headers:
  Authorization: {{Bearer Token}}
  Content-Type:  application/json

Body:
  {
    "dataset": "sentiment",
    "modelName": "sentiment",
    "preprocessingLanguage": "en",
    "completionTime": 24,
    "memory": 8192
  }

Result:
  {
    "id": "sentiment-2018-12-01t2235z745432"
  }
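The same call can be prepared outside Postman. The sketch below builds the POST with the body shown above. It assumes the Authorization header has the form "Bearer <token>", and the function name build_training_request is only for illustration. Nothing is sent until the request is handed to urllib.request.urlopen.

```python
import json
import urllib.request


def build_training_request(retrain_api_url: str, bearer_token: str,
                           dataset: str, model_name: str) -> urllib.request.Request:
    """Build (but do not send) the POST that starts a text retraining job."""
    # Same fields and values as in the Postman body above.
    payload = {
        "dataset": dataset,
        "modelName": model_name,
        "preprocessingLanguage": "en",
        "completionTime": 24,
        "memory": 8192,
    }
    return urllib.request.Request(
        retrain_api_url.rstrip("/") + "/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the token is passed as "Bearer <token>".
            "Authorization": f"Bearer {bearer_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Here retrain_api_url is the TEXT_LINEAR_RETRAIN_API_URL value from the service key.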

Jobs

The job has finished successfully once its status is SUCCEEDED.

URL: TEXT_LINEAR_RETRAIN_API_URL/jobs
Doc: https://api.sap.com/api/text_linear_retrain_api/resource
GET
https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com/api/v2/text/retraining/jobs

Results:
  {
    "finishTime": "2018-12-01T23:31:02+00:00",
    "message": "",
    "startTime": "2018-12-01T22:35:53+00:00",
    "id": "sentiment-2018-12-01t2235z745432",
    "status": "SUCCEEDED",
    "submissionTime": "2018-12-01T22:35:51+00:00"
  }
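Since a training run can take a while, polling the status in a loop saves some clicking. This is only a sketch: fetch_job is a hypothetical callable that you supply and that returns one job object like the one above, and "FAILED" as a second terminal status is my assumption, because only SUCCEEDED appears in this post.

```python
import time
from typing import Callable

# "SUCCEEDED" is taken from the response above; "FAILED" is an assumed status name.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED"}


def wait_for_job(fetch_job: Callable[[], dict],
                 interval_seconds: float = 60.0,
                 max_polls: int = 120) -> dict:
    """Call fetch_job repeatedly until the job reports a terminal status."""
    for _ in range(max_polls):
        job = fetch_job()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        # Wait before asking again so the API is not flooded with requests.
        time.sleep(interval_seconds)
    raise TimeoutError("job did not reach a terminal status in time")
```

The callable keeps the HTTP details out of the loop, so you can plug in whatever client you already use for the GET request.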

This took nearly one hour, but I used over 300,000 sentiments (sentiment_5) for the training.

Logs

Whether the job fails or succeeds, you can download the job logs:

mc cp --recursive saps3/data/<JOB ID>/ logs

Example:

mc cp --recursive saps3/data/sentiment-2018-12-01t2235z745432/ logs

Step 6 – Deploy

The model must be deployed after a successful training.

Deploy Model

URL: TEXT_LINEAR_RETRAIN_API_URL/deployments
Doc: https://api.sap.com/api/text_linear_retrain_api/resource
POST
https://mlftrial-retrain-text-linear-api.cfapps.eu10.hana.ondemand.com/api/v2/text/retraining/deployments

Headers:
  Authorization: {{Bearer Token}}
  Content-Type:  application/json

Body:
  {
    "modelName": "sentiment",
    "modelVersion": "1"
  }

Result:
  {
    "id": "f6b34f68-6bf0-4fe8-98f5-9f9a4310a9b8"
  }

After some time, the model is available for text classification.

Step 7 – Test

For my first test I’ve used this tweet from Witalij Rudnicki:


https://twitter.com/Sygyzmundovych/status/1061608300490440704

Text Classification

URL: TEXT_CLASSIFIER_URL/models/{model}/versions/{version}
Doc: Inference Service for Customizable Text Classification
POST
https://mlftrial-text-classifier.cfapps.eu10.hana.ondemand.com/api/v2/text/classification/models/sentiment/versions/1

Headers:
  Authorization: {{Bearer Token}}
  Content-Type:  application/json

Body:
  texts=Starting sampling of a next batch of dark beers. This one has nice velvety taste, but way too sweet. 🍺 - Drinking a Świderskie by Cerkom @ Oporów —

Here is the result for this tweet, which is classified as about 88.5% positive:

{
  "id": "6288fb40-1671-4c09-7cec-0baa12950d82",
  "predictions": [
    {
      "results": [
        {
          "label": "positive",
          "score": 0.8846777437444767
        },
        {
          "label": "negative",
          "score": 0.11532225625552328
        }
      ]
    }
  ],
  "processedTime": "2018-12-02T14:24:14.242227+00:00",
  "status": "DONE"
}
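To pick the winning label out of such a response in code, something like the following works. It assumes the response always has the shape shown above, with one entry in "predictions" per submitted text. The helper name top_prediction is made up.

```python
def top_prediction(response: dict, text_index: int = 0) -> tuple:
    """Return (label, score) of the highest-scoring result for one text."""
    results = response["predictions"][text_index]["results"]
    # Do not rely on the order of the results; pick the maximum score explicitly.
    best = max(results, key=lambda item: item["score"])
    return best["label"], best["score"]
```

For the response above this returns ("positive", 0.8846777437444767), and formatting the score with f"{score:.1%}" gives 88.5%.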

I don’t want to end this blog post with a negative sentence, but you can find one in my Postman collection.

Have fun 🤪
