Automatic dataset translation on the SAP Conversational AI platform using SAP Translation Hub API
Have you ever wanted your chatbot to respond in several languages, only to realize that manually writing the same expressions in each language takes too long? Then this blog post is for you. It explains how to automatically translate your expressions from one language to another.
Prerequisites
- You have a working Python environment.
- You have Git installed on your computer.
- You have a bot on the SAP Conversational AI platform and you are the owner of this bot.
- You have an SAP Translation Hub account.
This blog post explains the steps to translate a whole dataset exported from the SAP Conversational AI platform and import it back with the correct annotations.
Get your SAP Translation Hub Credentials
First, you need an SAP Translation Hub account to carry out the translation.
If you don’t already have one, you can create it from the SAP BTP cockpit.
In your Cloud Foundry environment, open the space where you want to create your Document Translation instance.
Then click Services > Instances and create your instance: select the Document Translation service and choose an instance name.
In the instance you just created, generate a service key.
Then you can view and download your credentials.
Make sure you have the clientid and clientsecret values, which you need to use the SAP Translation Hub API.
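If you downloaded the service key as a JSON file, a small helper can pull the two values out for you. This is only a sketch: the exact layout of the service key is an assumption, so the helper simply searches the whole file for the keys clientid and clientsecret. The function names and the file name in the usage note are hypothetical.

```python
import json


def find_key(node, wanted):
    """Depth-first search for the first value stored under `wanted`."""
    if isinstance(node, dict):
        if wanted in node:
            return node[wanted]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, wanted)
        if found is not None:
            return found
    return None


def read_credentials(service_key_path):
    """Return (clientid, clientsecret) from a downloaded service key file.

    Assumption: the file is JSON and contains both keys somewhere,
    possibly nested inside another object.
    """
    with open(service_key_path, encoding="utf-8") as handle:
        service_key = json.load(handle)
    client_id = find_key(service_key, "clientid")
    client_secret = find_key(service_key, "clientsecret")
    if client_id is None or client_secret is None:
        raise ValueError("clientid or clientsecret not found in the service key")
    return client_id, client_secret
```

For example, read_credentials("service-key.json") would return the pair of values to pass later as CLIENT_ID and CLIENT_SECRET.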
Get your SAP Conversational AI bot’s information
Next, you need to collect your bot’s information: the user slug, the bot slug, the version slug, the developer token, the bot’s ID and the bot’s secret.
Go to your bot page, right-click anywhere and choose Inspect.
In the Network tab, click the dataset request.
You’ll see a request URL that contains the information related to the bot.
In this example we can identify:
- The user slug is after users/: annielim
- The bot slug is after bots/: music-bot
- The version slug is after versions/: 9dee—-784
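If you prefer not to read the slugs off the URL by eye, the same rule (take the path segment right after users/, bots/ and versions/) can be sketched in Python. The example URL below is made up for illustration, including its host and the version value v1; paste your own request URL instead.

```python
import re


def slug_after(request_url, marker):
    """Return the path segment that directly follows `marker/` in the URL."""
    pattern = r"(?:^|/)" + re.escape(marker) + r"/([^/?#]+)"
    match = re.search(pattern, request_url)
    if match is None:
        raise ValueError(f"'{marker}/' not found in {request_url!r}")
    return match.group(1)


def extract_slugs(request_url):
    """Return (user slug, bot slug, version slug) found in the request URL."""
    return tuple(slug_after(request_url, m) for m in ("users", "bots", "versions"))


# Hypothetical URL: only the users/, bots/ and versions/ markers matter.
example_url = "https://example.invalid/users/annielim/bots/music-bot/versions/v1/dataset"
user_slug, bot_slug, version_slug = extract_slugs(example_url)
```

Each marker is looked up independently, so the sketch does not depend on what else appears in the URL between or after the three segments.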
To get your developer token, your bot’s ID and your bot’s secret, go to your bot’s Settings > Tokens.
Click Generate under DesignTime APIs and copy the Client ID and Client Secret. These are your bot’s ID and your bot’s secret.
You can copy your developer token from the Bot tokens section.
At the end of this step, make sure you have the user slug, bot slug, version slug, developer token, client ID and client secret corresponding to your bot.
Export the dataset
Next, export the dataset from the SAP Conversational AI platform.
On your bot’s page, click Export in the top right corner.
The export can take a while depending on the size of your bot. Click the number next to the Export button to see the export status.
Once it’s complete, you can download the dataset by clicking the download button. A zip file is downloaded to your local Downloads folder.
Unzip the file you just downloaded. Your bot dataset is stored in the train folder.
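The unzip step can also be scripted. The sketch below extracts the archive and lists the JSON files found under any folder named train, so that you can copy the path you need. The function name is hypothetical, and the internal layout of the export (apart from the train folder mentioned above) is an assumption.

```python
import zipfile
from pathlib import Path


def extract_train_files(zip_path, destination):
    """Extract the exported zip and return the JSON files under a `train` folder."""
    destination = Path(destination)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(destination)
    # Keep only JSON files that have a directory named "train" in their path.
    return sorted(
        path
        for path in destination.rglob("*.json")
        if "train" in path.relative_to(destination).parts[:-1]
    )
```

Calling extract_train_files with the downloaded zip and a target folder returns a list of paths; the one that holds your dataset is the value to pass as PATH to the translation script.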
You can either move this file to another directory or keep it where it is, but make sure you know its path, because you pass it to the translation script in the next steps.
Translate the dataset and import it to the platform
- Clone the repository
To translate your dataset and import it to the SAP Conversational AI platform, open the Command Line Interface (CLI) and clone the repository with the following command.
git clone https://github.com/SAP-samples/conversational-ai-rest-api-example.git
- Install the package
You can create a virtual environment at your current path:
python3 -m venv venv
Activate your virtual environment (on Linux or macOS; on Windows, run venv\Scripts\activate instead):
source venv/bin/activate
Install the necessary Python libraries:
pip install -r requirements.txt
- Translation with SAP Translation Hub
Enter the following command and replace the arguments with the information related to your bot:
python3 ./bin/translate.py -p PATH -s SOURCE_LANG -t TARGET_LANG -user USER_SLUG -bot BOT_SLUG -version VERSION_SLUG -devtoken DEV_TOKEN -botid BOT_ID -botsecret BOT_SECRET -id CLIENT_ID -secret CLIENT_SECRET
Replace PATH with the path of the JSON dataset.
Replace SOURCE_LANG with the source language of the original dataset.
Replace TARGET_LANG with the target language of the translations.
Replace USER_SLUG with the user slug of the bot that you collected at the beginning of the tutorial.
Replace BOT_SLUG with the bot slug that you collected.
Replace VERSION_SLUG with the version slug of the bot that you collected earlier.
Replace DEV_TOKEN with the developer token that you collected previously.
Replace BOT_ID with the Client ID of your SAP Conversational AI bot.
Replace BOT_SECRET with the Client Secret of your SAP Conversational AI bot.
Replace CLIENT_ID with the clientid of your SAP Translation Hub service key.
Replace CLIENT_SECRET with the clientsecret of your SAP Translation Hub service key.
The -id and -secret arguments are mandatory for the translation with SAP Translation Hub. To find out which languages SAP Translation Hub supports, see the documentation.
To download the translated dataset, add the argument --save to the command.
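Typing eleven arguments by hand is error-prone, and secrets typed on the command line end up in your shell history. One possible approach, sketched below, is to keep the bot and Translation Hub values in environment variables and assemble the command in Python. The environment variable names and the function name are hypothetical; only the script path and the flags come from the command shown above.

```python
import os

# Flag of translate.py -> environment variable holding its value.
# The variable names are illustrative assumptions, not part of the repository.
ENV_FLAGS = [
    ("-user", "CAI_USER_SLUG"),
    ("-bot", "CAI_BOT_SLUG"),
    ("-version", "CAI_VERSION_SLUG"),
    ("-devtoken", "CAI_DEV_TOKEN"),
    ("-botid", "CAI_BOT_ID"),
    ("-botsecret", "CAI_BOT_SECRET"),
    ("-id", "STH_CLIENT_ID"),
    ("-secret", "STH_CLIENT_SECRET"),
]


def build_translate_command(path, source_lang, target_lang, env=None):
    """Return the translate.py call as an argument list for subprocess.run."""
    env = os.environ if env is None else env
    missing = [name for _, name in ENV_FLAGS if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    command = ["python3", "./bin/translate.py",
               "-p", path, "-s", source_lang, "-t", target_lang]
    for flag, name in ENV_FLAGS:
        command += [flag, env[name]]
    return command
```

From the repository root, you could then run subprocess.run(build_translate_command(PATH, SOURCE_LANG, TARGET_LANG), check=True) after importing subprocess. Passing an argument list instead of a single string avoids shell quoting problems if a path contains spaces.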
- End of the processing
The script first translates the synonyms and then the expressions. The translated dataset is imported to the SAP Conversational AI platform once the translation is completed.
Once the loading bar indicates completion, the translated expressions and synonyms are available in the target language on the platform. The expressions are annotated with gold, free and restricted entities. However, you may need to manually check the tagging of free entities.
Well done, you have successfully imported your translated dataset to the platform!
Thank you for reading. I hope this blog post helped you in automatically translating your chatbot’s expressions into a different language using the SAP Translation Hub API.
If you found this blog post useful, please share your feedback and comment about this use case.