How to use (Inline) Tesseract OCR with SAP Intelligent RPA
A few months ago, the SAP Intelligent RPA developer team presented the great possibility of using Tesseract OCR for surface analysis in the context of automation. I discussed this in detail in a post on LinkedIn.
Then I asked myself whether this approach could offer more possibilities. After all, it is a fully functional local OCR solution. I analyzed the approach and found a simple, fantastic entry point, which I would like to introduce here.
All necessary libraries and data files are delivered with the Desktop Agent. I wrote about the data files on LinkedIn. Here we focus on the library TessOCRWrapper.dll, a dotNET wrapper library for Tesseract OCR. I wrote about the possibility of using dotNET languages via dotNETRunner with SAP Intelligent RPA here. In this way we can use the capabilities of the wrapper library in the context of SAP Intelligent RPA.
Here is my global.js code. The PowerShell code is embedded as a hereString. Let’s take a closer look at the TesseractOCR function. First we load the library. Then we set the language, which can be passed as a parameter, and initialize the library. Next we set the image, also passed as a parameter, and let Tesseract determine the text via OCR. That is all.
// *** Choose language (en|fr|de) ***
GLOBAL.labels.setLanguage(e.language.English);
// Global Systray object
var systray = ctx.systray();
//-Function hereString--------------------------------------------------
function hereString(f) {
  return f.toString().
    replace(/^[^\/]+\/\*!?/, '').
    replace(/\*\/[^\/]+$/, '');
}
//-Function TesseractOCR------------------------------------------------
function TesseractOCR(BmpFile, Language) {
  var dotNETRunner = ctx.activeX.create("dotNET.Runner");
  var PSCode = hereString(function() {/*!
    param(
      [parameter(Mandatory=$true)][String]$BmpFile,
      [String]$Language = "en"
    )
    [String]$IRPADir = "$(${env:ProgramFiles(x86)})\SAP\Intelligent RPA\Desktop Agent";
    [String]$File = $IRPADir + "\TessOCRWrapper.dll";
    # Load the dotNET wrapper library delivered with the Desktop Agent
    Add-Type -Path $File;
    [String]$Ret = $null;
    # Map the language code to the enumeration of the wrapper
    switch($Language) {
      "en" {
        $TessLang = [TessOCRWrapper.TessAPI+Language]::En;
      }
      "de" {
        $TessLang = [TessOCRWrapper.TessAPI+Language]::Ger;
      }
      "fr" {
        $TessLang = [TessOCRWrapper.TessAPI+Language]::Fre;
      }
      default {
        # Unknown code: fall back to English instead of passing an unset value
        $TessLang = [TessOCRWrapper.TessAPI+Language]::En;
      }
    }
    try {
      [TessOCRWrapper.TessAPI]$Tess = [TessOCRWrapper.TessAPI]::new();
      $Tess.SetBaseModelDirectory($IRPADir);
      $Tess.Init(
        $TessLang,
        [TessOCRWrapper.TessAPI+Tradeoff]::Accurate,
        [TessOCRWrapper.TessAPI+PageSegMode]::PSM_SPARSE_TEXT,
        [TessOCRWrapper.TessAPI+CharsetFilter]::All
      );
      [System.Drawing.Bitmap]$BMP = New-Object System.Drawing.Bitmap($BmpFile);
      $Tess.SetImage($BMP);
      $Ret = $Tess.GetText();
    } catch {
      $Ret = $Error[0].Exception.Message + " at " + $Error[0].InvocationInfo.Line;
    } finally {
      # $Tess is not set if the constructor failed
      if ($Tess) { $Tess.End(); }
      # Replace line breaks with spaces, then collapse double spaces
      $Ret = $Ret.Replace("`n", " ");
      $Ret = $Ret.Replace("  ", " ");
      $Ret | Out-String;
    }
  */});
  var Ret = dotNETRunner.runPS_str(PSCode, "BmpFile = " + BmpFile + ", Language = " + Language);
  return Ret;
}
/** main process start handler */
GLOBAL.events.START.on(function (ev) {
  // *** Create Systray ***
  systray.createSystrayMenu(ctx.options.projectName, 'ICON1');
  systray.addMenu('', 'TesseractOCR', GLOBAL.labels.menu.main, function(ev) {
    var OCRText = TesseractOCR("C:\\Dummy\\TessOCR2.png", "en");
    ctx.log(OCRText);
  });
});
/** main process stop handler */
GLOBAL.events.QUIT.on(function(ev) {
  // add code here
});
/** Auto-update menu handler */
GLOBAL.events.UPDATECTX.on(function(ev) {
  ctx.shutdownAgent(true, true, (ctx.options.restartConfirmation ? GLOBAL.labels.updatePopup.label : null), GLOBAL.labels.updatePopup.title);
});
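To see what the hereString helper does on its own, here is a minimal sketch that can be run outside the Desktop Agent, for example with Node.js. The two-line PowerShell snippet is only a placeholder for illustration and not part of the solution above. The helper converts the function to its source text, strips everything up to and including the opening /*! and everything from the closing */ onward, so only the comment body remains.

```javascript
// hereString as defined in global.js: returns the text of the
// comment that forms the body of the function passed in.
function hereString(f) {
  return f.toString().
    replace(/^[^\/]+\/\*!?/, '').
    replace(/\*\/[^\/]+$/, '');
}

// Placeholder snippet (assumption, for illustration only)
var snippet = hereString(function() {/*!
Write-Output "Hello"
Write-Output "World"
*/});

// Trim the surrounding line breaks and split into single lines
var lines = snippet.replace(/^\s+|\s+$/g, "").split(/\r?\n/);

for (var i = 0; i < lines.length; i++) {
  console.log((i + 1) + ": " + lines[i]);
}
```

The trick relies on the script engine returning the comment as part of the function's source text. It also means that the embedded code must not contain the sequence */ itself, because that would end the comment early.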
Let’s see how it works.
Great
In this way we can also use the existing OCR functionality of the Desktop Agent for our own automation approaches. With one line of code…
var OCRText = TesseractOCR("C:\\Dummy\\TessOCR2.png", "en");
…we get the text of an image.
We can now use this in loops to process many images via OCR, all with this local solution. This avoids problems with the European General Data Protection Regulation, because no data is transferred.
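Such a loop could look like the following minimal sketch. The name ocrImages, the layout of the result objects and the file names are assumptions for illustration. The OCR function is passed in as a parameter: inside the agent this would be TesseractOCR from the listing above, while the stand-in function fakeOcr only exists so that the sketch also runs elsewhere, for example with Node.js.

```javascript
// Runs an OCR function over a list of image files and collects the results.
// ocrFn is expected to have the signature of TesseractOCR(BmpFile, Language).
function ocrImages(files, language, ocrFn) {
  var results = [];
  for (var i = 0; i < files.length; i++) {
    var entry = { file: files[i], text: null, error: null };
    try {
      entry.text = ocrFn(files[i], language);
    } catch (ex) {
      // Remember the error and continue with the next image
      entry.error = String(ex && ex.message ? ex.message : ex);
    }
    results.push(entry);
  }
  return results;
}

// Stand-in for TesseractOCR (assumption, only to make the sketch runnable)
function fakeOcr(file, language) {
  return "text of " + file + " (" + language + ")";
}

// Hypothetical file names
var results = ocrImages(["C:\\Dummy\\A.png", "C:\\Dummy\\B.png"], "en", fakeOcr);
for (var i = 0; i < results.length; i++) {
  console.log(results[i].file + " -> " + results[i].text);
}
```

In global.js the call would then pass TesseractOCR instead of fakeOcr, and ctx.log could take the place of console.log.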
Enjoy Tesseract OCR for these use cases with SAP Intelligent RPA.
Powerful!! Thanks Stefan for your sharing.
Hi Stefan,
I followed the blog https://blogs.sap.com/2020/02/23/how-to-build-custom-ocr-in-sap-rpa/
and implemented OCR using Python.
With either Python or dotNET, we need external installations on the end user side.
Can you please let me know whether the Desktop Agent has anything that can invoke a method to use OCR without any additional installations? (TessOCRWrapper.dll?)
Thanks And Regards,
Siva Rama Krishna
Siva rama Krishna Pabbraju
Hello Siva Rama Krishna,
the execution platform of SAP Intelligent RPA is the Windows Script Host with the JScript language. As far as I know, it is not possible to use dotNET libraries with WSH in this way without any additional installations.
Best regards
Stefan