Let's build an OCR (optical character recognition) app for Android with Cordova and Tesseract. With this approach, any SAPUI5 app can be extended with OCR functionality.
What is Tesseract?
According to its site, Tesseract is probably the most accurate open source OCR engine available. It can read a wide variety of image formats and convert them to text in 60 languages. For a complete explanation, please visit https://code.google.com/p/tesseract-ocr/
What you need:
Build Tess-two
cd tess-two
ndk-build
android update project --path c:\programs\tess-two --target android-22
Create Cordova Project
cd OCR
cordova platform add android
Modify Java Source Files
// Copy the trained data from the app assets on first run only
if (!(new File(DATA_PATH + "tessdata/" + lang + ".traineddata")).exists()) {
    try {
        AssetManager assetManager = getAssets();
        InputStream in = assetManager.open("tessdata/" + lang + ".traineddata");
        OutputStream out = new FileOutputStream(DATA_PATH
                + "tessdata/" + lang + ".traineddata");
        byte[] buf = new byte[1024];
        int len;
        while ((len = in.read(buf)) > 0) {
            out.write(buf, 0, len);
        }
        in.close();
        out.close();
        Log.v(TAG, "Copied " + lang + " traineddata");
    } catch (IOException e) {
        Log.e(TAG, "Unable to copy " + lang + " traineddata " + e.toString());
    }
}
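The copy step above can also be written so that both streams are closed even when the copy fails halfway. The following is only a sketch in plain Java, not the code from the attached files: the class name TrainedDataCopier and the method copyIfMissing are hypothetical, and the input stream stands in for whatever assetManager.open(...) returns on Android.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration; not part of the original project.
final class TrainedDataCopier {
    private TrainedDataCopier() {}

    /**
     * Copies source to target unless target already exists.
     * The source stream is always closed.
     * Returns true if a copy was made, false if it was skipped.
     */
    static boolean copyIfMissing(InputStream source, File target) throws IOException {
        try (InputStream in = source) {
            if (target.exists()) {
                return false;
            }
            // Make sure the tessdata directory exists before writing into it.
            File parent = target.getParentFile();
            if (parent != null && !parent.isDirectory() && !parent.mkdirs()) {
                throw new IOException("Could not create directory " + parent);
            }
            try (OutputStream out = new FileOutputStream(target)) {
                byte[] buf = new byte[1024];
                int len;
                // read() returns -1 at end of stream.
                while ((len = in.read(buf)) != -1) {
                    out.write(buf, 0, len);
                }
            }
            return true;
        }
    }
}
```

In an Android activity this could be called as TrainedDataCopier.copyIfMissing(getAssets().open("tessdata/" + lang + ".traineddata"), new File(DATA_PATH + "tessdata/" + lang + ".traineddata")), assuming DATA_PATH and lang are defined as in the snippet above.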
// Downsample the photo to reduce memory use
BitmapFactory.Options options = new BitmapFactory.Options();
options.inSampleSize = 4;
Bitmap bitmap = BitmapFactory.decodeFile(_path, options);
// Read the orientation stored by the camera
ExifInterface exif = new ExifInterface(_path);
int exifOrientation = exif.getAttributeInt(
        ExifInterface.TAG_ORIENTATION,
        ExifInterface.ORIENTATION_NORMAL);
Log.v(TAG, "Orient: " + exifOrientation);
int rotate = 0;
switch (exifOrientation) {
    case ExifInterface.ORIENTATION_ROTATE_90:
        rotate = 90;
        break;
    case ExifInterface.ORIENTATION_ROTATE_180:
        rotate = 180;
        break;
    case ExifInterface.ORIENTATION_ROTATE_270:
        rotate = 270;
        break;
}
Log.v(TAG, "Rotation: " + rotate);
if (rotate != 0) {
    // Getting width & height of the given image.
    int w = bitmap.getWidth();
    int h = bitmap.getHeight();
    // Setting pre rotate
    Matrix mtx = new Matrix();
    mtx.preRotate(rotate);
    // Rotating Bitmap
    bitmap = Bitmap.createBitmap(bitmap, 0, 0, w, h, mtx, false);
}
// Convert to ARGB_8888, required by tess
bitmap = bitmap.copy(Bitmap.Config.ARGB_8888, true);
TessBaseAPI baseApi = new TessBaseAPI();
baseApi.setDebug(true);
baseApi.init(DATA_PATH, lang);
baseApi.setImage(bitmap);
String recognizedText = baseApi.getUTF8Text();
baseApi.end();
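The orientation handling above is the part that is easiest to get wrong, so it can help to isolate it into a small function that is testable without a device. This is a sketch only: the class ExifRotation and the method degreesFor are hypothetical names, and the constants are redeclared here with the same values as the corresponding android.media.ExifInterface constants so that the code compiles as plain Java. As in the snippet above, mirrored orientations are treated as needing no rotation.

```java
// Hypothetical helper for illustration; not part of the original project.
final class ExifRotation {
    // Same values as the android.media.ExifInterface constants of the same name.
    static final int ORIENTATION_NORMAL = 1;
    static final int ORIENTATION_ROTATE_180 = 3;
    static final int ORIENTATION_ROTATE_90 = 6;
    static final int ORIENTATION_ROTATE_270 = 8;

    private ExifRotation() {}

    /** Returns the clockwise rotation in degrees needed to display the image upright. */
    static int degreesFor(int exifOrientation) {
        switch (exifOrientation) {
            case ORIENTATION_ROTATE_90:
                return 90;
            case ORIENTATION_ROTATE_180:
                return 180;
            case ORIENTATION_ROTATE_270:
                return 270;
            default:
                // Normal, undefined, and mirrored orientations: no rotation.
                return 0;
        }
    }
}
```

On Android, the value returned by exif.getAttributeInt(ExifInterface.TAG_ORIENTATION, ExifInterface.ORIENTATION_NORMAL) could be passed straight into degreesFor, and the result used in mtx.preRotate(...).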
Calling Tesseract from index.html
Call the Tesseract plugin through callNativePlugin, passing the image URI as the parameter.
function callNativePlugin(imageURI) {
    var tesseractPlugin = cordova
        .require('com.tesseract.phonegap.tesseractPlugin.TesseractPlugin');
    tesseractPlugin.createEvent(imageURI, nativePluginResultHandler);
}
On success, the recognized text is passed to the nativePluginResultHandler callback, which shows it in an alert and prints it on the HTML page.
function nativePluginResultHandler(callback) {
    alert("Result: " + callback);
    var result = document.getElementById("result");
    result.innerHTML = callback;
}
I have attached the modified Java files and index.html.
Build Cordova Project
Run cordova build under C:\programs\OCR\platforms\android. If there are no errors, you will get the debug APK for testing.
Reference:
http://gaut.am/making-an-ocr-android-app-using-tesseract/