Document Extraction with SAP Process Automation – ‘Automated Template Detection’ feature
I’m writing this blog post with great pleasure and excitement. It is both an announcement and a how-to for a major feature release, ‘Automated Template Detection’, which is one of the features our customers have requested most often. It aims to improve document processing and to make bot creation easier and smarter. (It is, of course, only one of many great upcoming features!)
This feature was released in April 2022 and is available to all existing RPA customers and to SAP Process Automation customers as an integration feature with the Document Information Extraction service.
But before we start, a few words on SAP Process Automation for those who are not yet familiar with it –
ABOUT SAP PROCESS AUTOMATION
SAP Process Automation combines the capabilities of SAP Workflow Management & SAP Intelligent Robotic Process Automation into an intuitive no-code experience. This is a significant step towards simplifying process automation and enabling more people within the organization to participate in automating processes. Please have a look at the news article published by Bhagat Nainani.
Here is a short 3-minute video that gives a good insight into SAP Process Automation. I hope it piques your interest even further.
If you are new to SAP Process Automation –
- SAP Process Automation is available in SAP BTP under two commercial models: CPEA or Pay-As-You-Go (PAYG).
- To get started, please refer to PAYG.
- For a tutorial on Pay-As-You-Go account creation, please refer to the SAP Developer Tutorial.
You can find more information on SAP Process Automation in the SAP Discovery Center. There you will see information on data center (DC) availability, pricing, and further assets as they become available.
- To learn how to subscribe to SAP Process Automation and get started, please refer to this blog.
Note – You can try SAP Process Automation free of charge on SAP BTP free tier.
For the existing RPA customers –
- Installation instructions are available in the Help Portal.
- For information about projects, automations, and tutorials, refer to Tutorials.
- To learn how to create and use templates, please refer to this blog.
Where to find the Automated Template Detection –
1. Select Dependencies from the combo box and click the Manage Dependencies button shown below
2. Click Add dependencies
3. Look for “SAP Intelligent RPA Document Information Extraction SDK” and add it
4. All the Document Information Extraction service activities then appear as shown in the image below
What is Automated Template Detection –
It is one of the most awaited features of document extraction. It enhances the ‘Extract Data (Template)’ activity so that it pre-screens your input PDF or image file and pre-extracts data from it, decides which template under the selected schema is the best fit, and then performs the extraction with that template.
Which problem does it solve for intelligent document processing?
Earlier, you had to assign each template to its own ‘Extract Data (Template)’ activity, or build logic that routed each input document from one activity to the next until the correct template picked it up. This process was repetitive and complicated. We received a great deal of feedback from customers during the Q4 2021 Beta and wanted to make this part of bot building as easy as it should be.
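To illustrate the kind of routing logic that used to be necessary, here is a minimal Python sketch. It is not SAP code: the template names, the keyword rules, and the `pick_template_manually` helper are all hypothetical and only stand in for the branching a bot builder had to maintain by hand for each document layout.

```python
# Hypothetical illustration only: this is not an SAP API.
# Each rule maps a keyword expected in the document text to a template name.
MANUAL_RULES = [
    ("ACME Corp", "acme_invoice_template"),
    ("Globex", "globex_invoice_template"),
]


def pick_template_manually(document_text: str) -> str:
    """Return the template for a document using hand-written rules."""
    lowered = document_text.lower()
    for keyword, template in MANUAL_RULES:
        if keyword.lower() in lowered:
            return template
    # Every new document layout needs another rule (or another activity).
    raise ValueError("No rule matches this document; a new branch is needed")


print(pick_template_manually("Invoice from Globex, No. 4711"))
```

Every new supplier layout means another rule and another branch, which is the maintenance burden the feature is meant to remove.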
How will the document extraction logic be created now?
As mentioned in the beginning, we have you covered with the ‘Automated Template Detection’ and things will happen in a smarter way now.
After adding the ‘Extract Data (Template)’ activity, you simply select your schema, select Detect Automatically, and provide the document path. The Document Information Extraction service does the rest for you: the backend algorithm pre-screens the document and selects the right template based on the pre-extracted fields!
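The service's actual selection algorithm is not described in this post. Purely as a mental model, the Python sketch below assumes that each template produces a set of pre-extracted fields and that the template filling the most schema fields wins. The `detect_template` function, the template names, and the sample values are hypothetical and chosen only for illustration.

```python
# Conceptual sketch only: the real selection algorithm of the
# Document Information Extraction service is not documented in this post.
from typing import Dict, List, Optional


def detect_template(
    schema_fields: List[str],
    pre_extracted: Dict[str, Dict[str, Optional[str]]],
) -> str:
    """Pick the template whose pre-extraction fills the most schema fields."""
    if not pre_extracted:
        raise ValueError("No templates available for this schema")

    def score(template: str) -> int:
        fields = pre_extracted[template]
        # Count schema fields for which this template found a non-empty value.
        return sum(1 for name in schema_fields if fields.get(name))

    return max(pre_extracted, key=score)


# Illustrative data: two templates under the same schema.
schema = ["documentNumber", "documentDate", "grossAmount"]
candidates = {
    "template_a": {"documentNumber": "4711", "documentDate": None, "grossAmount": None},
    "template_b": {"documentNumber": "4711", "documentDate": "2022-04-01", "grossAmount": "119.00"},
}
print(detect_template(schema, candidates))  # template_b
```

From the bot builder's point of view, the important part is that this decision no longer has to be modelled in the automation itself.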
Where do you find the ‘Automated Template Detection’ feature?
As shown in the screenshot below, this feature is part of the ‘Extract Data (Template)’ activity.
Let us fast forward a few initial steps which have been covered in previous blogposts and use just a document extraction specific example –
1. Create an Automation Project ‘XYZ’
2. Create a template artifact: choose an existing template or create a new one, select a Document Type or create a custom Document Type, and then select an existing schema, pick an SAP global schema such as ‘SAP_invoice_schema’, or create your own custom schema.
For all the steps on how to do it, please refer to this blog.
Note – When you begin a project with creating a template artifact, the dependency ‘Document Information Extraction SDK’ is automatically added to your project along with the ‘PDF SDK’. Hence, you do not have to do that manually.
3. Repeat step 2 to create as many templates as you need
Now to the steps for the topic at hand – ‘Automated Template Detection’
4. Add the ‘Extract Data (Template)’ activity to the Automation Workspace
5. Click on the activity to bring up the details panel on the right
6. Select your schema (e.g. SAP_invoice_schema)
7. The ‘Detect Automatically’ option is selected automatically
8. Provide your document path or a variable containing the document path
9. Click Save
Pretty short and straightforward! Now let us move to the execution part to show something interesting during testing –
10. Search “log” and drag and drop the ‘Log Message’ activity to the automation workspace
11. On the right panel, click the message text box and select ‘(1) ExtractedData’ within the message text box
12. Below the message text box, click the ‘type’ text box and select ‘Info’
13. Put a breakpoint on the ‘Log Message’ activity; an orange dot appears next to the activity
14. Click the Save button
Now run the bot to test it
15. The bot pauses after the first step. Click the ‘Extract Data (Template)’ activity in the left debug panel and, in the right panel, expand the ‘schemaUid’ input parameter
16. When you check the ‘Identifier’, you can see that the correct template has been identified for the input document; the debug level lets you cross-check this
17. Continue running the test and let it finish. The extracted result then appears in the Info panel below, where you can expand it for a preview
The example ends here.
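The debug check and the logged result from the steps above can be pictured with a small Python sketch. The data shapes are assumptions made for illustration: the `summarize_run` helper, the template identifiers, and the field dictionary only mimic what you inspect in the debug and Info panels, and are not the SDK's actual output format.

```python
# Hypothetical shapes: these values only mimic what you inspect in the
# debug panel; they are not the SDK's actual output format.
from typing import Dict


def summarize_run(detected_id: str, expected_id: str, extracted: Dict[str, str]) -> str:
    """Build an Info-style log message for one extraction run."""
    status = "match" if detected_id == expected_id else "MISMATCH"
    lines = [f"Template: {detected_id} ({status})"]
    for name, value in extracted.items():
        lines.append(f"  {name}: {value}")
    return "\n".join(lines)


print(summarize_run(
    "template_b",
    "template_b",
    {"documentNumber": "4711", "grossAmount": "119.00"},
))
```

A check like this is useful while testing with a handful of sample documents whose correct template you already know.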
By reading this blog post, you have learnt about the new ‘Automated Template Detection’ feature, its significance in the overall process, and its usage. Lastly, I hope this blog post has given you a good start for exploring the Document Information Extraction service activities within SAP Process Automation and for trying out template artifact creation.
Thanks for reading, and feel free to leave a comment with questions or feedback.
Stay tuned for more updates on Document Information Extraction Service in SAP Process Automation.
Please refer to the following links for steps and further information –
- For Enrichment Activities and all the related sub activities –
- For examples on using JSON script –
- How to create schemas and templates –
- For Best Practices on using activities related to Document Information Extraction Service –
For more information on SAP Process Automation: