Technology Blogs by SAP
tomasz_janasz
Advisor
Document Classification is part of the SAP AI Business Services portfolio, which encompasses a range of AI-enabled microservices on the SAP Business Technology Platform. Document Classification delivers one of the main capabilities for Business Document Processing, which I previously described in the blog “Simplify Business Document Processing with SAP AI Business Services”. The service enables companies that manage large numbers of business documents to classify them automatically based on custom classification categories.

This blog gives an update on the latest advancements in Document Classification. After reading it, you will have a better understanding of the purpose of this AI-enabled service, its capabilities, and its pricing. In addition, links to further resources will help you get started with Document Classification on the SAP Business Technology Platform trial landscape.

What is Document Classification?


Document Classification classifies business documents using customized machine learning models. These models are trained on a dataset of pre-classified (labeled) documents. As is usual in custom ML projects, a labeled dataset is a key prerequisite for training a custom model. Document Classification combines two techniques: (1) Optical Character Recognition (OCR) to extract the text from documents and (2) selected text classification algorithms to train models and classify documents. The service provides an automatic hyperparameter search that selects the best-performing model. Once a model is available, it can be used for serving (inference). For inference, the service takes a document image as input; the model assigns a class to the document and returns the associated probability. The figure below depicts a simplified process flow for training and using your own classification model with Document Classification.


Overview of Document Classification’s mode of operation.
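To make the train, search, and serve steps more concrete, here is a minimal sketch in plain Python. It is not the service's implementation: the algorithm (a tiny multinomial naive Bayes classifier), the single hyperparameter `alpha`, the sample texts, and all function names are assumptions chosen for illustration. The sketch also assumes that the OCR step has already turned each document into text.

```python
import math
from collections import Counter, defaultdict

# Illustrative labeled dataset (assumed already OCR-extracted text).
TRAIN = [
    ("invoice number total amount due", "invoice"),
    ("invoice amount tax payable", "invoice"),
    ("purchase order quantity delivery date", "purchase_order"),
    ("order quantity item delivery", "purchase_order"),
    ("payment reminder overdue amount", "dunning_letter"),
    ("final reminder overdue payment", "dunning_letter"),
]
VAL = [
    ("invoice total tax", "invoice"),
    ("order delivery quantity", "purchase_order"),
    ("overdue reminder", "dunning_letter"),
]

def tokenize(text):
    return text.lower().split()

def train(docs, alpha):
    """Count words per class; alpha is the smoothing hyperparameter (> 0)."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return {"alpha": alpha, "word_counts": word_counts,
            "label_counts": label_counts, "vocab": vocab}

def predict(model, text):
    """Return (predicted class, probability of that class)."""
    alpha = model["alpha"]
    vocab_size = len(model["vocab"])
    total_docs = sum(model["label_counts"].values())
    log_scores = {}
    for label, n_docs in model["label_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # class prior
        for token in tokenize(text):
            if token in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[token] + alpha)
                                  / (total_words + alpha * vocab_size))
        log_scores[label] = score
    # Normalize the log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    exp_scores = {l: math.exp(s - top) for l, s in log_scores.items()}
    best = max(exp_scores, key=exp_scores.get)
    return best, exp_scores[best] / sum(exp_scores.values())

def search_alpha(train_docs, val_docs, candidates):
    """Try each candidate alpha and keep the one with the best validation accuracy."""
    best_alpha, best_acc = None, -1.0
    for alpha in candidates:
        model = train(train_docs, alpha)
        correct = sum(predict(model, t)[0] == y for t, y in val_docs)
        acc = correct / len(val_docs)
        if acc > best_acc:
            best_alpha, best_acc = alpha, acc
    return train(train_docs, best_alpha), best_alpha, best_acc

if __name__ == "__main__":
    model, alpha, acc = search_alpha(TRAIN, VAL, [0.1, 1.0])
    print(alpha, acc, predict(model, "invoice amount due"))
```

The shape mirrors the description above: a labeled dataset goes in, several model configurations are compared on held-out documents, and the winning model returns a class together with a probability at inference time.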



Business Relevance


Typical Use Case


A typical use case for the Document Classification service is the Enterprise Mail-Inbox. Every day, companies have to process business documents attached to emails from their business partners (e.g., suppliers or customers). Usually, the documents arrive in a central enterprise email inbox, where every document has to be manually opened, classified, and dispatched. Such a manual approach is inefficient and error-prone, and it can disrupt business-critical processes such as order processing.

In this scenario, Document Classification can add value by automatically classifying large volumes of documents into customer-specific document types (e.g., invoices, dunning letters, and sales orders). Automating this small step increases productivity by reducing repetitive manual work. The outcome of this step can be used in subsequent processes, as the next section illustrates.

Customer Reference: Villeroy & Boch Group


Document Classification in conjunction with SAP Intelligent RPA can provide added value by automatically classifying and dispatching incoming documents. In this way, the entire process flow can be automated as follows (see also next figure):

First, an intelligent bot screens incoming emails for attachments. The bot then sends all attachments to the Document Classification service for automatic classification. After this pre-processing, the bot can finally dispatch the documents and initiate the subsequent business process steps.

 


Enterprise email inbox handling
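The dispatch step of this flow can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SAP Intelligent RPA bot or the service API: the `classify` callable stands in for the call to Document Classification, and the routing table, the `manual_review` fallback, and the `min_confidence` threshold are hypothetical choices.

```python
from dataclasses import dataclass

@dataclass
class Attachment:
    filename: str
    content: bytes

# Hypothetical mapping from document class to the queue that handles it.
ROUTES = {
    "invoice": "accounts_payable",
    "sales_order": "order_management",
    "dunning_letter": "finance",
}

def route_attachments(attachments, classify, routes=ROUTES, min_confidence=0.8):
    """Classify each attachment and decide where it should be dispatched.

    classify is any callable returning (label, probability); here it is a
    stand-in for the real service call. Low-confidence or unknown classes
    fall back to manual review (an assumption of this sketch).
    """
    dispatched = []
    for attachment in attachments:
        label, probability = classify(attachment)
        if probability >= min_confidence and label in routes:
            target = routes[label]
        else:
            target = "manual_review"
        dispatched.append((attachment.filename, label, target))
    return dispatched

def fake_classify(attachment):
    """Stub classifier keyed on the file name, for demonstration only."""
    name = attachment.filename.lower()
    if "invoice" in name:
        return "invoice", 0.97
    if "order" in name:
        return "sales_order", 0.91
    return "dunning_letter", 0.55

if __name__ == "__main__":
    inbox = [
        Attachment("Invoice_4711.pdf", b""),
        Attachment("order.png", b""),
        Attachment("scan.jpg", b""),
    ]
    for row in route_attachments(inbox, fake_classify):
        print(row)
```

Keeping the classifier behind a plain callable means the routing logic can be tested with a stub before it is connected to the real service.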


Our reference customer Villeroy & Boch Group uses exactly this scenario in production. The customer reports an average automation rate of 92% for this business scenario. You can find more detailed information on the customer and their scenario in the customer reference deck and the SAP News Center article: When Bots Decide: Process Automation at Villeroy & Boch.

Document Classification: what is new?


Support for new file formats


Our Document Classification service now supports additional file formats, including single-page PNG and JPEG. This allows more documents to be classified without first converting them to PDF. Consequently, our customers can save valuable resources and work more efficiently.

New pre-trained model


We have developed a new pre-trained classification model for invoices, payment advice, and purchase orders, which is now available (see figure below). At present, the model supports German and English.


Business document classes of the pre-trained document classification model.


Check out this demo to get an understanding of how the pre-trained model can be used to classify business documents:


New service plan


So far, Document Classification and all other commercialized SAP AI Business Services have been charged in blocks of 1000 documents per month. Our new service plan has a reduced block size of 100 documents. The minimum purchase quantity is two blocks. You can get a detailed overview of the pricing in the SAP Discovery Center. Document Classification is also available as a subscription model via SAP Store. For more information on the new service plans of SAP AI Business Services, please refer to this blog.
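As a rough worked example of the block arithmetic, the sketch below estimates how many blocks a given monthly volume would require. It assumes that usage is rounded up to whole blocks, which the text above does not state explicitly; the function name and defaults are illustrative, and the SAP Discovery Center remains the authoritative source for pricing.

```python
def blocks_to_purchase(documents_per_month, block_size=100, minimum_blocks=2):
    """Estimate the number of blocks needed for a monthly document volume.

    Assumption: partial blocks are rounded up to a whole block.
    """
    if documents_per_month < 0:
        raise ValueError("documents_per_month must not be negative")
    # Integer ceiling division, avoiding floating-point rounding.
    needed = -(-documents_per_month // block_size)
    return max(needed, minimum_blocks)

if __name__ == "__main__":
    for volume in (50, 200, 201, 1000):
        print(volume, blocks_to_purchase(volume))
```

Under these assumptions, a volume of 50 documents still requires the minimum of two blocks, while 201 documents require three.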

Learn more


To find out more about our new pre-trained Document Classification model:

For more information on SAP AI Business Services:

Document Classification Questions | Document Information Extraction Questions

Business Entity Recognition Questions | Service Ticket Intelligence Questions

Data Attribute Recommendation Questions | Invoice Object Recommendation Questions