Document Classification is part of the SAP AI Business Services portfolio, which encompasses a range of AI-enabled microservices on the SAP Business Technology Platform.
Document Classification delivers one of the main capabilities for Business Document Processing, which I previously described in the blog “Simplify Business Document Processing with SAP AI Business Services”. The service enables companies that have to manage large numbers of business documents to classify them automatically into custom classification categories.
This blog gives an update on the latest advancements in Document Classification. After reading, you will have a better understanding of the purpose of this AI-enabled service, its capabilities, and its pricing. In addition, links to further resources will help you get started with Document Classification on the SAP Business Technology Platform trial landscape.
What is Document Classification?
Document Classification classifies business documents using custom machine learning models. These models are trained on a dataset of pre-classified (labeled) documents. As with most custom ML projects, a labeled dataset is a key prerequisite for training a custom model.
Document Classification takes advantage of selected NLP approaches: (1) Optical Character Recognition (OCR) to extract the text from documents and (2) selected text classification algorithms to train models and classify documents. The service runs an automatic hyperparameter search to select the best-performing model. Once a model is available, it can be used for serving (inference). For inference, the service takes a document image as input, assigns a class to the document, and returns the probability of that class. The figure below depicts a simplified process flow for training and using your own classification model with Document Classification.
Overview of Document Classification’s mode of operation.
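To make the train-then-serve flow more tangible, here is a minimal sketch in plain Python. It is not the service's implementation: it assumes OCR has already turned each document into text, it swaps in a simple multinomial Naive Bayes classifier for whatever algorithms the service actually uses, and it reduces the hyperparameter search to trying a few smoothing values against a small validation set. All function names and the toy documents are made up for illustration.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    # Assumption: OCR already produced plain text; we split on whitespace.
    return text.lower().split()

def train(docs, alpha):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return {"alpha": alpha, "class_counts": class_counts,
            "word_counts": word_counts, "vocab": vocab}

def predict(model, text):
    """Return (label, probability) for one document text."""
    alpha = model["alpha"]
    vocab_size = len(model["vocab"])
    total_docs = sum(model["class_counts"].values())
    log_scores = {}
    for label, n_docs in model["class_counts"].items():
        counts = model["word_counts"][label]
        total = sum(counts.values())
        score = math.log(n_docs / total_docs)
        for tok in tokenize(text):
            if tok in model["vocab"]:  # ignore words never seen in training
                score += math.log((counts[tok] + alpha) /
                                  (total + alpha * vocab_size))
        log_scores[label] = score
    # Normalize the log scores into probabilities (softmax).
    top = max(log_scores.values())
    exp = {label: math.exp(s - top) for label, s in log_scores.items()}
    best = max(exp, key=exp.get)
    return best, exp[best] / sum(exp.values())

def search(train_docs, val_docs, alphas):
    """Tiny hyperparameter search: keep the alpha with the best validation accuracy."""
    best_model, best_acc = None, -1.0
    for alpha in alphas:
        model = train(train_docs, alpha)
        hits = sum(predict(model, text)[0] == label for text, label in val_docs)
        acc = hits / len(val_docs)
        if acc > best_acc:
            best_model, best_acc = model, acc
    return best_model, best_acc

# Toy labeled dataset (invented for this sketch).
train_docs = [
    ("invoice number total amount due", "invoice"),
    ("invoice amount payable tax total", "invoice"),
    ("purchase order quantity delivery date", "purchase_order"),
    ("order items quantity unit price", "purchase_order"),
    ("payment reminder overdue dunning notice", "dunning_letter"),
    ("final reminder overdue balance dunning", "dunning_letter"),
]
val_docs = [
    ("total amount due on invoice", "invoice"),
    ("order quantity and delivery", "purchase_order"),
    ("overdue payment reminder", "dunning_letter"),
]

model, accuracy = search(train_docs, val_docs, alphas=(0.1, 1.0, 10.0))
label, probability = predict(model, "invoice total amount")
print(label, round(probability, 3), accuracy)
```

The point of the sketch is the shape of the workflow: labeled documents go in, several candidate models are compared, and the winner returns a class together with a probability for every new document.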
Business Relevance
Typical Use Case
A typical use case for the Document Classification service is called Enterprise Mail-Inbox. Every day, companies have to process business documents attached to emails from their business partners (e.g., suppliers or customers). Usually, the documents arrive in a central enterprise email inbox, where every document has to be opened, classified, and dispatched manually. Such a manual approach is inefficient and error-prone, and it can disrupt business-critical processes such as order processing.
In this scenario, Document Classification can add value by automatically classifying large volumes of documents into customer-specific document types (e.g., invoices, dunning letters, and sales orders). Automating this small step increases productivity by reducing repetitive manual work. The results of this step can be used in subsequent processes, as the next section illustrates.
Customer Reference: Villeroy & Boch Group
Document Classification in conjunction with SAP Intelligent RPA can provide added value by automatically classifying and dispatching incoming documents. In this way, the entire process flow can be automated as follows (see also next figure):
First, an intelligent bot screens incoming emails for attachments. The bot then sends all attachments to the Document Classification service for automatic classification. After this pre-processing, the bot can finally dispatch the documents and initiate the subsequent business process steps.
Enterprise email inbox handling
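The three steps (screen the email for attachments, classify each attachment, dispatch it) can be sketched with Python's standard email module. This is only an illustration of the flow, not how SAP Intelligent RPA or the service is actually called: classify_document is a hypothetical stand-in that guesses from the file name, and the queue names, the confidence threshold, and the fallback to manual review are assumptions added for the example.

```python
from email.message import EmailMessage

def classify_document(filename, content):
    """Hypothetical stand-in for a call to the classification service.

    Returns (label, probability). A real bot would send `content` to the
    service; here we only look at the file name to keep the sketch runnable.
    """
    name = filename.lower()
    if "invoice" in name:
        return "invoice", 0.95
    if "order" in name:
        return "sales_order", 0.90
    return "unknown", 0.40

# Assumed mapping from document type to the team that handles it.
ROUTES = {"invoice": "accounts-payable", "sales_order": "order-management"}

def dispatch_attachments(message, threshold=0.8):
    """Classify every attachment and decide where it should go."""
    results = []
    for part in message.iter_attachments():          # step 1: find attachments
        filename = part.get_filename()
        content = part.get_content()                 # raw bytes of the file
        label, prob = classify_document(filename, content)  # step 2: classify
        # Step 3: dispatch; low-confidence results go to a person (assumption).
        queue = ROUTES.get(label) if prob >= threshold else None
        results.append((filename, label, queue or "manual-review"))
    return results

# Build a sample email with two attachments (dummy content).
msg = EmailMessage()
msg["Subject"] = "Documents"
msg.set_content("Please find the documents attached.")
msg.add_attachment(b"%PDF-1.4 dummy", maintype="application",
                   subtype="pdf", filename="invoice_4711.pdf")
msg.add_attachment(b"%PDF-1.4 dummy", maintype="application",
                   subtype="pdf", filename="scan_0001.pdf")

results = dispatch_attachments(msg)
for row in results:
    print(row)
```

Routing anything below the threshold to a person is one plausible design choice for such a bot; it lets the automated share grow as the model improves without letting misclassified documents slip into downstream processes.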
This exact use case is in productive use at our reference customer Villeroy & Boch Group. Our customer states that they achieve an average automation rate of 92% for this particular business scenario. You can find more detailed information on our customer and their scenario in the customer reference deck and the SAP News Center article “When Bots Decide: Process Automation at Villeroy & Boch”.
Document Classification: what is new?
Support for new file formats
Our Document Classification service now supports additional file formats, including single-page PNG and JPEG files. This allows more documents to be classified without converting them to PDF first. Consequently, our customers can save valuable resources and work more efficiently.
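If you collect documents from many sources, it can help to check the file type before sending anything for classification. The helper below is a small sketch that does this by looking at the well-known leading bytes of PDF, PNG, and JPEG files. The function name is made up, the check is purely client-side, and it does not verify that an image is single-page.

```python
# Leading "magic bytes" of the formats mentioned in this blog.
SIGNATURES = {
    "pdf": b"%PDF-",
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
}

def detect_format(data: bytes):
    """Return 'pdf', 'png', or 'jpeg', or None if the file type is not recognized."""
    for name, magic in SIGNATURES.items():
        if data.startswith(magic):
            return name
    return None

print(detect_format(b"%PDF-1.7\n"))
print(detect_format(b"GIF89a"))
```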
New pre-trained model
We have developed a new pre-trained classification model for invoices, payment advice, and purchase orders, which is now available (see figure below). Currently, the model supports German and English.
Business document classes of the pre-trained document classification model.
Check out this demo to see how the pre-trained model can be used to classify business documents.
New service plan
So far, Document Classification and all other commercialized SAP AI Business Services have been charged in blocks of 1000 documents per month. Our new service plan has a reduced block size of 100 documents. The minimum purchase quantity is two blocks. You can get a detailed overview of the pricing in the SAP Discovery Center.
Document Classification is also available as a subscription model via SAP Store. For more information on the new service plans of SAP AI Business Services, please refer to this blog.
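As a quick illustration of the block arithmetic in the new service plan, the following sketch computes how many blocks a given monthly volume would require, using only the two figures stated above (100 documents per block, minimum of two blocks). The function name is invented, and prices are deliberately left out; please take those from the SAP Discovery Center.

```python
def blocks_needed(documents_per_month, block_size=100, minimum_blocks=2):
    """Number of blocks to purchase for a monthly document volume."""
    # Round up to whole blocks using integer arithmetic.
    full_blocks = (documents_per_month + block_size - 1) // block_size
    # Never go below the minimum purchase quantity.
    return max(minimum_blocks, full_blocks)

for volume in (50, 250, 1000):
    print(volume, "documents ->", blocks_needed(volume), "blocks")
```

With these assumptions, a volume that would have required a full block of 1000 documents under the previous plan can now be covered in much smaller steps.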
Learn more
To find out more about our new pre-trained Document Classification model:
For more information on SAP AI Business Services:
Document Classification Questions | Document Information Extraction Questions
Business Entity Recognition Questions | Service Ticket Intelligence Questions
Data Attribute Recommendation Questions | Invoice Object Recommendation Questions