AI-powered data classification using Data Attribut...

mhaas123 · ‎07-30-2021

When an enormous amount of structured and unstructured data is generated in real time, things can get complicated very quickly. But it is crucial to organize such data, as it can potentially lead to new business opportunities, enhanced efficiency, and significant savings. Organizations and business units often struggle with inconsistent, incomplete, or inaccurate data, which prevents them from making good decisions. However, correcting and completing data sets usually requires a great deal of costly and time-intensive manual effort.

SAP has a solution for that

Handling inconsistent data manually, by matching and classifying records, often relies on a rules-based approach. Employees spend most of their time searching for the right categories, descriptions, missing master data fields, and so on, which is tedious and time-consuming. This is where SAP comes in. The Data Attribute Recommendation service, one of the SAP AI Business Services, can help automate this process by identifying and extracting key information from documents and datasets, and by recommending categories and sub-categories (or other customer-specific information) to use to organize the data.

A few weeks ago, I conducted a workshop called “How to use Data Attribute Recommendation to automatically classify material master data with SAP AI BUS” as part of the Hyperautomation Webinar Series. Given the great response and positive feedback, I thought it would be a good idea to share the full-length video from that session here.

During the webinar, I covered four main points: first, using the service’s different endpoints to upload data; second, using that data to train models; third, activating models; and fourth, using the Data Attribute Recommendation inference endpoint to predict values and generate probabilities.

Uploading training data

The Data Attribute Recommendation service uses historical data to train a model. The user must first define a schema that consists of features (the known input fields) and labels (the fields to be predicted), and must then upload a dataset that contains those same fields.
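
To make the relationship between schema and dataset concrete, here is a minimal Python sketch that checks whether a CSV dataset contains every feature and label named in a schema before it is uploaded. The schema layout, the column names, and the `missing_columns` helper are illustrative assumptions for this example, not the service's confirmed payload format.

```python
import csv
import io

# Hypothetical schema: field names and types are illustrative assumptions.
schema = {
    "name": "material-master-schema",
    "features": [
        {"label": "description", "type": "TEXT"},
        {"label": "manufacturer", "type": "CATEGORY"},
    ],
    "labels": [
        {"label": "material_group", "type": "CATEGORY"},
    ],
}


def missing_columns(schema, csv_text):
    """Return the schema columns that are absent from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)), [])
    expected = [col["label"] for col in schema["features"] + schema["labels"]]
    return [name for name in expected if name not in header]


# Made-up sample data with one historical record.
sample = (
    "description,manufacturer,material_group\n"
    "Hex bolt M8 x 40,Acme,Fasteners\n"
)
print(missing_columns(schema, sample))
```

An empty list means the dataset header covers the whole schema. Any names returned are columns that would need to be added to the file before uploading.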

Training the model

Once the data is uploaded, the end user can start the model training process. The result is a trained model that can predict missing attributes. (Note that this process might take several minutes, depending on the size of your dataset.)
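
Because training can take several minutes, a client typically polls for the job status rather than blocking on a single call. The Python sketch below shows one way to do that. The `wait_for_job` helper, the `get_status` callable, and the status names are hypothetical stand-ins. In practice `get_status` would wrap whatever call returns the training job's state.

```python
import time


def wait_for_job(get_status, poll_seconds=30.0, timeout_seconds=3600.0,
                 sleep=time.sleep):
    """Poll get_status() until it reports a terminal state or the timeout expires."""
    terminal = {"SUCCEEDED", "FAILED"}  # hypothetical status names
    waited = 0.0
    while True:
        status = get_status()
        if status in terminal:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still {status!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage with a stand-in status source instead of a real service call.
fake_statuses = iter(["PENDING", "RUNNING", "RUNNING", "SUCCEEDED"])
result = wait_for_job(lambda: next(fake_statuses), poll_seconds=0.0)
print(result)
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.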

Activating the model

Once the model is trained, it must be activated before it can be used. (Note that the service supports training multiple models that can be deployed in parallel.)

Using inference

Once you have activated the model, you can use the inference endpoint to predict missing or incorrect values. Along with each predicted value, the result also provides a probability score.
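
The probability score is what lets you decide which predictions to accept automatically and which to send for manual review. The following Python sketch illustrates that idea on a made-up response. The response shape, the field names, the `accept_predictions` helper, and the 0.8 threshold are all assumptions for illustration, and the actual service payload may differ.

```python
# Hypothetical response shape and made-up values; the real payload may differ.
response = {
    "predictions": [
        {
            "objectId": "row-1",
            "labels": [
                {
                    "name": "material_group",
                    "results": [
                        {"value": "Fasteners", "probability": 0.93},
                        {"value": "Tools", "probability": 0.05},
                    ],
                }
            ],
        }
    ]
}


def accept_predictions(response, threshold=0.8):
    """Map each object id to the top predicted values at or above the threshold."""
    accepted = {}
    for prediction in response["predictions"]:
        chosen = {}
        for label in prediction["labels"]:
            if not label["results"]:
                continue  # nothing was predicted for this label
            best = max(label["results"], key=lambda r: r["probability"])
            if best["probability"] >= threshold:
                chosen[label["name"]] = best["value"]
        accepted[prediction["objectId"]] = chosen
    return accepted


print(accept_predictions(response))
```

Records whose best prediction falls below the threshold come back with an empty mapping, which makes them easy to route to a reviewer.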

For more information, refer to the links below: