How ChatGPT works

vmadhuvarshi_ · ‎07-24-2023

Blog 1: How ChatGPT works

Blog 2: How SAP Customers are Using ChatGPT

Blog 3: ChatGPT for Business Process Optimization & Data Cleansing

Blog 4: ChatGPT-SAP Integration | Challenges and Solutions

Welcome to this blog series, where we'll explore ChatGPT - OpenAI's groundbreaking Large Language Model (LLM) - and its integration with SAP systems. As Artificial Intelligence (AI) continues to redefine the business-technical landscape, it is crucial for business leaders to grasp not only the relevant use cases but also the potential pitfalls of utilizing revolutionary tools such as ChatGPT, particularly in conjunction with crucial data-centric systems like SAP.

In this blog series, we explore the basics of ChatGPT and its application within SAP environments. Our goal is to first equip you with the necessary understanding of LLMs like ChatGPT, and then move on to ChatGPT's practical applications, potential challenges, and best practices for integration with SAP. We do not intend this series to be a technical guide; instead, it is a primer on the capabilities, applications, and intricacies of ChatGPT in the SAP ecosystem.

LLM Capabilities | https://github.com/Hannibal046/Awesome-LLM

Here is how we envision this blog series to unfold:

This first blog tries to simplify the inner workings of ChatGPT and highlights the fundamental principle that powers it: 'Attention is all you need.' We expect this blog to dispel the image of ChatGPT as an all-knowing system, which is how it is sometimes regarded. We want to show that, at its core, it is an AI model trained on a vast amount of text that still operates on probabilistic principles and, more importantly, is fallible.

In the upcoming blogs, we will explore real-world applications of ChatGPT within SAP environments, highlight core concerns of ChatGPT and SAP integration, and outline a pragmatic approach to integrating ChatGPT with SAP systems.

Now, let us explore how ChatGPT works.

How ChatGPT works

AI is revolutionizing industries and business processes worldwide, with ChatGPT leading the charge. This AI model can produce text often on par with, and occasionally superior to, human-generated content. One notable improvement in ChatGPT is its internet browsing feature, which addresses the issue of its knowledge being limited to a training data cut-off of September 2021 [*As of the time this blog post was published, this feature had been temporarily deactivated]. Moreover, the incorporation of plugins has significantly broadened ChatGPT's functions. As of this writing, it can perform a wide array of tasks using any of its 250+ available plugins. Before we explore the challenges of using ChatGPT with SAP, let us understand the fundamental architecture on which ChatGPT and other LLMs are based.

Natural Language Processing and Large Language Models

NLP is an interdisciplinary field that links computer science, artificial intelligence, and linguistics to enable computers to understand, interpret, and generate human language ("Natural Language Processing (NLP) in AI | SpringerLink"). LLMs, such as the GPT-3.5 and GPT-4 models that power ChatGPT, are a specific type of NLP model and represent the cutting edge in generating human-like text.

A typical NLP process involves several stages, each serving a unique purpose in transforming human language into a form that machines can understand and vice versa. The following image shows a typical NLP process.

Credit: Guodong (Troy) Zhao | https://bootcamp.uxdesign.cc/how-chatgpt-really-works-explained-for-non-technical-people-71efb

Here's a brief description of each stage. Step numbers match the numbers in the image.

Preprocessing: This initial stage involves cleaning the input text. For example, if the input is a ChatGPT prompt, preprocessing would involve removing any special characters or numbers that might not be relevant to the context.

Encoding/Embedding: Once the text is clean, it's transformed into a numerical representation. This step is crucial as language models can only process numerical data.

Model Processing: The encoded input is fed into a language model such as GPT-3. The model processes this input and generates a probability distribution over the potential next words.

Decoding: After the model has processed the input, the decoding step comes into play. This involves translating the model's numerical output back into human-readable words.

Post-processing: The final stage involves refining the output text. This includes spell-checking, grammar checking, adding necessary punctuation, and proper capitalization.

Each stage in this process is critical in ensuring the language model can effectively understand, process, and generate human language.
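To make the five stages above more concrete, here is a minimal Python sketch of the pipeline. It is an illustration under loud assumptions, not how ChatGPT is implemented: the six-word vocabulary, the made-up scoring rule inside `model`, and all function names are hypothetical stand-ins chosen only so that each stage is visible.

```python
import math

# Hypothetical toy vocabulary; real models use tens of thousands of subword tokens.
VOCAB = ["<unk>", "sap", "stores", "business", "data", "."]
TOKEN_TO_ID = {token: i for i, token in enumerate(VOCAB)}


def preprocess(text):
    """Stage 1: clean the raw input (here: trim whitespace and lowercase)."""
    return text.strip().lower()


def encode(text):
    """Stage 2: map each word to a numeric id (unknown words map to 0)."""
    return [TOKEN_TO_ID.get(word, 0) for word in text.split()]


def model(token_ids):
    """Stage 3: stand-in for the language model.

    Returns a probability distribution over VOCAB for the next token.
    The scores are invented: the token that follows the last input token
    in VOCAB order is simply given the highest score.
    """
    favored = (token_ids[-1] + 1) % len(VOCAB)
    scores = [3.0 if i == favored else 0.0 for i in range(len(VOCAB))]
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode(probabilities):
    """Stage 4: turn the distribution back into a word (the most probable one)."""
    best_id = max(range(len(probabilities)), key=probabilities.__getitem__)
    return VOCAB[best_id]


def postprocess(words):
    """Stage 5: tidy the output (attach the final period, capitalize)."""
    sentence = " ".join(words).replace(" .", ".")
    return sentence[:1].upper() + sentence[1:]


def complete(prompt, steps=3):
    """Run all five stages, generating `steps` additional words."""
    words = preprocess(prompt).split()
    for _ in range(steps):
        token_ids = encode(" ".join(words))
        words.append(decode(model(token_ids)))
    return postprocess(words)


print(complete("  SAP stores "))  # prints "Sap stores business data."
```

Note that the toy model produces a fluent sentence without any notion of whether the sentence is true, which is the point the rest of this blog develops.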

Transformer Neural Networks

ChatGPT is built upon a variant of 'transformer neural networks,' a concept introduced in the seminal 2017 paper "Attention is All You Need" by Vaswani et al. This paper revolutionized the field of natural language processing by proposing the transformer model, which uses "attention mechanisms" to better understand the context and relevance of words in a sentence.

The attention mechanism allows the model to focus on the most relevant parts of the input when generating each part of the output. It does this by comparing the current output position to every input position using a compatibility function to assign scores. The input elements are then combined with weights derived from these scores to form the input representation for the current output position. This lets the model look at all parts of a long input sequence no matter where in the output it is currently generating, enabling it to learn long-range dependencies.
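The scoring-and-weighting idea described above can be sketched in a few lines of Python. This sketch uses the scaled dot product as the compatibility function, which is the choice made in the "Attention is All You Need" paper; the two-dimensional vectors and the names `query`, `keys`, and `values` below are made-up illustration data, and a real model learns these vectors and runs many such computations in parallel.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for a single output position.

    query:  vector for the current output position
    keys:   one vector per input position, compared against the query
    values: one vector per input position, mixed according to the weights
    """
    scale = math.sqrt(len(query))
    # Compatibility score between the query and every input position.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    # Weights derived from the scores.
    weights = softmax(scores)
    # Weighted combination of the input elements.
    dim = len(values[0])
    output = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, output


# Made-up example: three input positions, two-dimensional vectors.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [1.0, 0.0]

weights, output = attention(query, keys, values)
print([round(w, 3) for w in weights])
print([round(x, 3) for x in output])
```

In this example the first and third input positions receive the same, larger weight because their keys align with the query, while the second position is largely ignored. Every input position is considered regardless of how far away it is, which is what lets the model pick up long-range dependencies.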

ChatGPT leverages this transformer model, learning from a diverse range of internet text to predict the next word in a sentence based on the context of the previous ones. This mechanism allows it to generate unique, contextually relevant responses, making it a dynamic conversational partner. However, it is crucial to understand that ChatGPT generates each next word based only on the prompt and on what it has generated so far. It doesn't possess inherent knowledge or the capacity to understand or verify the accuracy of the content it produces.
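A toy example may help show why "predicting the next word" is not the same as "knowing the facts." The sketch below is a drastic simplification and not a transformer: it merely counts which word follows which in a tiny, invented three-sentence corpus and then always picks the most frequent follower. The corpus, the function names, and the counting approach are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Invented mini-corpus standing in for "a diverse range of internet text".
corpus = [
    "the invoice was posted in march",
    "the invoice was paid in april",
    "the order was posted in april",
]


def train(sentences):
    """Count, for every word, which words follow it and how often."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts


def generate(counts, start, max_words=6):
    """Repeatedly append the most frequent follower of the last word."""
    words = [start]
    while len(words) < max_words and counts[words[-1]]:
        next_word, _ = counts[words[-1]].most_common(1)[0]
        words.append(next_word)
    return " ".join(words)


counts = train(corpus)
sentence = generate(counts, "the")
print(sentence)            # prints "the invoice was posted in april"
print(sentence in corpus)  # prints "False"
```

The generated sentence is fluent and plausible, yet it appears nowhere in the training text, where the invoice was posted in March. The toy model has no way of noticing this, because it only tracks which words tend to follow one another.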

Attention is all you need | https://arxiv.org/pdf/1706.03762.pdf

Softmax Function

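The Softmax function, marked 1 in the previous image, is used by language models like ChatGPT to turn the model's raw scores into a probability for each candidate next word. In the simplest case, the word with the highest probability, marked 2 in the image above, is selected as the predicted next word; in practice, systems often sample from this distribution instead, which is one reason the same prompt can produce different answers. Either way, the model picks likely words, which allows it to generate coherent and meaningful sentences. While this may change, as of today, ChatGPT's text generation goals do not include the terms 'verifiable,' 'factual,' or 'truthful.'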

Softmax Function Explained

Softmax example: A marketing team has built a predictive model to estimate the likelihood that a customer will purchase each of 5 different products. The model assigns each product a score between 1 and 10 reflecting how likely that customer is to purchase it. To turn these scores into purchase probabilities that sum to 100%, the team runs the scores through a softmax function. Softmax exponentiates each score, which makes every value positive, and then divides each result by the sum of all results to normalize them. A product with a score of 8 therefore gets a much higher probability than one with a score of 3, but all 5 probabilities together sum to 1. This allows the business to estimate what percentage of purchases will go to each product, rather than just seeing the raw scores. In short, the softmax function converts arbitrary scores into a valid probability distribution.
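The marketing example can be worked through in a few lines of Python. The five scores below are hypothetical values picked to match the description (including a product scored 8 and one scored 3); the function itself is the standard softmax definition.

```python
import math


def softmax(scores):
    """Exponentiate each score, then divide by the sum so the results add up to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical model scores (1 to 10) for five products, for one customer.
scores = {"Product A": 8, "Product B": 3, "Product C": 5, "Product D": 6, "Product E": 2}

probabilities = softmax(list(scores.values()))

for name, probability in zip(scores, probabilities):
    print(f"{name}: {probability:.1%}")

print(f"Total: {sum(probabilities):.1%}")
```

Because of the exponentiation, differences between scores are amplified: with these numbers, the product scored 8 receives more than 80% of the probability, while the product scored 3 receives well under 1%. The same amplification happens when a language model turns its scores for candidate next words into probabilities.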

To conclude, OpenAI's Large Language Model, ChatGPT, is revolutionizing industries through its ability to generate human-like text, powered by advancements in Natural Language Processing (NLP) and transformer neural networks. ChatGPT's enhanced features, such as internet browsing and plugins, have broadened its functions, enabling it to perform various tasks.

However, understanding its abilities and potential risks is vital for its effective implementation. ChatGPT, like other NLP models, undergoes several processing stages, including preprocessing, encoding, model processing, decoding, and post-processing, to transform human language into a form that machines can understand and vice versa. It uses a variant of transformer neural networks introduced in the 2017 paper "Attention is All You Need" to better understand the context of words in a sentence. Despite its sophisticated design, it's important to note that ChatGPT generates words based on previous context and does not inherently understand or verify the accuracy of the content it produces.

In the next blog, we will discuss how SAP customers are currently using ChatGPT.

SAP notes that posts about potential uses of generative AI and large language models are merely the individual poster's ideas and opinions, and do not represent SAP's official position or future development roadmap. SAP has no legal obligation or other commitment to pursue any course of business, or develop or release any functionality, mentioned in any post or related content on this website.