Real-time … Really?
In the context of customer data management (CDM), there is an increasingly important requirement for “real-time”: customer data is collected and processed in “real time”, the customer journey has to take place in “real time” and, last but not least, a company must be able to evaluate reports in “real time”. But what does “real-time” actually mean? In which contexts and with which acceptable latencies (the actual opposite of “real time”!) should it be implemented? And what does all that mean for the underlying architecture?
What do you mean, “now”?
It was not so long ago that snapshots of ERP and CRM systems were taken and imported into a BI system, from which evaluations of a company’s economic situation were created. A given report would have referred to a period of time one or two days prior and could not offer insight into the company’s current state. Against that background, a report that is up to date as of 15 minutes ago offers a “real-time view” that is more than sufficient for the needs of, say, a business analyst.
Now let’s turn our attention to a different scenario, one of a customer browsing a company’s website, signing up and placing an order. Personalization based on the prospect’s search criteria and browsing behavior needs to be in the range of the prospect’s time spent on the website if it is to give the feeling of real time. As a user moves from prospect to registered customer this need increases, because the customer expects to be recognized and to experience personalization for all company touchpoints. In this scenario, “real-time” in the realm of minutes is enough to provide a satisfactory real-time experience for the customer.
Now let’s take a gamer who has to buy certain assets from a gaming platform while in the middle of a battle in order to avoid elimination from a tournament. A latency of just one second can be decisive in the game; the gamer needs “real-time” to be in the millisecond range.
“Real-time” then is relative depending on context (I’m sure Einstein would agree with me here). Having a revenue bar set to move up one pixel per purchase and based on data captured by the millisecond is not likely to even be noticed by a business analyst, whereas ammunition provided to a gamer in an online battle fifteen minutes after the decisive moment is completely worthless, as the avatar will already be long dead. And let’s not forget that we may be looking at the same person needing different “real-time experiences” throughout their day: getting reports from BI systems all day at work, shopping online in the evening, and then shooting their way through a gaming world at night.
The term “real-time” is therefore contextual, but it is also subject to evolutionary development. Years ago, when data still lay in data silos and were not networked and interconnected, the expectation of their evaluation was lower. With the activation of data through harmonization and standardization, through the opening and networking of systems, and – last but not least – through the transfer of systems from on-site to the cloud, there is near constant evolution in data availability and processing needs. What’s more, things are moving so fast that what is acceptable today may already be obsolete tomorrow.
When we talk about real-time, we must then first clarify exactly what we are talking about. For our purposes today we are essentially talking about three aspects of data processing:
- “Real-time Ingestion”: A single system, or a large number of them, generates a queue of messages to be transmitted to a central system for storage and subsequent analysis. The focus is on secure and error-free ingestion, sequencing, and storage of the data. Errors must be automatically detected and cleaned up. Backup concepts must be able to restore a queue of messages, while a control instance ensures that incoming data streams are not lost.
- “Real-time Processing”: Data must be broken down into tasks and topics. Purchasing processes must be distinguished from communication processes. Service interactions must not be mixed with financial transactions. Incoming data must be harmonized and validated so that consistent data is extracted from the heterogeneous set of different data sources.
- “Real-time Analysis”: Insights must be drawn from the incoming data to trigger decision trees. They must be made available for current situation reports that provide insight into the observed facts at a specific point in time. Reports must be generated from the dataset of, for example, a data warehouse or data lake, and the process must not take longer than it takes the data on which it is based to become obsolete.
The real-time experience builds on these processes; without real-time ingestion, real-time analysis of stale data makes no sense. However, it may be that real-time data ingestion and processing is sufficient, while analysis is less time critical. The degree to which this process needs to be optimized depends on the use case. When considering “real-time” in an architecture, these three pillars of data use should always be taken into account.
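To make the three pillars concrete, here is a minimal sketch in Python that assigns each pillar its own latency budget per use case and reports which pillars miss it. The use-case names and all budget values are illustrative assumptions loosely based on the scenarios above (BI report, web personalization, in-game purchase); they are not measured figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageLatency:
    """Latency in seconds for each of the three pillars."""
    ingestion: float
    processing: float
    analysis: float


# Hypothetical budgets: the numbers are assumptions for illustration only.
BUDGETS = {
    "bi_report": StageLatency(ingestion=60, processing=300, analysis=900),
    "web_personalization": StageLatency(ingestion=1, processing=5, analysis=60),
    "in_game_purchase": StageLatency(ingestion=0.05, processing=0.1, analysis=0.5),
}


def violated_stages(use_case: str, measured: StageLatency) -> list[str]:
    """Return the pillars whose measured latency exceeds the budget."""
    budget = BUDGETS[use_case]
    return [
        stage
        for stage in ("ingestion", "processing", "analysis")
        if getattr(measured, stage) > getattr(budget, stage)
    ]
```

The same measured latencies can be perfectly acceptable for one use case and a failure for another, which is the point of evaluating each pillar separately.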
Real-time Requirements in a Use Case
Let’s look first at a use case for the process on the customer side. After review, we’ll divide the real-time experience into our three pillars: “Data Ingestion”, “Data Processing” and “Data Analysis”. These three steps must be evaluated on a use case-specific basis as they relate to “real-time”. Our use case will be a customer’s order process:
Our customer is Mike. He first surfs around on the company’s website searching for a camera. As he searches, the website’s recommendation function provides him offers based on his searches that are more and more targeted. He adds a “Sony Alpha 6400” to his shopping cart and registers for a full account, but at this point he is interrupted in his shopping. A few days later, via a social media channel, he gets a targeted banner that prompts him to go back to the company’s website and complete the purchase, which he does. The order is captured and turned over to sales processing. Mike gets a feedback survey. Sales are recorded and reflected in the company’s reports.
Now, when is real-time actually necessary? And I mean REALLY necessary? Let’s take this use case and divide it into sub-processes 1-3 and 4-6, and then examine the necessity of each real-time requirement.
Create Customer Interest
First let’s look at steps 1-3 from this use case. Mike, our customer, surfs the website. Here, the click streams (i.e., the customer’s behavior on the website) must be recorded and evaluated as fast as possible so that Mike’s research quickly directs him to the right product. An estimation of the probability of closing the sale (the magic point at which a customer decides to buy a product from the company now and on the terms offered) and how that could be optimized must also be made. For this part of the use case, so-called “Context Driven Services” play a decisive role: they analyze the prospective customer’s interaction behavior (the potential customer is still unrecognized as they have not yet registered) and offer optimized, context-specific information (“Only 3 left in stock”, “€1049 – €969”, “Delivery tomorrow by 11:00 a.m. if purchased in the next 3 hours!”, etc.). Real-time is decidedly necessary here: stock quantities, pricing information and logistics terms must arrive as accurately as possible as the prospect is browsing, which is why the customer’s clickstream data from the CDS (remember? “Context Driven Service”) must be directly connected to the backend systems. Latency times here play out in the millisecond range, which is why the product data must either be queried in real-time by the backend systems or cached in the CDS.
Furthermore, everything that the customer does is passed on to the CDP (“Customer Data Platform”), where an “Unknown Customer” profile registers the activities (and thus the preliminary classification of the potential customer) and runs preliminary evaluations toward segmentation. In our example, the customer has put the “Sony Alpha 6400” in his basket and registers on the website. The company’s CIAM system (“Customer Identity and Access Management”) runs through the registration process and queries the customer data, the consent, and the preferences via ECPM (“Enterprise Consent and Preference Management”).
A double opt-in process validates the customer and their consents for further processing of the customer data. These are passed from CIAM to the CDP (yes, the “Customer Data Platform”), where the cookie ID is recognized as an identifier for the existing profile and the profile is rewritten from an “Unknown” to a “Known Customer”.
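The transition from an “Unknown” to a “Known Customer” profile can be sketched as follows. This is a simplified in-memory model, not the API of any particular CDP or CIAM product: the `Profile` and `ProfileStore` classes, their fields, and the `double_opt_in_confirmed` flag are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Profile:
    cookie_id: str
    status: str = "unknown"
    customer_id: Optional[str] = None
    events: list = field(default_factory=list)
    consents: dict = field(default_factory=dict)


class ProfileStore:
    """Hypothetical in-memory stand-in for the CDP's profile storage."""

    def __init__(self):
        self._by_cookie = {}

    def track(self, cookie_id, event):
        # Activities are recorded against the cookie ID before registration.
        profile = self._by_cookie.setdefault(cookie_id, Profile(cookie_id))
        profile.events.append(event)
        return profile

    def register(self, cookie_id, customer_id, consents, double_opt_in_confirmed):
        # Only a confirmed double opt-in may turn the profile into a known one.
        if not double_opt_in_confirmed:
            raise ValueError("double opt-in not confirmed")
        profile = self._by_cookie.setdefault(cookie_id, Profile(cookie_id))
        profile.status = "known"
        profile.customer_id = customer_id
        profile.consents = dict(consents)
        return profile  # the events collected while unknown are kept
```

Because the cookie ID is the lookup key, the browsing history gathered before registration stays attached to the profile once the customer is known.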
Whew. A lot of stuff, huh? Deep breaths…we’re getting there! Take a moment to look at the graphic, it’ll give you a good overview.
For the overall process, this means that Real-Time Ingestion and Real-Time Processing are of primary importance with regard to product presentation. The CDM (another acronym… “Customer Data Management”) data for an initially still unknown customer profile also arise, but the real-time requirement is of secondary importance, as there is no activation requirement with this data in the CDP for the time being. Customer data only requires real-time processing at one point: during registration and when recording the customer’s consent and preferences.
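For the product presentation, the option of caching product data in the CDS instead of querying the backend on every click can be sketched as a small time-to-live cache. The `TTLCache` class, the `fetch` callback and the TTL value are assumptions for illustration; how stale a stock level or price may become is a business decision for each use case.

```python
import time


class TTLCache:
    """Serve cached values until they are older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch          # hypothetical call into the backend system
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the cache can be tested
        self._entries = {}           # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]          # still fresh: no backend round trip
        value = self._fetch(key)     # missing or expired: ask the backend
        self._entries[key] = (now + self._ttl, value)
        return value
```

A short TTL keeps “Only 3 left in stock” reasonably accurate while sparing the backend most of the read traffic; a TTL of zero degenerates into querying the backend every time.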
In the process described, as far as real-time reporting is concerned, the primary focus is not on the analysis of website traffic. Of course, the log data of the customer’s clickstream serves as an important criterion for improving the offer. However, since the company’s planning horizon for the web store and for the product strategy can cover a range of days, weeks or even months, real-time reporting is not relevant to this case.
Let’s bring it full circle
Now let’s look at process steps 4-6 in our customer example. Mike is reminded of his interest in the camera by a social media marketing campaign and is directed back to the company’s website. Here he checks the offer again and decides to buy. He then evaluates his shopping experience, and the use case comes to an end.
What is happening here, and where is real-time necessary? Let’s take a closer look at the individual steps. To do this, we need to turn back time a bit and look at what happened before:
The marketing department did its job after Mike’s first visit. It gathered a group of customers from the CDP, all of whom were interested in purchasing a camera. One of the strategies used was to offer all these customers (including Mike) a 10% discount on camera items if they purchased within the next three days. This group was handed over to the social media provider, who displayed this offer to their subscribers (including Mike) in the social media app. This drove Mike back to the company website. Marketing could also have sent an email campaign directly to Mike’s inbox, and here again the CDP would have supplied the data for all addressees.
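The audience selection described here is a batch job rather than a real-time one, and a rough sketch shows why: it only filters stored profiles and attaches an expiry date. The profile fields (`customer_id`, `interests`, `consents`) are hypothetical, and the check for a marketing consent is an assumption added because consents are collected during registration.

```python
from datetime import datetime, timedelta


def build_camera_campaign(profiles, now, discount=0.10, valid_for=timedelta(days=3)):
    """Select customers interested in cameras and describe the offer for them."""
    audience = [
        p["customer_id"]
        for p in profiles
        # Assumption: only profiles with a marketing consent may be targeted.
        if "camera" in p["interests"] and p["consents"].get("marketing", False)
    ]
    return {
        "audience": audience,
        "discount": discount,
        "expires_at": now + valid_for,
    }
```

The resulting audience list is what would be handed to the social media provider or to an email campaign tool.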
Notice anything different? At no point do we have real-time criticality here! The time window is now days, far from the requirements of an immediate response to Mike’s behavior. Now let’s turn our attention to step 4 in the process. Mike is back on the website and wants to order. He authenticates, checks the alternatives again, and then completes the order already in his shopping cart. The order is sent to Order Management for fulfillment, triggering a feedback survey asking Mike for his evaluation. All activities are recorded and evaluated in the CDP (in Mike’s profile). In parallel, his order is recorded in the data warehouse and later used for a report.
The graphic below provides you with another overview.
Time requirements that determine Mike’s shopping experience can be seen in the following places:
- Login to the website must be real-time. Consequently, the CIAM system must authenticate Mike in the millisecond to second range.
- The order must be passed to the order management system in milliseconds to seconds and acknowledged as Mike waits for confirmation of his order.
- Customer feedback should be solicited from Mike as quickly as possible; any delay reduces the likelihood that Mike will submit his evaluation, so the real-time requirement here is also in the millisecond to second range.
- The data warehouse receives Mike’s order information. The business analyst who prepares a report on the company’s sales figures is satisfied if the report is current as of the last 15 minutes.
- Order Processing receives the order and arranges the shipment. Here we are in a time window of hours to days.
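The time windows in the list above suggest different integration styles for each step. The sketch below encodes them as data and maps each window to a style; the concrete upper bounds (one second, one day) and the thresholds in `integration_style` are assumptions picked for illustration, since the text only gives ranges.

```python
from datetime import timedelta

# Upper bounds per step; the exact values are assumptions within the stated ranges.
STEP_WINDOWS = {
    "login via CIAM": timedelta(seconds=1),
    "order handover to order management": timedelta(seconds=1),
    "feedback survey trigger": timedelta(seconds=1),
    "data warehouse report": timedelta(minutes=15),
    "shipment by order processing": timedelta(days=1),
}


def integration_style(window: timedelta) -> str:
    """Map an acceptable time window to a coarse integration style."""
    if window <= timedelta(seconds=5):
        return "synchronous request"
    if window <= timedelta(hours=1):
        return "event stream or micro-batch"
    return "batch"


if __name__ == "__main__":
    for step, window in STEP_WINDOWS.items():
        print(f"{step}: {integration_style(window)}")
```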
Where is the need for “real-time” here? It’s summarized in the table below:
Not so time-critical after all then?
Well, yes and no. The example suggests that real-time in such a use case refers more to Context Driven Services, meaning activity-based personalization of the web presence, and not to the customer’s profile data. But this is not so. Here are five counterexamples:
- A particularly important customer, who has already brought a lot of revenue to the company, opens a service call because they have had a negative customer experience. “Real-time” here can mean immediately stopping all marketing activities, activating support to address the customer directly, and/or activating a compensation process to regain the customer’s trust.
- An airline customer checks in at the airport. The airline registers this event, which triggers a threshold causing the airline to send the flight this customer is registered for out on time, using a replacement plane instead of waiting for the delayed plane that was supposed to complete this flight. The passenger is immediately directed to the new gate via his app on his phone.
- The weather forecast predicts a storm. Customers of an insurance company in the affected area receive a warning message via their insurance app coupled with a quote for building insurance that covers potential property damage related to the weather event.
- A customer uses their loyalty points to buy an Amazon voucher at checkout in a retail shop. The store operator must ensure that the points for the purchase are immediately deducted from the customer’s profile to prevent the customer from using them several times over in the time it takes the loyalty system to synchronize with the POS system.
- A car crashes into a tree. The car’s acceleration sensors register the impact and the car reports the incident via IoT to the manufacturer, whose concierge service immediately contacts the driver and checks on the driver’s condition.
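The loyalty-points counterexample above comes down to making the balance check and the deduction a single atomic step. The following sketch shows the idea inside one process with a lock; the `LoyaltyAccount` class is hypothetical, and a real loyalty system shared with a POS would need the same guarantee from its database or service (for example a transaction), which this sketch does not cover.

```python
import threading


class InsufficientPoints(Exception):
    pass


class LoyaltyAccount:
    """In-process illustration of an atomic check-and-deduct."""

    def __init__(self, points):
        self._points = points
        self._lock = threading.Lock()

    @property
    def points(self):
        with self._lock:
            return self._points

    def redeem(self, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        # Check and deduction happen under the same lock, so two concurrent
        # redemptions can never both see the old balance.
        with self._lock:
            if amount > self._points:
                raise InsufficientPoints(f"only {self._points} points available")
            self._points -= amount
            return self._points
```

If the check and the deduction were separate steps, two checkouts arriving at nearly the same time could both pass the check and spend the same points twice.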
So there are also many counterexamples that make real-time processing indispensable. As we come to the end of this post, think back to the triad of “Real-Time Ingestion”, “Real-Time Processing” and “Real-Time Analysis”. You should now be able to distinguish which components of an architecture are of particular time criticality, both in the scenarios given here and in real-world cases.
I’m not saying that it would be wrong to speed up processes or that today’s data processing requirements would be sufficient for any point in the future. Of course not! We will continue to experience how technology is rapidly evolving and how it will give us user experiences that we may not be able to imagine today. We’re currently seeing companies rise out of the IT basement into the cloud, and customer experience plays an important role in this evolution. For that, existing architectures are being turned upside down and major investments are being made. My contribution here is to highlight that it is worth taking a closer look at where the challenges lie and how they will evolve in the near future.
Here are the 4 major take-aways from this post:
- The need for real-time is context-specific and ranges from a few minutes down to a few milliseconds. It depends, for example, on whether you are waiting for a company report for tomorrow’s committee meeting or buying ammunition in an online game during a tournament.
- Real-time processes build on each other. A distinction must be made in the architecture as to where real-time is expected in the processes: in the ingestion of data, in the processing of messages, or in the evaluation of information.
- Not every customer experience is immediately time-critical. Naturally the customer expects a high-performance and individualized interaction with company touchpoints. However, many processes run faster than a customer can act. It is therefore important to evaluate the individual process steps in each use case and to question the “real-time” requirement.
- Real-time is constantly evolving. Where today the customer places an order and receives their shipment the following day, tomorrow they may expect delivery via drone within hours or even minutes. Prospects such as these should also be considered in the architecture.
Remember that this is a general overview. Should you delve deeper into this topic, you’ll find it opens a real Pandora’s box. Things like “message queue”, “Parquet”, “Kafka” or “Pub/Sub” pop up, and you’ll start thinking about data streaming and batch imports. All that has been left out for now, as these aspects involve more intricate technology and are addressed elsewhere.