What is Business Process Observability and why does it matter?
In data-driven process management, we should not naively assume that all we need for the generation of actionable insights is a dataset (event log). In addition, we always need to assess to what extent the process we want to analyze and improve is observable from the data, i.e., to what extent the data provides a sound and complete foundation for process analysis. To this end, this post provides a range of methodological means for managing process observability.
Introduction – Why Observability Matters
In big data analysis, data observability is an emerging term that encompasses the continuous governance of data to identify and mitigate data health issues, i.e., problems with the correctness/soundness or completeness of data. Data observability is, quite clearly, a crucial prerequisite for reliable data analysis: if the data is not correct or not complete, the conclusions we draw from it can be less useful, counterproductive, or even dangerous. By now we know that data quality problems, such as missing or incomplete event log data, are pressing issues in process mining. Obtaining event logs alone is insufficient for ensuring that a process is properly analyzed. We can highlight this with three simple examples, which we will reference throughout the remainder of the post, and which are aligned with three key questions about process observability.
- Process scope observability (completeness): Do we observe all parts of the process?
Example: In a hiring process, it is common that substantial parts of the communication between the applicant (in particular the successful applicant), the hiring manager, and other stakeholders occur via channels that are not covered by the talent management system.
- Data quality observability (soundness): Do we observe the process exactly in the way it occurs in reality?
Example: The data entry steps of a process are handled by various Robotic Process Automation (RPA) bots. The bots may run in different technical execution environments and may not produce logs of perfect quality, i.e., events that occur in the real world may be missing from the log or have no or an incorrect/imprecise time stamp assigned. Hence, we may infer factually wrong activity orderings from the RPA log data.
- Experience observability (human-centricity): Are we aware of how process participants experience the process?
Example: A customer service process is analyzed, and while the data is correct and covers all process steps, feedback about the customers’ experiences, which could potentially be obtained via additional channels (such as social media), is not considered.
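The data quality question can be made concrete with a simple check on the log itself. The following is a minimal sketch, assuming an event log held as a list of dictionaries with the hypothetical keys "case", "activity", and "timestamp" (none of these names come from a specific tool). It counts events without a timestamp and lists cases in which two events share the same timestamp, since both situations make the inferred activity order unreliable.

```python
from collections import defaultdict


def timestamp_issues(events):
    """Report events without a timestamp and cases with tied timestamps."""
    missing = 0
    seen = defaultdict(set)  # case id -> timestamps already seen in that case
    tied_cases = set()
    for event in events:
        ts = event.get("timestamp")
        if ts is None:
            # The event happened, but we cannot place it in time.
            missing += 1
            continue
        if ts in seen[event["case"]]:
            # Two events with the same timestamp: their order is ambiguous.
            tied_cases.add(event["case"])
        seen[event["case"]].add(ts)
    return {"missing": missing, "tied_cases": sorted(tied_cases)}


# Hypothetical RPA log fragment (illustrative data only).
rpa_log = [
    {"case": "c1", "activity": "enter data", "timestamp": "2023-01-01T10:00"},
    {"case": "c1", "activity": "validate data", "timestamp": "2023-01-01T10:00"},
    {"case": "c2", "activity": "enter data", "timestamp": None},
    {"case": "c2", "activity": "validate data", "timestamp": "2023-01-01T11:00"},
]
report = timestamp_issues(rpa_log)
```

A check like this only surfaces issues that are visible in the data. Events that are missing from the log entirely still need to be identified with the help of domain experts.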
In the remainder of this post, we lay out different perspectives on business process observability, to then conclude with a brief outline of how process management practitioners can increase process observability in their organizations.
Perspectives on Business Process Observability
Managing the lack of total observability in practical reasoning and decision-making scenarios (and beyond) is a key challenge at the heart of different domains and fields of scientific inquiry. Below, we provide examples of how different disciplines deal with the problem and then relate the presented approaches to process observability.
Management – Known Unknowns and Unknown Unknowns
From a management perspective, we can categorize facts about a process into four categories, based on the level of awareness we have about these facts, using the well-established known/unknown approach.
- Known Knowns: explicit knowledge, or knowledge that we are aware of. In process management, known knowns are the ‘hard’ pieces of knowledge we have in the form of process models or event log/data analyses, if these artifacts are in fact correct.
- Unknown Knowns: tacit/implicit knowledge, or knowledge that we are unaware of. In process management, unknown knowns are facts about processes that exist either in databases or people’s minds but have not been systematically elicited and analyzed. For example, internal process participants may know exactly how some process workarounds that do not occur in any event log are executed (Example 1), or customer sentiment information is in fact stored in a database, but not extracted and matched to the underlying process (Example 3).
- Known Unknowns: explicit lack of knowledge, or lack of knowledge that we are aware of. In process management, known unknowns refer to facts about a process that we have not yet obtained, but that we know of in principle. For example, we may already have mined an event log, but not asked process participants about their perspective, or vice versa. Similarly, we may deliberately ignore data during the Extract-Transform-Load (ETL) process in order to simplify the process mining challenge.
- Unknown Unknowns: implicit lack of knowledge, or lack of knowledge that we are unaware of. In process management, unknown unknowns are facts about a process that we do not know exist/can be obtained. For instance, in Example 2, we may be entirely unaware of the data quality issues, and we may firmly believe in the correctness of the data. Still, the extent and impact of unknown unknowns can be reduced by making sure all important stakeholders are continuously involved in the management of a business process and feel motivated and encouraged to provide feedback. If this is the case, the unknown unknowns in the form of data quality issues can, for example, be discovered by a process domain expert who may know that some activities must necessarily occur (or must occur in an order that is different from the one obtained from the event log).
Obviously, in any business process management initiative, one should attempt to increase the known knowns, while decreasing unknown knowns and (all forms of) unknowns. Having zero unknown knowns and unknowns amounts to full observability. To increase the known knowns (and our confidence in them), we may, for example, break down data and organizational silos to uncover unknown unknowns and transform known unknowns into known knowns, and continuously challenge what we think we know by improving our technical data analysis and data engineering pipelines, as well as the human and experience-oriented perspectives on that data.
Statistical Inference – Random and Systematic Errors
In statistics, observability issues are addressed by considering errors and computing confidence intervals. Observational errors can be either random, e.g., because sensor measurements are off by some factor, or systematic, e.g., because employees tend to over-report the time that is required to execute some specific task. Generally, it is hard to reduce random errors, whereas measures can be taken to reduce systematic errors. In the time-tracking example, execution times could be recorded automatically. However, even then, it is important to avoid a naïve, technocratic approach that may, for example, encourage process participants to trick the automatic time-tracking system. Also, the identification of systematic errors may require specific technical means. An important example of this in the business process management domain is drift detection: if the process or the business environment changes, this obviously affects process behavior and Process Performance Indicators (PPIs), but such changes can only be observed if the analysis of event log data accounts for the fact that the process changes over time. Thoroughly investigating potential systematic errors in process analysis is crucial, in particular to ensure that other statistical tools, such as tests and confidence intervals, are interpreted carefully and do not create a false impression of certainty about the analysis results.
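To illustrate why drift matters, the following is a minimal sketch of a naive window comparison, not an established drift detection algorithm from the literature. It assumes a chronologically ordered list of case durations; the function name `detect_drift` and the parameters `window` and `z_threshold` are hypothetical choices made for this illustration. At each position, the mean duration of the preceding window is compared with the mean of the following window, scaled by the standard error of the difference.

```python
from math import sqrt
from statistics import mean, stdev


def detect_drift(durations, window=5, z_threshold=3.0):
    """Return positions where the mean duration shifts between adjacent windows."""
    drift_points = []
    for i in range(window, len(durations) - window + 1):
        before = durations[i - window:i]
        after = durations[i:i + window]
        difference = abs(mean(after) - mean(before))
        # Standard error of the difference between the two window means.
        se = sqrt(stdev(before) ** 2 / window + stdev(after) ** 2 / window)
        if se == 0:
            # No variation inside either window: any difference is a shift.
            if difference > 0:
                drift_points.append(i)
        elif difference / se > z_threshold:
            drift_points.append(i)
    return drift_points


# Illustrative cycle times (in hours): the process slows down after case 5.
cycle_times = [10, 11, 9, 10, 10, 20, 21, 19, 20, 20]
shifts = detect_drift(cycle_times)
overall_mean = mean(cycle_times)
```

In this illustrative data, the overall mean of 15 hours describes neither the behavior before the change nor the behavior after it, which is exactly the false impression of certainty that an aggregate PPI can create when drift is ignored.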
Logical Inference – Knowledge versus Beliefs
In mathematical logic that formalizes knowledge representation and reasoning, the observability question is addressed by distinguishing between knowledge (statements that we know to be true about the world, and that will remain true in the future) and beliefs (statements that we think are true about the world, but that may no longer hold in the future). The underlying idea can be applied to process management, where both models and data typically have a symbolic representation. For instance, in process mining, we may observe traces that, in reality, did not occur exactly as observed. Consider Example 2 from above, in which a process runs across different technology environments, and issues in and inconsistencies between these environments lead to missing events and an incorrect ordering of activities (by timestamp) in the event log. If we have knowledge about the process, we can correct false beliefs about activity orders in event logs. For instance, if we have observed in our trace that an “invoice was first sent and then generated”, we can use a simplified process model that entails some core facts about necessarily required behavior to (automatically) change the activity ordering to “invoice was first generated and then sent”, thus revising our beliefs about how a particular process instance was executed.
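The invoice example can be sketched in a few lines. This is a simplified illustration under stated assumptions, not a formal belief revision operator: the knowledge about the process is reduced to a list of hypothetical "must precede" pairs, each activity is assumed to occur at most once per trace, and the function name `revise_trace` is invented for this post.

```python
def revise_trace(trace, must_precede):
    """Reorder a trace so that every (first, second) pair in must_precede holds."""
    revised = list(trace)
    # Bound the number of passes so that contradictory constraints cannot loop forever.
    for _ in range(len(revised) ** 2 + 1):
        changed = False
        for first, second in must_precede:
            if first in revised and second in revised:
                i, j = revised.index(first), revised.index(second)
                if i > j:
                    # Observed order contradicts our knowledge:
                    # move `first` directly in front of `second`.
                    revised.insert(j, revised.pop(i))
                    changed = True
        if not changed:
            return revised
    raise ValueError("constraints could not be satisfied (possibly contradictory)")


# Observed trace (belief) and process knowledge, both illustrative.
observed = ["receive order", "send invoice", "generate invoice", "receive payment"]
knowledge = [("generate invoice", "send invoice")]
corrected = revise_trace(observed, knowledge)
```

Here, the observed trace is treated as a belief that may be revised, while the precedence pairs are treated as knowledge that always holds. Activities not mentioned in any pair keep their observed relative order.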
Planning and Learning – Probabilistic Actions and States
In the domain of planning, observability challenges are formalized by treating actions and states using uncertainty quantifiers. As a classical example, Markov Decision Processes (MDPs) treat actions probabilistically: executing an action in a given state may result in any of a set of outcomes (where each outcome is the subsequent state and the resulting utility), each of which has a probability assigned. MDPs can be extended to Partially Observable Markov Decision Processes (POMDPs), in which uncertainty about the current state of the system is also managed using probabilities. Because of the combinatorial explosion of potential states, many problems that can be modeled using MDPs or POMDPs are pragmatically solved with reinforcement learning approaches. The underlying ideas are relevant to business process management in the following ways: i) During analysis, generating probabilistic models like Markov Chains can potentially be more insightful than the generation of symbolic models (like Business Process Model and Notation diagrams or Petri Nets): the former treat uncertainty as a first-class citizen, whereas the latter facilitate the misconception of symbolic precision in real-world processes. ii) When determining actions at the process management level, i.e., when changing a process to improve performance, the models of probabilistic planning and reinforcement learning can provide the theoretical foundation for the systematic and continuous analysis of insights-to-action cycles.
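As an illustration of point i), the following is a minimal sketch of how a first-order Markov Chain could be estimated from an event log, assuming each trace is given as a list of activity names. The function name `transition_probabilities` and the sample traces are hypothetical; the estimate is simply the relative frequency of each directly-follows pair.

```python
from collections import Counter, defaultdict


def transition_probabilities(traces):
    """Estimate P(next activity | current activity) from directly-follows counts."""
    counts = defaultdict(Counter)
    for trace in traces:
        # Pair each activity with its direct successor in the same trace.
        for current, following in zip(trace, trace[1:]):
            counts[current][following] += 1
    return {
        current: {
            following: n / sum(successors.values())
            for following, n in successors.items()
        }
        for current, successors in counts.items()
    }


# Illustrative traces of a hypothetical hiring process.
traces = [
    ["screen", "interview", "offer"],
    ["screen", "interview", "reject"],
    ["screen", "reject"],
]
chain = transition_probabilities(traces)
```

Unlike a symbolic model that merely states that "interview" or "reject" may follow "screen", the resulting chain records how often each continuation was observed, so rare and frequent behavior are not presented with the same apparent certainty.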
Process observability is a fundamental problem in business process management and needs to be well understood by practitioners (process management professionals as well as data science experts) who strive to run impactful process management initiatives, as well as by researchers and product developers who work on next-generation intelligent process management technologies.
Process Management Professionals: How to meaningfully increase organizational observability. In any process analysis scenario, practitioners should ask themselves from what views a process can be observed, and how a combination of different views can maximize process observability. For example, asking process participants about their experience with a process can help reduce the unknown unknowns about a process and allow for a better understanding of the parts of the process that take place outside of enterprise systems and can, in turn, inform a better system design that captures more process behavior in the future. This requires, however, a high-trust relationship with process participants – the technologies can only work if human aspects and relationships are placed at the center of the initiative. Broadly, process observability can be increased by a combination of two means: i) ensure a continuous dialogue of all stakeholders about the process; ii) continuously measure the process and benchmark it to other processes, ideally to best-in-class performers in comparable organizations, compare it to knowledge-based models and match it to subjective experience/sentiment data. When measuring and benchmarking, human experts should provide continuous feedback about underlying issues that cannot be captured by the data alone. Here, the interplay of models and data plays an important role. Also, it is crucial to make reasonable trade-offs between maximizing process observability and making efficient use of limited time and resources. Perfect observability is an unattainable goal and striving for it may lead to analysis paralysis.
Data Science Experts: How to tackle technical process observability challenges. When extracting and preparing process data, technical experts need to have observability in mind to identify risks and their mitigation approaches. Decisions that are made when implementing the ETL pipeline need to be documented alongside observability limitations that are discovered, in particular with respect to their implications for the confidence one can have in the resulting analysis. Interdependencies between data extraction and analysis goals need to be documented as well. Ideally, uncertainties are quantified and visualized in the analysis. However, highlighting uncertainty using traditional statistical approaches like confidence intervals and p-values alone is often not sufficient: the impact of other design decisions, such as the degree of simplification that a process model imposes given an event log, or purely qualitative uncertainties, are not reflected by these statistical indicators.
Researchers and Product Developers: The future of business process observability. In the future, observability may emerge as a ‘first-class citizen’ in intelligent process management. Recently, the academic core community that works, e.g., on process mining foundations, has started to embrace observability-related challenges. Examples are foundational works that facilitate work with partial activity orderings as obtained from event logs and that allow for the estimation of activity execution times (which often cannot be obtained from event logs directly). This divergence from the previously dominant assumption that event logs are, in practice, readily available artifacts that are sufficiently correct and complete to serve as the sole basis for process analysis and improvement is a crucial stepping stone towards more mature, actionable, and human-centric intelligent process management. It is the key motivator for industry-driven innovation initiatives, such as native event log generation, i.e., the generation of multiple event log views that are aligned with different analysis goals already during run-time.
Notes and References
- Belief revision in knowledge representation and reasoning: Gärdenfors, P., Rott, H., Gabbay, D. M., Hogger, C. J., & Robinson, J. A. (1995). Belief revision. Computational Complexity, 63(6), 35-132. Oxford University Press.
- Probabilistic models in planning, such as Partially Observable Markov Decision Processes (POMDPs): Spaan, M. T. (2012). Partially observable Markov decision processes. In Reinforcement Learning (pp. 387-414). Springer.
- Theory of mind in cognitive science and human-centered artificial intelligence research: Baron-Cohen, S. (1999). The evolution of a theory of mind. In M. C. Corballis & S. E. G. Lea (Eds.), The descent of mind: Psychological perspectives on hominid evolution (pp. 261–277). Oxford University Press.
- Quantum mechanics in physics: Privat, Y., Trélat, E., & Zuazua, E. (2016). Optimal observability of the multi-dimensional wave and Schrödinger equations in quantum ergodic domains. Journal of the European Mathematical Society, 18(5), 1043-1111. European Mathematical Society.
 Gervasi, Vincenzo, Pete Sawyer, and Bashar Nuseibeh. “Unknown knowns: Tacit knowledge in requirements engineering.” 2011 IEEE 19th International Requirements Engineering Conference. IEEE Computer Society, 2011.
 E.g., a trace can be seen as a model of a temporal logic formula.
 This is, in fact, a well-known problem in distributed systems, which motivated the Turing award-winning invention of so-called logical clocks.
 For an application example see, e.g.: Jalali, Amin, et al. “dfgcompare: a library to support process variant analysis through Markov models.” BMC Medical Informatics and Decision Making 21.1 (2021): 1-13.
 Van der Aa, Han, Henrik Leopold, and Matthias Weidlich. “Partial order resolution of event logs for process conformance checking.” Decision Support Systems 136 (2020): 113347.
 Fracca, Claudia, et al. “Estimating Activity Start Timestamps in the Presence of Waiting Times via Process Simulation.” International Conference on Advanced Information Systems Engineering. Springer, Cham, 2022.
 Which is closely related to the concept of object-centric process mining, see: van der Aalst, W. M. (2019, September). Object-centric process mining: Dealing with divergence and convergence in event data. In International Conference on Software Engineering and Formal Methods (pp. 3-25). Springer.