Handling Chaos and Complexity When Building and Running Enterprise Software Systems
“In a nutshell Systems Thinking is about defining problems and designing solutions in an environment characterized by chaos and complexity.” (Jamshid Gharajedaghi)
We are moving towards cloud development. This introduces new levels of complexity when we build software systems. One aspect stems from the technology and architecture in the cloud. Microservices form loosely coupled, distributed systems. A second aspect is that not only do we have to build and run the software, but we also want to optimize the overall process. For example, we want to support constant incremental changes so that we can improve the business outcome of these systems based on short feedback loops.
This means that we’re heading deeper into an environment characterized by chaos and complexity. Here, we revisit the statement from Jamshid Gharajedaghi and propose that applying Systems Thinking might be helpful in this situation – whatever Systems Thinking means.
Applying Systems Thinking
Applying Systems Thinking consists of three steps:
- Step 1: Frame the system and formulate your problem.
- Step 2: Look through the systems lens.
- Step 3: Apply Systems Thinking to design a solution.
Let’s start with finding a system and formulating the problem we want to address.
We can describe our system and problem like this: Optimize the business value of a complex software system.
That looks reasonable. A software system should be a system, and optimizing the business value is the problem that we need to solve.
We’ve found a system and formulated the problem that we want to focus on. That was easy, maybe a bit too easy. The idea now is to abstract from our system of interest to the fact that this is a system. On that level of abstraction, we can have a look at Systems Thinking, which describes general system characteristics and issues that you might face with complex systems.
Here are some characteristics of complex systems (according to Donella Meadows and Jamshid Gharajedaghi):
- A system is more than the sum of its parts. The sum of good football players does not necessarily make a good team.
- Systems can show emergent properties. Think of how a group of fish can swim as a swarm.
- A system can show chaotic behavior. Small changes in input variables can result in very different results.
- Systems need to be managed not only for productivity or stability but also for resilience.
These statements will become important for us later on.
One issue with complex systems is that analytical thinking alone is not sufficient to understand or predict their behavior. This is due to the system characteristics listed above. In practice, it can be very hard to predict and control the behavior of a system that you’re building or changing.
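To get a feel for why prediction is so hard, here is a minimal sketch in Python. It uses the logistic map, a textbook example of chaotic behavior that is not taken from the sources cited here. The function name and all parameter values are our own assumptions for illustration.

```python
def logistic_map(x0: float, r: float = 4.0, steps: int = 50) -> list[float]:
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two runs whose input differs only in the ninth decimal place.
a = logistic_map(0.2)
b = logistic_map(0.2 + 1e-9)

# Track how far the two trajectories drift apart.
max_gap = max(abs(x - y) for x, y in zip(a, b))
print(f"initial gap: 1e-09, largest gap within 50 steps: {max_gap:.3f}")
```

The rule itself is a single line that you can analyze completely, yet the two runs stop resembling each other after a few dozen steps. That is the situation in which iteration and measurement help more than up-front prediction.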
How does Systems Thinking handle this? You need to use iterative feedback loops. In the words of Eric Ries, this means applying the Build – Measure – Learn pattern and iteration.
That’s basically it. It sounds easy – but it’s not.
- Instead of the normal analytic “cause and effect” thinking, you now have a feedback cycle and iterations. When developing a complex system, you’ll have a lot of feedback cycles. Each unit test, each performance test, each show-and-tell session, each UI that you design will result in additional feedback.
- As the system has emergent properties, you’ll also have feedback cycles at system level.
- Measurement is not always easy. Receiving feedback is one means of measurement. But sometimes you need to measure something and you need a KPI to do so. Defining KPIs can be hard.
- If there are long delays in feedback loops, some sort of foresight is essential (Donella Meadows). The shorter the feedback loop, the better.
- The Learn step is still the place where you can apply your analytical skills.
The Build – Measure – Learn pattern is a feedback loop. To implement the feedback loop, you’ll need to:
- Define technical and business metrics. For some examples, see Netflix’s Chaos Engineering and Evolutionary Architectures below.
- Measure the defined metrics.
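The two steps above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `Metric` class, the metric names, the targets, and the sample values are all hypothetical and not taken from any of the books mentioned here.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Sequence


@dataclass(frozen=True)
class Metric:
    """A metric with a target value (all names here are hypothetical)."""
    name: str
    measure: Callable[[Sequence[float]], float]  # turns raw samples into one number
    target: float
    higher_is_better: bool

    def evaluate(self, samples: Sequence[float]) -> tuple[float, bool]:
        """Return the measured value and whether it meets the target."""
        value = self.measure(samples)
        met = value >= self.target if self.higher_is_better else value <= self.target
        return value, met


# Step 1: define one technical and one business metric (made-up targets).
response_time = Metric("mean response time (ms)", mean, target=200.0, higher_is_better=False)
conversion = Metric("conversion rate", mean, target=0.03, higher_is_better=True)

# Step 2: measure them on made-up samples from one iteration.
response_samples = [120.0, 180.0, 240.0]             # milliseconds per request
conversion_samples = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]  # 1 = visitor converted

for metric, samples in [(response_time, response_samples), (conversion, conversion_samples)]:
    value, met = metric.evaluate(samples)
    print(f"{metric.name}: {value:.3f} (target met: {met})")
```

The Learn step is deliberately missing from the sketch. Deciding what a missed target means for the next iteration is still up to the team.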
So, now we’ve had a first look at systems from the viewpoint of Systems Thinking. However, we don’t want to just look at a system, we want to design a solution for our problem. Systems Thinking helps us do this in several ways:
- Holistic Thinking: Consider the wholes, the parts, and the surrounding systems.
- Operational Thinking: Consider the processes end-to-end.
- Design Thinking: Think like a designer when designing your solution.
- Sociocultural Systems: Consider the people who interact with the system and the culture they share.
A system is more than the sum of its parts. When applying the Build – Measure – Learn pattern, we must address the parts level. For example, we can apply unit testing. However, we also have to address the system level.
The other part of holistic thinking is this: There are no separate systems (Donella Meadows). In our context, this means that other systems affect our system of interest, so we have to consider them as well. Recall the purpose of our system and the optimization aspect of our problem: Optimize the business value. This business value is created at company level, so we need a feedback cycle there as well. In fact, this might be the most important feedback cycle.
Netflix’s Chaos Engineering is about improving resilience at system level. This applies holistic thinking to software engineering.
Requirements for the system architecture manifest themselves as non-functional requirements. Many incremental changes can erode those architectural requirements over time. You can protect them by defining metrics. These metrics can be at process level (for example, performance requirements) or at system level (for example, resilience).
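As a rough sketch of how such a metric can guard an architectural requirement across many small changes, consider a check on response times that runs after every change. The 95th percentile, the 300 ms budget, the function names, and the sample data are assumptions we made for this example. They don’t come from Netflix or from Building Evolutionary Architectures.

```python
from typing import Sequence


def percentile(samples: Sequence[float], percent: int) -> float:
    """Nearest-rank percentile: smallest sample with at least `percent` % of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 1 <= percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    ordered = sorted(samples)
    rank = (percent * len(ordered) + 99) // 100  # ceil(percent * n / 100) in integer math
    return ordered[rank - 1]


def latency_within_budget(samples_ms: Sequence[float], budget_ms: float = 300.0) -> bool:
    """Hypothetical performance check: the 95th percentile must stay within the budget."""
    return percentile(samples_ms, 95) <= budget_ms


# Made-up response times (ms) before and after an incremental change.
before_change = [float(ms) for ms in range(1, 101)]
after_change = [ms * 4 for ms in before_change]

print("before:", percentile(before_change, 95), latency_within_budget(before_change))
print("after: ", percentile(after_change, 95), latency_within_budget(after_change))
```

A check like this turns the non-functional requirement into feedback: the change that breaks the budget is flagged right away instead of being discovered many increments later.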
Operational Thinking is about the processes in a system. In our context, which is to optimize the business value of a complex software system, the user stories represent processes. Using the “so that” part, each user story is connected to a value that helps optimize the business value. A lot of feedback cycles are also involved here. They serve to design the single user stories and embed them in the system’s end-to-end processes.
User Story Mapping (USM) gives you a more holistic picture of the user stories.
Design Thinking? Yes! By this, we mean thinking like designers, but you can also use the Design Thinking process that we apply in a Sprint 0. A Sprint 0 is an iterative approach in which we work together with customers to discover, describe, and validate the requirements for a solution. Sprint 0 is based on Design Thinking, UX design, and agile Requirements Engineering. So, how’s this related to Systems Thinking?
Designers try to hit the sweet spot when defining product designs. This is also the center of innovation. You get there by applying the Build – Measure – Learn pattern in iterations. When you apply holistic thinking, you use this pattern for the overall solution and also for specific parts, such as processes or a UI. In terms of processes, we address this with Sprint 0. Again, it’s up to the team members to apply the Build – Measure – Learn pattern when planning to develop a system.
Design Thinking is the last part in this chapter, but it would be the first step when defining and designing a solution to our problem: Optimize the business value of a complex software system.
In a Sprint 0, we start to define the problem (initial problem statement), ideate on possible solutions, maybe reframe the problem, build prototypes – perhaps on paper, test the possible solution to learn more, and iterate this process.
Our system of interest is a technical system, not a social system like a society. Nevertheless, there is an aspect of social (user) interaction with the system that is of interest for our problem: To optimize the business value of a complex software system, good User Experience (UX) is crucial. To get there, we follow a user-centric approach. This starts with the definition of personas and the assignment of user stories to the personas. Remember the template for user stories:
As a <persona>, I want to <need> so that <business value>.
This introduces a direct relationship between the user’s interaction with the system and the resulting business value for that persona. What’s more, this relationship also describes the business value for the company. This is continued in the prioritization of the backlog, where the business value of the user stories is one important criterion.
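To make the link between user stories and business value concrete, here is a small sketch of a backlog in Python. The class, the `value_points` field, and the two example stories are invented for illustration. A real prioritization would take more criteria into account than this single number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserStory:
    persona: str
    need: str
    business_value: str
    value_points: int  # hypothetical relative estimate of the business value

    def render(self) -> str:
        """Fill in the user story template."""
        return f"As a {self.persona}, I want to {self.need} so that {self.business_value}."


def prioritize(backlog: list[UserStory]) -> list[UserStory]:
    """Order the backlog by estimated business value, highest first."""
    return sorted(backlog, key=lambda story: story.value_points, reverse=True)


# Two made-up stories.
backlog = [
    UserStory("warehouse manager", "see open deliveries on one screen",
              "I can plan staff for the day", value_points=3),
    UserStory("sales clerk", "create an order in under a minute",
              "customers do not have to wait", value_points=8),
]

for story in prioritize(backlog):
    print(story.value_points, story.render())
```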
The following three steps of UI design also aim to enable users to make their contribution to the overall business outcome:
- Information architecture
- Interaction designs
- Visual designs
That was a first round of applying Systems Thinking. We focused on this: Optimize the business value of a complex software system. Looking back at the introduction, we see that one aspect of cloud development is still missing. This is the process element that includes the integration of operations. This may require a second round. Ready for the next level of complexity and some more background on Systems Thinking?
In Applying Systems Thinking, our system and problem were the following: Optimize the business value of a complex software system. However, we noticed that this falls short on the process aspect of cloud development and also didn’t address operations.
So, this time we want to address the following problem: Build and run complex enterprise software systems. At the same time, we’ll take this opportunity to go deeper into Systems Thinking – in the hopes of finding more input to help us complete our task. This means that we want to start Thinking in Systems.
Again, we’ll have the following three steps:
- Step 1: Frame the system and formulate your problem.
- Step 2: Look through the systems lens.
- Step 3: Apply Systems Thinking to design a solution.
In addition, we’ll provide a generalization that can be applied to other systems and problems.
Systems Thinking is based on systems, so let’s take a look at what we mean by “system.”
A system has a purpose, boundaries, a context or environment, a structure, and some functions or processes. To keep it short, we won’t define a system here. Instead, we’ll take an enterprise software system as an example:
- The purpose of a software system is typically to create some value.
- The boundaries are defined by its scope.
- The context is the enterprise and might include interaction with other software systems.
- The structure is the architecture, including the runtime architecture and the information architecture.
- The processes consist of the user stories or use cases it covers, for example.
There are other types of systems, such as a software project, a department, a company, the world, the capital market, a football team, or a society. They’re not of interest here, but you can apply Systems Thinking to those kinds of systems, too. You can profit from some of the general aspects of Systems Thinking and Systems Theory developed for such systems.
As stated above, we can describe our system and problem as follows: Build and run complex enterprise software systems.
Now we’re clear about what a system is, but what does it mean to think in systems? And isn’t building and running a complex enterprise software system also a kind of system?
We have the following elements:
- A purpose: We want to build and run the software system.
- Boundaries: This is the realization of the scope defined in the scope document. This isn’t the full project – that’s another boundary.
- A context: Our context is the project and the system we’re building.
- A structure: This is given by the team’s structure.
- A process: This is the process we use to develop the system.
That’s our system. But then what’s the problem we’re trying to address? Let’s reformulate.
We can describe our system and problem as follows: We want to build and run a complex enterprise software system efficiently. We use “efficiently” in the sense that given a set time and budget, we try to optimize the value of the system that we’re building for an enterprise.
Systems Thinking provides a great body of knowledge for all systems. If you’ve found the system that you’re interested in, you can apply this knowledge.
Let’s remember the system characteristics and add some additional input (from Donella Meadows and Jamshid Gharajedaghi):
- A system is more than the sum of its parts. The sum of good football players doesn’t necessarily make a good team.
- Systems can show emergent properties. Think of how a group of fish can swim as a swarm.
- A system can show chaotic behavior. This means that small changes in input variables can result in very different results.
- System structure is often the source of system behavior.
- Systems need to be managed for resilience.
- Many relationships in systems are nonlinear.
- There are no separate systems.
Applied to our system and problem, this is what we must deal with when we develop and operate complex systems.
Let’s dig a little deeper into Systems Thinking and have a look at Peter Senge’s 11 Laws of Systems Thinking.
You’ve probably dealt with a lot of these topics during your career as a software developer or architect. There’s still no simple cure so far, but it’s good to know that you’re not alone. Please note that Peter Senge’s book isn’t about software engineering; it’s a management book. He applies Systems Thinking to the management of a company or department.
You won’t find the Build – Measure – Learn pattern as a term in Systems Thinking, but you will find feedback loops.
The relationships between cause and effect are labeled with an s (same) or with an o (opposite). When cause moves in a given direction (generally up or down) and effect moves in the same direction, an s is used; when effect moves in the opposite direction of the cause, an o is used. Reinforcing type feedback loops are formed around an R; balancing type feedback loops are formed around a B.
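The labels also tell you which type of loop you’re looking at. A common convention in causal loop diagrams, which is not spelled out in the text above, is that a loop with an even number of o links is reinforcing and a loop with an odd number of o links is balancing. Here is a minimal sketch of that rule. The function name and the two example loops are our own.

```python
from typing import Iterable


def classify_loop(link_labels: Iterable[str]) -> str:
    """Return "R" for a reinforcing loop or "B" for a balancing loop.

    Each label is "s" (same direction) or "o" (opposite direction).
    """
    labels = list(link_labels)
    if not labels:
        raise ValueError("a loop needs at least one link")
    invalid = [label for label in labels if label not in ("s", "o")]
    if invalid:
        raise ValueError(f"unknown link labels: {invalid}")
    # An odd number of opposite links reverses the original change: balancing.
    return "B" if labels.count("o") % 2 else "R"


# Hypothetical loop: more experiments -> more learnings -> more revenue -> more experiments
print(classify_loop(["s", "s", "s"]))

# Hypothetical loop: more open defects -> more fixing effort (s) -> fewer open defects (o)
print(classify_loop(["s", "o"]))
```

The first loop is classified as R and the second as B. Two o links cancel each other out, so a loop with two of them is reinforcing again.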
From Donella Meadows we learn that reinforcing feedback loops can lead to exponential growth. However, there will always be limits to growth. An example in our system and problem for a reinforcing feedback loop is Amazon experimenting in the web shop and measuring the business outcome, described by the revenue difference.
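The following sketch shows both statements side by side: a reinforcing loop on its own and the same loop with a limit. The growth rate, the capacity, and the simple discrete update rule are assumptions chosen for illustration. They don’t model any real web shop.

```python
from typing import Optional


def simulate_growth(start: float, rate: float, steps: int,
                    capacity: Optional[float] = None) -> list[float]:
    """Simulate a reinforcing loop: each step adds `rate` times the current value.

    If `capacity` is given, growth slows down as the value approaches it.
    """
    values = [start]
    for _ in range(steps):
        current = values[-1]
        growth = rate * current
        if capacity is not None:
            growth *= 1.0 - current / capacity  # the limit dampens the loop
        values.append(current + growth)
    return values


unlimited = simulate_growth(start=1.0, rate=0.5, steps=20)
limited = simulate_growth(start=1.0, rate=0.5, steps=20, capacity=100.0)

print(f"without limit after 20 steps: {unlimited[-1]:.1f}")
print(f"with limit of 100 after 20 steps: {limited[-1]:.1f}")
```

Without the limit, the value keeps multiplying. With the limit, it levels off just below the capacity.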
There’s a central statement from Donella Meadows about balancing feedback loops: When there are long delays in feedback loops, some foresight is essential. You can find examples of balancing feedback loops in our context in the picture below. You can easily apply the statement on delays in feedback loops yourselves.
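To see what a delay does to a balancing loop, here is a minimal simulation. It assumes a very simple rule that is our own and not taken from Donella Meadows: in every step, we correct half of the gap between a target and the value we observe, and the observation may be a few steps old.

```python
def simulate_balancing_loop(target: float, gain: float, delay: int,
                            steps: int, start: float = 0.0) -> list[float]:
    """Move a value towards `target`, reacting to an observation that is `delay` steps old."""
    history = [start]
    for t in range(steps):
        observed = history[max(0, t - delay)]  # outdated measurement of the value
        history.append(history[-1] + gain * (target - observed))
    return history


no_delay = simulate_balancing_loop(target=100.0, gain=0.5, delay=0, steps=12)
delayed = simulate_balancing_loop(target=100.0, gain=0.5, delay=2, steps=12)

print("no delay:     ", [round(v, 1) for v in no_delay])
print("2-step delay: ", [round(v, 1) for v in delayed])
```

Without a delay, the value approaches the target step by step and never exceeds it. With a delay of two steps, the same rule overshoots the target and then swings back below it, because every correction is based on outdated information. This is where foresight, or a shorter feedback loop, pays off.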
There’s a wealth of knowledge about systems available in Systems Thinking. You can have a look at the literature and find other characteristics or elements of a system that you can apply to your problem.
As in Applying Systems Thinking, we won’t be using this information as a how-to list for dealing with our problem. Instead, we’ll use some aspects that will be helpful when designing a solution.
Now we’re ready to design a solution to our problem. Again, we’re using the four foundations of Systems Thinking as defined by Jamshid Gharajedaghi in Systems Thinking – Managing Chaos and Complexity. The four foundations are as follows:
- Holistic Thinking
- Operational Thinking
- Design Thinking
- Sociocultural Systems
Applying Holistic Thinking to our problem means:
- Thinking at all levels of the system: A system is more than the sum of its parts.
- Thinking end-to-end in the systems context. For an example, see the next section on operational thinking.
- Including all of the systems that we have in the context. This means the project and the software system that we’re building. If we don’t consider them, we’ll receive feedback about timelines, budget, error messages, or an escalation.
We already covered this when applying Systems Thinking to our first problem, in which we focused on the software system. The new aspect now is that we’re also covering the process of developing and operating a software system.
Operational Thinking in the context of building and running software systems efficiently is about having an end-to-end view of how to develop and run software. It’s about optimizing the development and operations process of an enterprise system.
Here, we can leverage the work of others. The topics are Agile Development, Continuous Delivery, and DevOps.
DevOps is about optimizing the end-to-end flow from Dev to Ops and represents Operational Thinking. The DevOps Handbook defines three major principles that elaborate the Build – Measure – Learn pattern. In the learning part, it even addresses different systems (the individual, the project, and the organization).
- Build – Principles of flow: Enable and sustain a fast flow of work from evaluation through development, and on to production and operation.
- Measure – Principles of feedback: Shorten and amplify feedback loops.
- Learn – Principles of continual learning and experimentation: Enable the constant creation of knowledge for the individual, the team, and the organization.
Again, we covered Design Thinking previously in the context of a software system. It was about finding a solution in the sweet spot. Here, it’s an additional process aspect in our current context.
Design Thinking supports the mindset to optimize the time to value. We do this for single UIs by building prototypes quickly in order to receive early feedback with relatively low effort. However, this also optimizes the overall time to value. It’s much cheaper and faster to search for an optimal solution in a Sprint 0 than to build production-ready software. Thus, the end-to-end process (Sprint 0 + DevOps) is improved and supports a faster time to value.
Building and running software systems – for example, in a project – is a social system. Each social system has a culture – a shared image of how things ought to be. The culture might be inherited from the company or department. In Design Thinking and DevOps you’ll also find a lot of mindset topics in addition to the tools you use.
- Your department might have decided to use Scrum or Kanban.
- DevOps mindsets such as:
  - Automate everything
  - Reduce handoffs
  - Shorten and amplify feedback loops
- Design Thinking mindsets such as:
  - Defer judgement
  - Build on each other’s ideas
These culture and mindset topics might be introduced at department or company level, but it’s the team that must apply them to the individual problem of building and running a system. Take this example: A sprint retrospective is only useful if the team tries to make use of the learnings.
So far, we’ve remained within the context of enterprise software systems.
You can try to apply Systems Thinking to other systems and problems. To do this, just apply the three steps that we’ve walked through above. In all cases, you can use the body of knowledge available for Systems Thinking. If you’re lucky, somebody has applied Systems Thinking to your system of interest or at least to some part of it. Here are some examples:
Lean Startup is Design Thinking applied to starting a company. We’ve adopted some aspects of this concept, especially for value engineering in Sprint 0.
The manager’s system of interest is called a company or department. These systems are the focus of Peter Senge’s great book.
The previous examples were all relatively large topics. Let’s take a look at one more example that’s a bit smaller in scope. In the book Building Evolutionary Architectures, you’ll find the statement that microservices are one source of more chaos. Why is that?
From Donella Meadows, we’ve learned that system structure is often the source of system behavior. In software systems, the structure is the architecture. Is the microservice architecture producing more chaos than a monolithic architecture? That would be strange, because microservices provide very good modularization, and modularization can be just as good in monolithic architectures. So the source must be a different part of the architecture: the runtime architecture. Microservices run in separate containers. That’s a plausible source of more chaos. Because microservices call each other very frequently, something like a DoS (denial of service) attack can now also originate inside the system itself.
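A back-of-the-envelope sketch shows how quickly internal calls can add up. It assumes that every service calls the same number of downstream services at every level. This is our own simplification for illustration, not a claim from the book, and all numbers are made up.

```python
def internal_calls(fan_out: int, depth: int) -> int:
    """Count the internal service calls triggered by one external request.

    Assumes each call at one level triggers `fan_out` calls at the next level,
    for `depth` levels of services.
    """
    return sum(fan_out ** level for level in range(1, depth + 1))


calls_per_request = internal_calls(fan_out=3, depth=4)
external_requests_per_second = 1_000

print("internal calls per request:", calls_per_request)
print("internal calls per second: ", calls_per_request * external_requests_per_second)
```

In a monolith, these would be in-process function calls. In a microservice runtime architecture, each of them is a separate request that can be slow, fail, or be repeated.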
Thinking in systems opens the door to the body of knowledge of Systems Thinking. It’s a skill set. It’s something you can learn – like learning to swim. You can use these skills to build and run enterprise software systems.
- Thinking in Systems: A Primer by Donella H. Meadows
- The Fifth Discipline: The Art & Practice of The Learning Organization by Peter M. Senge
- Systems Thinking – Managing Chaos and Complexity by Jamshid Gharajedaghi
- The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses by Eric Ries
- The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations by Gene Kim, Jez Humble, Patrick Debois, and John Willis
- The Goal: A Process of Ongoing Improvement by Eliyahu M. Goldratt
- The Phoenix Project: A Novel about IT, DevOps, and Helping Your Business Win by Gene Kim, Kevin Behr, and George Spafford
- Building Evolutionary Architectures: Support Constant Change by Neal Ford, Rebecca Parsons, and Patrick Kua
- Chaos Engineering by Casey Rosenthal
- The Art of Monitoring by James Turnbull
- User Story Mapping: Discover the Whole Story, Build the Right Product by Jeff Patton and Peter Economy
See also: Sprint 0
Editing: Ella Wittmann | Graphics: Emma Thoma | Contents: Erik Scheid